From auriocus at gmx.de Sat Jul 1 01:48:12 2017
From: auriocus at gmx.de (Christian Gollwitzer)
Date: Sat, 1 Jul 2017 07:48:12 +0200
Subject: Teaching the "range" function in Python 3
In-Reply-To:
References:
Message-ID:

On 30.06.17 at 04:33, Rick Johnson wrote:
> And to further drive home the point, you can manually insert
> a list literal to prove this:
>
> >>> range(10)
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> >>> for value in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]:
> ...     print(value)
> ...
> 0
> 1

Now you have exactly missed the point that the OP was asking about. In Python 2, yes, this works and it is the way he has taught it to his students. However, in Python 3:

>>> range(10)
range(0, 10)

This is not helpful for understanding what range does, and that was the original question.

Christian

From dieter at handshake.de Sat Jul 1 02:26:10 2017
From: dieter at handshake.de (dieter)
Date: Sat, 01 Jul 2017 08:26:10 +0200
Subject: Fwd: ftplib sending data out of order
References:
Message-ID: <87r2y0oepp.fsf@handshake.de>

Charles Wilt writes:
> ...
> First off, I'm not a python guy....but I use a set of python scripts
> created a few years ago by somebody else to transfer source between the SVN
> repo on my PC and an IBM i (aka AS/400) system.
>
> Recently, multiple developers, including me, have started having
> intermittent issues whereby the source sometimes gets scrambled during
> the send from PC to IBM i; and it seems to be happening more and more
> often.
>
> The python script hasn't been changed, and I've been running 2.7.12; though
> I've since tried upgrading to 2.7.13.
>
> I used wireshark to capture the FTP session and have confirmed that the
> data is leaving the PC in the incorrect order. (screenshot attached)

The Python level is not concerned with the order of data packets. In your case, it reads a block of data from the file and hands it over to the communication channel.
This channel is responsible for splitting the data into packets, sending them and reassembling them in the correct order at the receiving side.

I see only two situations where the problem could reside at the Python level -- both require multiple threads:

 * the file object is used concurrently by different threads (and thus, the thread doing the FTP does not see the file content in order)

 * two threads are using the same FTP connection concurrently.

It is unlikely that your case falls into such a situation.

To prove that the Python level is innocent, I would instrument the "ftplib" code to log what is read from the file and what is handed over to the communication channel.

From ozovozovozo202 at gmail.com Sat Jul 1 03:12:49 2017
From: ozovozovozo202 at gmail.com (Debiller 777)
Date: Sat, 1 Jul 2017 10:12:49 +0300
Subject: Python installer
In-Reply-To:
References:
Message-ID:

Thanks a lot.

On 1 July 2017 at 0:16, user "eryk sun" wrote:

> On Fri, Jun 30, 2017 at 8:30 PM, Debiller 777
> wrote:
> > I just get error that there is no module name 'encodings'
>
> First make sure that neither PYTHONHOME nor PYTHONPATH are defined in
> your environment. To check this type `set python` in a command prompt.
> Neither variable should be listed.
>

From sjeik_appie at hotmail.com Sat Jul 1 04:29:39 2017
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sat, 1 Jul 2017 08:29:39 +0000
Subject: Combining 2 data series into one
In-Reply-To:
References: <0c051266-a672-44e2-982f-2b69376775b2@googlegroups.com> <3ea981f5-74fb-4f4c-b126-4b5f31c245ba@googlegroups.com>
Message-ID:

Hi,

Does your code run on a sample of the data?
Does your code have categorical data in it? If so: https://pandas.pydata.org/pandas-docs/stable/categorical.html. Also, check out http://www.pytables.org.

Albert-Jan
________________________________
From: Python-list on behalf of Bhaskar Dhariyal
Sent: Thursday, June 29, 2017 4:34:56 AM
To: python-list at python.org
Subject: Re: Combining 2 data series into one

On Wednesday, 28 June 2017 23:43:57 UTC+5:30, Albert-Jan Roskam wrote:
> (sorry for top posting)
> Yes, I'd try pd.concat([df1, df2]).
> Or this:
> df['both_names'] = df.apply(lambda row: row['name'] + ' ' + row['surname'], axis=1)
> ________________________________
> From: Python-list on behalf of Paul Barry
> Sent: Wednesday, June 28, 2017 12:30:25 PM
> To: Bhaskar Dhariyal
> Cc: python-list at python.org
> Subject: Re: Combining 2 data series into one
>
> Maybe look at using .concat instead of +
>
> See:
> http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.06-Concat-And-Append.ipynb
>
> On 28 June 2017 at 13:02, Paul Barry wrote:
>
> > Maybe try your code on a sub-set of your data - perhaps 1000 lines of
> > data? - to see if that works.
> >
> > Anyone else on the list suggest anything to try here?
> >
> > On 28 June 2017 at 12:50, Bhaskar Dhariyal wrote:
> >
> >> No it didn't work. I am getting memory error. Using 32GB RAM system
> >>
> >> On Wed, Jun 28, 2017 at 5:17 PM, Paul Barry wrote:
> >>
> >>> On the line that's failing, your code is this:
> >>>
> >>> combinedX=combinedX+dframe['tf']
> >>>
> >>> which uses combinedX on both sides of the assignment statement - note
> >>> that Python is reporting a 'MemoryError', which may be happening due to
> >>> this "double use" (maybe). What happens if you create a new dataframe,
> >>> like this:
> >>>
> >>> newX = combinedX + dframe['tf']
> >>>
> >>> Regardless, it looks like you are doing a dataframe merge. Jake V's
> >>> book has an excellent section on it here:
> >>> http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.07-Merge-and-Join.ipynb
> >>> - this should take about 20 minutes to read, and may be of use to you.
> >>>
> >>> Paul.
> >>>
> >>> On 28 June 2017 at 12:19, Bhaskar Dhariyal wrote:
> >>>
> >>>> On Wednesday, 28 June 2017 14:43:48 UTC+5:30, Paul Barry wrote:
> >>>> > This should do it:
> >>>> >
> >>>> > >>> import pandas as pd
> >>>> > >>>
> >>>> > >>> df1 = pd.DataFrame(['bhaskar', 'Rohit'], columns=['first_name'])
> >>>> > >>> df1
> >>>> >   first_name
> >>>> > 0    bhaskar
> >>>> > 1      Rohit
> >>>> > >>> df2 = pd.DataFrame(['dhariyal', 'Gavval'], columns=['last_name'])
> >>>> > >>> df2
> >>>> >   last_name
> >>>> > 0  dhariyal
> >>>> > 1    Gavval
> >>>> > >>> df = pd.DataFrame()
> >>>> > >>> df['name'] = df1['first_name'] + ' ' + df2['last_name']
> >>>> > >>> df
> >>>> >                name
> >>>> > 0  bhaskar dhariyal
> >>>> > 1      Rohit Gavval
> >>>> > >>>
> >>>> >
> >>>> > Again, I draw your attention to Jake VanderPlas's excellent book, which is
> >>>> > available for free on the web. All of these kinds of data manipulations are
> >>>> > covered there: https://github.com/jakevdp/PythonDataScienceHandbook - the
> >>>> > hard copy is worth owning too (if you plan to do a lot of work using
> >>>> > numpy/pandas).
> >>>> >
> >>>> > I'd also recommend the upcoming 2nd edition of Wes McKinney's "Python for
> >>>> > Data Analysis" book - I've just finished tech reviewing it for O'Reilly,
> >>>> > and it is very good, too - highly recommended.
> >>>> >
> >>>> > Regards.
> >>>> >
> >>>> > Paul.
> >>>> >
> >>>> > On 28 June 2017 at 07:11, Bhaskar Dhariyal wrote:
> >>>> >
> >>>> > > Hi!
> >>>> > >
> >>>> > > I have 2 dataframe i.e. df1['first_name'] and df2['last_name']. I want to
> >>>> > > make it as df['name']. How to do it using pandas dataframe.
> >>>> > >
> >>>> > > first_name
> >>>> > > ----------
> >>>> > > bhaskar
> >>>> > > Rohit
> >>>> > >
> >>>> > >
> >>>> > > last_name
> >>>> > > -----------
> >>>> > > dhariyal
> >>>> > > Gavval
> >>>> > >
> >>>> > > should appear as
> >>>> > >
> >>>> > > name
> >>>> > > ----------
> >>>> > > bhaskar dhariyal
> >>>> > > Rohit Gavval
> >>>> > >
> >>>> > > Thanks
> >>>> > > --
> >>>> > > https://mail.python.org/mailman/listinfo/python-list
> >>>> >
> >>>> > --
> >>>> > Paul Barry, t: @barrypj - w:
> >>>> > http://paulbarry.itcarlow.ie - e: paul.barry at itcarlow.ie
> >>>> > Lecturer, Computer Networking: Institute of Technology, Carlow, Ireland.
> >>>>
> >>>> https://drive.google.com/open?id=0Bw2Avni0DUa3aFJKdC1Xd2trM2c
> >>>> link to code
> >>>> --
> >>>> https://mail.python.org/mailman/listinfo/python-list
> --
> https://mail.python.org/mailman/listinfo/python-list

Hi Albert!
Thanks for replying.
That issue was resolved. But I'm stuck with a new problem.
I generated a tf-idf representation for a pandas dataframe in which each row contains some text. I also had some numerical features which I wanted to combine with the tf-idf matrix, but this gives a memory error.
--
https://mail.python.org/mailman/listinfo/python-list

From jobmattcon at gmail.com Sat Jul 1 04:55:35 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Sat, 1 Jul 2017 01:55:35 -0700 (PDT)
Subject: how to make this situation return this result?
Message-ID: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>

expect result as this first case

ii = 0
jj = 0
for ii in range(0,3):
    for jj in range(0,3):
        if ii < jj:
            print (ii, jj)


but below is different
as sometimes the situation is not range(0,3), but it is a list of tuples

iiii = 0
jjjj = 0
for ii in range(0,3):
    for jj in range(0,3):
        if iiii < jjjj:
            print (iiii, jjjj)
        jjjj = jjjj + 1
    iiii = iiii + 1

how to make this situation return result like the first case?

From jobmattcon at gmail.com Sat Jul 1 05:05:34 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Sat, 1 Jul 2017 02:05:34 -0700 (PDT)
Subject: how to make this situation return this result?
In-Reply-To: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>
Message-ID: <99e7ddfb-d9f1-43de-8b87-53e2751732db@googlegroups.com>

sorry for typo,

iiii = 0
jjjj = 0
for ii in range(0,3):
    for jj in range(0,3):
        if iiii < jjjj:
            print (ii, jj) <----- correct here
        jjjj = jjjj + 1
    iiii = iiii + 1


On Saturday, July 1, 2017 at 4:55:59 PM UTC+8, Ho Yeung Lee wrote:
> expect result as this first case
>
> ii = 0
> jj = 0
> for ii in range(0,3):
>     for jj in range(0,3):
>         if ii < jj:
>             print (ii, jj)
>
>
> but below is different
> as sometimes the situation is not range(0,3), but it is a list of tuples
>
> iiii = 0
> jjjj = 0
> for ii in range(0,3):
>     for jj in range(0,3):
>         if iiii < jjjj:
>             print (iiii, jjjj)
>         jjjj = jjjj + 1
>     iiii = iiii + 1
>
> how to make this situation return result like the first case?

From jobmattcon at gmail.com Sat Jul 1 05:55:01 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Sat, 1 Jul 2017 02:55:01 -0700 (PDT)
Subject: how to make this situation return this result?
In-Reply-To: <99e7ddfb-d9f1-43de-8b87-53e2751732db@googlegroups.com>
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <99e7ddfb-d9f1-43de-8b87-53e2751732db@googlegroups.com>
Message-ID:

i got an idea with below, but it can not compile

for ii, yy in range(2,5), range(0,3):
    for jj, zz in range(2,5), range(0,3):
        if yy < zz:
            print (ii, jj)


real situation

groupkey
{(0, 1): [[0, 1], [0, 2], [0, 8]], (1, 2): [[1, 5], [1, 9], [2, 6], [2, 10], [8, 9], [8, 10]], (2, 3): [[5, 7], [6, 7], [6, 14], [10, 14]]}

ii = 0
jj = 0
for groupkeya in groupkey:
    for groupkeyb in groupkey:
        if ii < jj:
            print "groupkeya"
            print groupkeya, ii, jj
            print "groupkeyb"
            print groupkeyb, ii, jj
        jj = jj + 1
    ii = ii + 1


On Saturday, July 1, 2017 at 5:05:48 PM UTC+8, Ho Yeung Lee wrote:
> sorry for typo,
>
> iiii = 0
> jjjj = 0
> for ii in range(0,3):
>     for jj in range(0,3):
>         if iiii < jjjj:
>             print (ii, jj) <----- correct here
>         jjjj = jjjj + 1
>     iiii = iiii + 1
>
>
> On Saturday, July 1, 2017 at 4:55:59 PM UTC+8, Ho Yeung Lee wrote:
> > expect result as this first case
> >
> > ii = 0
> > jj = 0
> > for ii in range(0,3):
> >     for jj in range(0,3):
> >         if ii < jj:
> >             print (ii, jj)
> >
> >
> > but below is different
> > as sometimes the situation is not range(0,3), but it is a list of tuples
> >
> > iiii = 0
> > jjjj = 0
> > for ii in range(0,3):
> >     for jj in range(0,3):
> >         if iiii < jjjj:
> >             print (iiii, jjjj)
> >         jjjj = jjjj + 1
> >     iiii = iiii + 1
> >
> > how to make this situation return result like the first case?

From __peter__ at web.de Sat Jul 1 06:00:04 2017
From: __peter__ at web.de (Peter Otten)
Date: Sat, 01 Jul 2017 12:00:04 +0200
Subject: how to make this situation return this result?
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>
Message-ID:

Ho Yeung Lee wrote:

> expect result as this first case
>
> ii = 0
> jj = 0
> for ii in range(0,3):
>     for jj in range(0,3):
>         if ii < jj:
>             print (ii, jj)
>
>
> but below is different
> as sometimes the situation is not range(0,3), but it is a list of tuples
>
> iiii = 0
> for ii in range(0,3):
      # jj starts with 0 on invocation of the inner loop
      # to get that with an independent counter you have to
      # initialise it explicitly:
      jjjj = 0
>     for jj in range(0,3):
>         if iiii < jjjj:
>             print (iiii, jjjj)
>         jjjj = jjjj + 1
>     iiii = iiii + 1
>
> how to make this situation return result like the first case?

Because you don't reset jjjj, the condition iiii < jjjj is always True the second and third time the inner loop is executed.

From dhariyalbhaskar at gmail.com Sat Jul 1 06:24:05 2017
From: dhariyalbhaskar at gmail.com (Bhaskar Dhariyal)
Date: Sat, 1 Jul 2017 15:54:05 +0530
Subject: Combining 2 data series into one
In-Reply-To:
References: <0c051266-a672-44e2-982f-2b69376775b2@googlegroups.com> <3ea981f5-74fb-4f4c-b126-4b5f31c245ba@googlegroups.com>
Message-ID:

Thanks Albert! I have successfully completed the project. Thanks all for your support.

On Sat, Jul 1, 2017 at 1:59 PM, Albert-Jan Roskam wrote:

> Hi,
>
> Does your code run on a sample of the data?
> Does your code have categorical data in it? If so:
> https://pandas.pydata.org/pandas-docs/stable/categorical.html. Also,
> check out http://www.pytables.org.
>
> Albert-Jan
> [snip]
> --
> https://mail.python.org/mailman/listinfo/python-list
>

From jobmattcon at gmail.com Sat Jul 1 06:44:30 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Sat, 1 Jul 2017 03:44:30 -0700 (PDT)
Subject: how to make this situation return this result?
In-Reply-To:
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>
Message-ID: <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>

finally i searched dict.values()[index] solved this

On Saturday, July 1, 2017 at 6:00:41 PM UTC+8, Peter Otten wrote:
> Ho Yeung Lee wrote:
>
> > expect result as this first case
> >
> > ii = 0
> > jj = 0
> > for ii in range(0,3):
> >     for jj in range(0,3):
> >         if ii < jj:
> >             print (ii, jj)
> >
> >
> > but below is different
> > as sometimes the situation is not range(0,3), but it is a list of tuples
> >
> > iiii = 0
> > for ii in range(0,3):
>       # jj starts with 0 on invocation of the inner loop
>       # to get that with an independent counter you have to
>       # initialise it explicitly:
>       jjjj = 0
> >     for jj in range(0,3):
> >         if iiii < jjjj:
> >             print (iiii, jjjj)
> >         jjjj = jjjj + 1
> >     iiii = iiii + 1
> >
> > how to make this situation return result like the first case?
>
> Because you don't reset jjjj, the condition iiii < jjjj is always True the
> second and third time the inner loop is executed.
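
A minimal sketch of the point Peter Otten makes above, assuming Python 3. The helper names `pairs_with_manual_counters` and `pairs_with_enumerate` and the `keys` list are hypothetical and do not appear in the thread. A hand-maintained inner counter has to be reset every time the inner loop starts; `enumerate` removes that bookkeeping when the items are not a plain `range`, for example the tuple keys of a dictionary.

```python
def pairs_with_manual_counters(items):
    """Collect index pairs (i, j) with i < j using hand-maintained counters."""
    pairs = []
    i = 0
    for _ in items:
        j = 0  # reset the inner counter every time the inner loop starts
        for _ in items:
            if i < j:
                pairs.append((i, j))
            j += 1
        i += 1
    return pairs


def pairs_with_enumerate(items):
    """Same result, letting enumerate() supply both counters."""
    return [(i, j)
            for i, _ in enumerate(items)
            for j, _ in enumerate(items)
            if i < j]


# Hypothetical stand-in for a list of tuple keys.
keys = [(0, 1), (1, 2), (2, 3)]
print(pairs_with_manual_counters(keys))
print(pairs_with_enumerate(keys))
```

Both helpers visit each unordered index pair exactly once, which is the behaviour of the first loop in the original question.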
From tjol at tjol.eu Sat Jul 1 06:46:31 2017
From: tjol at tjol.eu (Thomas Jollans)
Date: Sat, 1 Jul 2017 12:46:31 +0200
Subject: DJANGO cannot import name _compare_digest
In-Reply-To:
References: <78efb412-7b8f-4aea-926f-e4f418805c45@googlegroups.com> <742121c3-4a51-4091-802a-8e7d65a0979d@googlegroups.com>
Message-ID: <59577e49$0$1799$e4fe514c@news.kpn.nl>

On 30/06/17 13:32, Pavol Lisy wrote:
> [snip]
>
> python 3.6.1 works as I expected
>
>>>> import logging as operator
>>>> from operator import _compare_digest as compare_digest
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: cannot import name '_compare_digest'
>

All you're seeing here is that operator._compare_digest doesn't exist in python3. The behaviour of from ... import has not changed.

Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from operator import _compare_digest
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name '_compare_digest'
>>> import logging as os
>>> from os import walk
>>> walk.__module__
'os'


-- Thomas

From __peter__ at web.de Sat Jul 1 06:59:42 2017
From: __peter__ at web.de (Peter Otten)
Date: Sat, 01 Jul 2017 12:59:42 +0200
Subject: how to make this situation return this result?
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

Ho Yeung Lee wrote:

> finally i searched dict.values()[index] solved this

That doesn't look like a good solution to anything -- including "this", whatever it may be ;)

If you make an effort to better explain your problem in plain English rather than with code examples, you are likely to get better answers.
From ozovozovozo202 at gmail.com Sat Jul 1 07:46:21 2017
From: ozovozovozo202 at gmail.com (Debiller 777)
Date: Sat, 1 Jul 2017 04:46:21 -0700 (PDT)
Subject: Python installer
In-Reply-To:
References:
Message-ID:

d

From ozovozovozo202 at gmail.com Sat Jul 1 08:01:42 2017
From: ozovozovozo202 at gmail.com (Debiller 777)
Date: Sat, 1 Jul 2017 05:01:42 -0700 (PDT)
Subject: Python installer
In-Reply-To:
References:
Message-ID: <8860b65f-fc33-4d03-bf5c-9e360b21062a@googlegroups.com>

On Saturday, 1 July 2017 at 0:25:19 UTC+3, user eryk sun wrote:
> On Fri, Jun 30, 2017 at 8:30 PM, Debiller 777 wrote:
> > I just get error that there is no module name 'encodings'
>
> First make sure that neither PYTHONHOME nor PYTHONPATH are defined in
> your environment. To check this type `set python` in a command prompt.
> Neither variable should be listed.

Thank you!

From jobmattcon at gmail.com Sat Jul 1 08:45:17 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Sat, 1 Jul 2017 05:45:17 -0700 (PDT)
Subject: how to make this situation return this result?
In-Reply-To:
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

just want to compare tuples like index (0,1), (0,2), (1,2) without duplicate
such as (2,0), (1,0) etc


On Saturday, July 1, 2017 at 7:00:17 PM UTC+8, Peter Otten wrote:
> Ho Yeung Lee wrote:
>
> > finally i searched dict.values()[index] solved this
>
> That doesn't look like a good solution to anything -- including "this",
> whatever it may be ;)
>
> If you make an effort to better explain your problem in plain English rather
> than with code examples, you are likely to get better answers.

From __peter__ at web.de Sat Jul 1 09:10:14 2017
From: __peter__ at web.de (Peter Otten)
Date: Sat, 01 Jul 2017 15:10:14 +0200
Subject: how to make this situation return this result?
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

Ho Yeung Lee wrote:

> just want to compare tuples like index (0,1), (0,2), (1,2) without
> duplicate
> such as (2,0), (1,0) etc

Consider (frozen)sets:

>>> {1, 2} == {2, 1}
True
>>> unique_items = {frozenset((a, b)) for a in range(3) for b in range(3)}
>>> unique_items
set([frozenset([0]), frozenset([1, 2]), frozenset([0, 2]), frozenset([1]), frozenset([2]), frozenset([0, 1])])

> On Saturday, July 1, 2017 at 7:00:17 PM UTC+8, Peter Otten wrote:
>> Ho Yeung Lee wrote:
>>
>> > finally i searched dict.values()[index] solved this
>>
>> That doesn't look like a good solution to anything -- including "this",
>> whatever it may be ;)
>>
>> If you make an effort to better explain your problem in plain English
>> rather than with code examples, you are likely to get better answers.

From breamoreboy at gmail.com Sat Jul 1 11:15:49 2017
From: breamoreboy at gmail.com (breamoreboy at gmail.com)
Date: Sat, 1 Jul 2017 08:15:49 -0700 (PDT)
Subject: how to make this situation return this result?
In-Reply-To:
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

On Saturday, July 1, 2017 at 1:46:21 PM UTC+1, Ho Yeung Lee wrote:
> just want to compare tuples like index (0,1), (0,2), (1,2) without duplicate
> such as (2,0), (1,0) etc
>

I'm still not entirely sure what you're asking, but can't you just generate what you want with itertools combinations, something like:-

>>> import itertools
>>> tuples = list(itertools.combinations(range(3), 2))
>>> tuples
[(0, 1), (0, 2), (1, 2)]

From walters.justin01 at gmail.com Sat Jul 1 11:33:22 2017
From: walters.justin01 at gmail.com (justin walters)
Date: Sat, 1 Jul 2017 08:33:22 -0700
Subject: how to make this situation return this result?
In-Reply-To:
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

On Sat, Jul 1, 2017 at 5:45 AM, Ho Yeung Lee wrote:

> just want to compare tuples like index (0,1), (0,2), (1,2) without
> duplicate
> such as (2,0), (1,0) etc
>
>

I'm going to assume that the order of values in the tuple is important to you. If so, you can simply use the `==` operator to compare them.

For instance:

```
a = (0, 1)
b = (0, 1)
a == b
>>> True

a = (1, 0)
b = (0, 1)
a == b
>>> False
```

Using the `is` operator will return `False` as a and b are completely independent objects.

```
a = (0, 1)
b = (0, 1)
a is b
>>> False
```

From pavol.lisy at gmail.com Sat Jul 1 11:55:19 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 1 Jul 2017 17:55:19 +0200
Subject: DJANGO cannot import name _compare_digest
In-Reply-To: <59577e49$0$1799$e4fe514c@news.kpn.nl>
References: <78efb412-7b8f-4aea-926f-e4f418805c45@googlegroups.com> <742121c3-4a51-4091-802a-8e7d65a0979d@googlegroups.com> <59577e49$0$1799$e4fe514c@news.kpn.nl>
Message-ID:

On 7/1/17, Thomas Jollans wrote:
> On 30/06/17 13:32, Pavol Lisy wrote:
>> [snip]
>>
>> python 3.6.1 works as I expected
>>
>>>>> import logging as operator
>>>>> from operator import _compare_digest as compare_digest
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> ImportError: cannot import name '_compare_digest'
>>
>
> All you're seeing here is that operator._compare_digest doesn't exist in
> python3.

You are right. I am sorry that I overlooked this.

From pavol.lisy at gmail.com Sat Jul 1 12:48:26 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 1 Jul 2017 18:48:26 +0200
Subject: how to make this situation return this result?
In-Reply-To:
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

On 7/1/17, Ho Yeung Lee wrote:
> just want to compare tuples like index (0,1), (0,2), (1,2) without
> duplicate
> such as (2,0), (1,0) etc
>
>
> On Saturday, July 1, 2017 at 7:00:17 PM UTC+8, Peter Otten wrote:
>> Ho Yeung Lee wrote:
>>
>> > finally i searched dict.values()[index] solved this
>>
>> That doesn't look like a good solution to anything -- including "this",
>> whatever it may be ;)
>>
>> If you make an effort to better explain your problem in plain English
>> rather than with code examples, you are likely to get better answers.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>

[(i, j) for j in range(3) for i in range(j)] # is this good for you?

From jobmattcon at gmail.com Sat Jul 1 13:00:42 2017
From: jobmattcon at gmail.com (Lee Ho Yeung)
Date: Sat, 01 Jul 2017 17:00:42 +0000
Subject: how to make this situation return this result?
In-Reply-To:
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com>
Message-ID:

My situation is a dictionary with tuple key
I think dictionary.values()[index]
Is correct


On Sun, 2 Jul 2017 at 12:48 AM, Pavol Lisy wrote:

> On 7/1/17, Ho Yeung Lee wrote:
> > just want to compare tuples like index (0,1), (0,2), (1,2) without
> > duplicate
> > such as (2,0), (1,0) etc
> >
> >
> > On Saturday, July 1, 2017 at 7:00:17 PM UTC+8, Peter Otten wrote:
> >> Ho Yeung Lee wrote:
> >>
> >> > finally i searched dict.values()[index] solved this
> >>
> >> That doesn't look like a good solution to anything -- including "this",
> >> whatever it may be ;)
> >>
> >> If you make an effort to better explain your problem in plain English
> >> rather than with code examples, you are likely to get better answers.
> >
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> >
>
> [(i, j) for j in range(3) for i in range(j)] # is this good for you?
>

From torriem at gmail.com Sat Jul 1 13:44:45 2017
From: torriem at gmail.com (Michael Torrie)
Date: Sat, 1 Jul 2017 11:44:45 -0600
Subject: how to make this situation return this result?
In-Reply-To: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>
References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com>
Message-ID: <0d557b21-5df1-9803-621a-6319c567143c@gmail.com>

On 07/01/2017 02:55 AM, Ho Yeung Lee wrote:
> expect result as this first case
>
> ii = 0
> jj = 0
> for ii in range(0,3):
>     for jj in range(0,3):
>         if ii < jj:
>             print (ii, jj)
>
>
> but below is different
> as sometimes the situation is not range(0,3), but it is a list of tuples
>
> iiii = 0
> jjjj = 0
> for ii in range(0,3):
>     for jj in range(0,3):
>         if iiii < jjjj:
>             print (iiii, jjjj)
>         jjjj = jjjj + 1
>     iiii = iiii + 1
>
> how to make this situation return result like the first case?

I hope your production code uses brief but descriptive variable names and also uses comments (either there was a big sale on i's and j's or your other letters are broken), unlike the code you just posted! Might help you figure out your logic and also let others more quickly grasp what you're trying to do.

From breamoreboy at gmail.com Sat Jul 1 15:00:18 2017
From: breamoreboy at gmail.com (breamoreboy at gmail.com)
Date: Sat, 1 Jul 2017 12:00:18 -0700 (PDT)
Subject: how to make this situation return this result?
In-Reply-To: References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com> Message-ID: <7501ca2b-e105-43ad-8308-e0fbf4116737@googlegroups.com> On Saturday, July 1, 2017 at 6:55:59 PM UTC+1, Ho Yeung Lee wrote: > My situation is a dictionary with tuple key > I think dictionary.values()[index] > Is correct > This is the second time you've said this and it makes no more sense now than it did the first time. Please explain exactly what you are trying to achieve. Kindest regards. Mark Lawrence. From tjol at tjol.eu Sat Jul 1 15:24:03 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Sat, 1 Jul 2017 21:24:03 +0200 Subject: how to make this situation return this result? In-Reply-To: References: <87c12536-fdf1-43af-b204-fe04205ccbcf@googlegroups.com> <3d8a1469-d7f5-4333-9b75-bd56f3027b88@googlegroups.com> Message-ID: <4154bf46-1d94-e1c2-7560-7ece450db536@tjol.eu> On 01/07/17 19:00, Lee Ho Yeung wrote: > My situation is a dictionary with tuple key > I think dictionary.values()[index] > Is correct Unless your dictionary only has one element, this is almost certainly incorrect as dictionary items are not ordered: if this returns the right object, then it's by sheer luck. This could return something else on a different computer, or a different version of Python. In Python 3, you can't even do this. (By the way, you should be using Python 3!) 
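To illustrate the Python 3 point, here is a minimal sketch. The dictionary contents are made up for illustration and are not taken from the original poster's code. It assumes the goal is to visit each index pair (i, j) with i < j exactly once and fetch the value stored under that tuple key.

```python
import itertools

# Hypothetical data: values keyed by (i, j) tuples, as described in the thread.
scores = {(0, 1): "a-b", (0, 2): "a-c", (1, 2): "b-c"}

# On Python 3, dict.values() is a view object and cannot be indexed.
try:
    scores.values()[1]
    indexable = True
except TypeError:
    indexable = False

# Each unordered pair once, i.e. (0, 1), (0, 2), (1, 2) but never (1, 0) or (2, 0).
pairs = list(itertools.combinations(range(3), 2))

# Looking values up by key does not depend on any ordering of the dictionary.
looked_up = [scores[pair] for pair in pairs]

print(indexable, pairs, looked_up)
```

Indexing by the tuple key gives the same answer on any Python version, which positional indexing into the values does not guarantee.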
Normally when you want to access a particular item of a dictionary you want to be indexing the dictionary itself: dictionary[key] -- Thomas > > > On Sun, 2 Jul 2017 at 12:48 AM, Pavol Lisy wrote: > >> On 7/1/17, Ho Yeung Lee wrote: >>> just want to compare tuples like index (0,1), (0,2), (1,2) without >>> duplicate >>> such as (2,0), (1,0) etc >>> >>> >>> On Saturday, July 1, 2017 at 7:00:17 PM UTC+8, Peter Otten wrote: >>>> Ho Yeung Lee wrote: >>>> >>>>> finally i searched dict.values()[index] solved this >>>> That doesn't look like a good solution to anything -- including "this", >>>> whatever it may be ;) >>>> >>>> If you make an effort to better explain your problem in plain english >>>> rather >>>> than with code examples you are likely tho get better answers. >>> -- >>> https://mail.python.org/mailman/listinfo/python-list >>> >> [(i, j) for j in range(3) for i in range(j)] # is this good for you? >> From meek at inherit.earth Sat Jul 1 15:37:45 2017 From: meek at inherit.earth (Timid) Date: 1 Jul 2017 19:37:45 GMT Subject: SLRN score-file suggestion Message-ID: [comp.lang.python] Score: =-1 Path: netfront\.net From breamoreboy at gmail.com Sat Jul 1 17:19:22 2017 From: breamoreboy at gmail.com (breamoreboy at gmail.com) Date: Sat, 1 Jul 2017 14:19:22 -0700 (PDT) Subject: Can we please dump google groups completely? Message-ID: <0ade795c-5d5d-44e2-96ba-6bdbb142914c@googlegroups.com> Yes I know it's daft that it's where I'm posting from, but I'm still banned from using the main mailing list. I've reported over 80 posts today alone, meaning that it's less than useless for anybody who is seriously interested in Python. wxpython did the same years ago, why can't we? Kindest regards. Mark Lawrence. From jugurtha.hadjar at gmail.com Sat Jul 1 18:48:56 2017 From: jugurtha.hadjar at gmail.com (Jugurtha Hadjar) Date: Sat, 1 Jul 2017 23:48:56 +0100 Subject: Best way to ensure user calls methods in correct order? 
In-Reply-To:
References: <6cb78693-2e3b-5b6f-fe7b-b1455e9e048c@gmx.com> <594bd6d3$0$1589$c3e8da3$5496439d@news.astraweb.com> <594e375b$0$1694$e4fe514c@news.kpn.nl>
Message-ID: <300e17c7-8e53-0d68-2bf1-65f837b1f527@gmail.com>

A few questions:

It looks to me like you have a problem that resembles plant processes, where one process's output is another process's input.

Q1: Am I correct in describing your problem as follows?

    input ---> |process a| ---> |process b| ---> |process c| ---> output

Q2: Does every process change the type of its input to something not produced by other processes and not accepted by other processes? I.e., if the first input is a list:

- Does |process a| accept a list and output a dict?
- Does |process b| accept a dict and output a tuple?
- Does |process c| accept a tuple and output a set?

You can do the following:

    def pipeline(obj):
        mapping = {
            list: lambda: pipeline(a(obj)),
            dict: lambda: pipeline(b(obj)),
            tuple: lambda: c(obj),
        }
        return mapping[type(obj)]()

This way, calling pipeline on a list will:

- Call a on it --> dict, then call pipeline on that dict, which will:
- Call b on it --> tuple, then call pipeline on that tuple, which will:
- Call c on it --> set, then return that (last step).

You can also do it this way:

    def pipeline(obj):
        mapping = {
            list: lambda: pipeline(a(obj)),
            dict: lambda: pipeline(b(obj)),
            tuple: lambda: pipeline(c(obj)),
            set: lambda: obj,
        }
        return mapping[type(obj)]()

However, if you call pipeline on the result of, say, b (a tuple), then it will call c on that object, which gives a set, then call pipeline on that set, which returns the set unchanged. In other words, whatever step your data is at, calling pipeline on it will automatically place it at the right processing step.

This is for the first case. If, however, you can't discriminate based on type, you can create your own "types" with custom classes or namedtuples.
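As a runnable sketch of the type-dispatch idea above: the stage functions a, b and c below are invented placeholders (the thread never shows the real ones), chosen only so that each stage changes the container type.

```python
# Hypothetical stage functions, not from the original post.
def a(items):
    """Stage a: list -> dict mapping each item to its length."""
    return {item: len(item) for item in items}


def b(lengths):
    """Stage b: dict -> tuple of (key, value) pairs sorted by key."""
    return tuple(sorted(lengths.items()))


def c(pairs):
    """Stage c: tuple -> set of the values."""
    return {value for _key, value in pairs}


def pipeline(obj):
    """Resume processing at whichever stage matches type(obj)."""
    mapping = {
        list: lambda: pipeline(a(obj)),
        dict: lambda: pipeline(b(obj)),
        tuple: lambda: pipeline(c(obj)),
        set: lambda: obj,
    }
    return mapping[type(obj)]()


# Start from the beginning (a list) or from the middle (a tuple).
full_run = pipeline(["ab", "cde", "f"])
resumed = pipeline((("ab", 2), ("cde", 3)))
print(full_run, resumed)
```

Note that mapping[type(obj)] matches the exact type only, so an instance of a list or dict subclass raises KeyError rather than being treated as its parent type.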
If `a` accepts a list and outputs a list, and b accepts a list and outputs a list, and c accepts a list and outputs a list... You can just do:

    class OutputA(list):
        """Result of a list processed by `a`."""

    class OutputB(list):
        """Result of OutputA processed by `b`."""

    class OutputC(list):
        """Result of OutputB processed by `c`."""

    def pipeline(obj):
        mapping = {
            list: lambda: pipeline(a(obj)),
            OutputA: lambda: pipeline(b(obj)),
            OutputB: lambda: pipeline(c(obj)),
            OutputC: lambda: obj,
        }
        return mapping[type(obj)]()

You'll just have to change your functions like this:

    def a(list_obj):
        ...
        _a_result = the_resulting_list_you_are_about_to_return
        return OutputA(_a_result)

    def b(list_from_a):
        _b_result = also_a_list_you_were_returning
        return OutputB(_b_result)

    def c(list_from_b):
        _c_result = you_get_the_point
        return OutputC(_c_result)

--
~Jugurtha Hadjar,

From rantingrickjohnson at gmail.com Sat Jul 1 20:39:05 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Sat, 1 Jul 2017 17:39:05 -0700 (PDT)
Subject: Can we please dump google groups completely?
In-Reply-To: <0ade795c-5d5d-44e2-96ba-6bdbb142914c@googlegroups.com>
References: <0ade795c-5d5d-44e2-96ba-6bdbb142914c@googlegroups.com>
Message-ID: <1f6e96d3-133b-4a99-a415-112d9df1eefa@googlegroups.com>

On Saturday, July 1, 2017 at 4:19:47 PM UTC-5, bream... at gmail.com wrote:
> Yes I know it's daft that it's where I'm posting from, but
> I'm still banned from using the main mailing list.

Why are you banned from Python-list? What did you do? And is that why you have moved to the "Bream" nym?

> I've reported over 80 posts today alone, meaning that it's
> less than useless for anybody who is seriously interested
> in Python.

Yet you're still here. It baffles the mind!

> wxpython did the same years ago, why can't we?

Does anyone even use WxPython?
I mean, i know it to be a quite feature rich GUI library and all, but it's not very Pythonic, and seg-faults are an all-too-common annoyance From jobmattcon at gmail.com Sat Jul 1 23:38:13 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Sat, 1 Jul 2017 20:38:13 -0700 (PDT) Subject: Is there library to convert AST to DAG tree? Message-ID: <9357974e-970c-4497-bf2a-301597b250ea@googlegroups.com> Is there library to convert AST to DAG tree? From phamp at mindspring.com Sat Jul 1 23:50:25 2017 From: phamp at mindspring.com (pyotr filipivich) Date: Sat, 01 Jul 2017 20:50:25 -0700 Subject: Can we please dump google groups completely? References: <0ade795c-5d5d-44e2-96ba-6bdbb142914c@googlegroups.com> Message-ID: breamoreboy at gmail.com on Sat, 1 Jul 2017 14:19:22 -0700 (PDT) typed in comp.lang.python the following: >Yes I know it's daft that it's where I'm posting from, but I'm still banned from using the main mailing list. I've reported over 80 posts today alone, meaning that it's less than useless for anybody who is seriously interested in Python. wxpython did the same years ago, why can't we? > >Kindest regards. > >Mark Lawrence. Um, on Usenet, you don't get that option (to ban a source). OTOH, you could take it up with your ISP admin. -- pyotr filipivich Next month's Panel: Graft - Boon or blessing? From rosuav at gmail.com Sun Jul 2 01:46:21 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 2 Jul 2017 15:46:21 +1000 Subject: Is there library to convert AST to DAG tree? In-Reply-To: <9357974e-970c-4497-bf2a-301597b250ea@googlegroups.com> References: <9357974e-970c-4497-bf2a-301597b250ea@googlegroups.com> Message-ID: On Sun, Jul 2, 2017 at 1:38 PM, Ho Yeung Lee wrote: > Is there library to convert AST to DAG tree? Given that a syntax tree IS a form of directed acyclic graph, I think your question is either trivially true or insufficiently clear. 
ChrisA From irmen.NOSPAM at xs4all.nl Sun Jul 2 05:03:08 2017 From: irmen.NOSPAM at xs4all.nl (Irmen de Jong) Date: Sun, 2 Jul 2017 11:03:08 +0200 Subject: pythonhosted.org status? Message-ID: <5958b6ce$0$774$e4fe514c@news.xs4all.nl> Hi, I'm using pythonhosted.org to host the docs for various projects but it has either been very slow or unavailable over the past week. Anyone else having the same problems? Should I perhaps consider putting my docs on readthedocs.org instead? Irmen From breamoreboy at gmail.com Sun Jul 2 05:27:46 2017 From: breamoreboy at gmail.com (breamoreboy at gmail.com) Date: Sun, 2 Jul 2017 02:27:46 -0700 (PDT) Subject: pythonhosted.org status? In-Reply-To: <5958b6ce$0$774$e4fe514c@news.xs4all.nl> References: <5958b6ce$0$774$e4fe514c@news.xs4all.nl> Message-ID: On Sunday, July 2, 2017 at 10:03:34 AM UTC+1, Irmen de Jong wrote: > Hi, > I'm using pythonhosted.org to host the docs for various projects but it has either been > very slow or unavailable over the past week. Anyone else having the same problems? > Should I perhaps consider putting my docs on readthedocs.org instead? > > Irmen I get:- "Service Unavailable The service is temporarily unavailable. Please try again later." http://downforeveryoneorjustme.com says it's down. Kindest regards. Mark Lawrence. From jobmattcon at gmail.com Sun Jul 2 09:28:35 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Sun, 2 Jul 2017 06:28:35 -0700 (PDT) Subject: how to create this dependency table from ast? 
Message-ID: <71618d26-63a2-41be-846f-3b185abd6384@googlegroups.com> i find parseprint function not exist in python 2.7 goal is to create a table graph = {'A': ['B', 'C'], 'B': ['C', 'D'], 'C': ['D'], 'D': ['C'], 'E': ['F'], 'F': ['C']} from a = 1 b = 1 c = a + b d = c e = c file = open(r"C:\Users\hello\Documents\testingsource.py", "r") root = ast.parse(file.read()) a b \ / c / \ d e From jobmattcon at gmail.com Sun Jul 2 09:53:31 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Sun, 2 Jul 2017 06:53:31 -0700 (PDT) Subject: how to create this dependency table from ast? In-Reply-To: References: <71618d26-63a2-41be-846f-3b185abd6384@googlegroups.com> Message-ID: On Sunday, July 2, 2017 at 9:32:36 PM UTC+8, ad... at python.org wrote: > admin at python.org: > > Hi, Ho! > > > it is crucial that you dump that fucking Windows of yours and become > real pythonic under Linux ! i do not understand what is difference in result if run in window and linux goal is to create a table graph = {'A': ['C'], 'B': ['C'], 'C': ['D'], 'C': ['E']} from a = 1 b = 1 c = a + b d = c e = c Python 2.7.6 (default, Oct 26 2016, 20:30:19) [GCC 4.8.4] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import ast >>> parseprint("a = 1") Traceback (most recent call last): File "", line 1, in NameError: name 'parseprint' is not defined actually running in bash subsystem of window there is no parseprint to see which attributes for craft the goal From jobmattcon at gmail.com Sun Jul 2 09:59:28 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Sun, 2 Jul 2017 06:59:28 -0700 (PDT) Subject: how to create this dependency table from ast? In-Reply-To: References: <71618d26-63a2-41be-846f-3b185abd6384@googlegroups.com> Message-ID: <52f6b419-c201-4bc2-97ae-d363f5cc5414@googlegroups.com> On Sunday, July 2, 2017 at 9:53:50 PM UTC+8, Ho Yeung Lee wrote: > On Sunday, July 2, 2017 at 9:32:36 PM UTC+8, ad... at python.org wrote: > > admin at python.org: > > > Hi, Ho! 
> > > > > > it is crucial that you dump that fucking Windows of yours and become > > real pythonic under Linux ! > > i do not understand what is difference in result if run in window and linux > > goal is to create a table > > graph = {'A': ['C'], > 'B': ['C'], > 'C': ['D'], > 'C': ['E']} > > from > > a = 1 > b = 1 > c = a + b > d = c > e = c > > Python 2.7.6 (default, Oct 26 2016, 20:30:19) > [GCC 4.8.4] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import ast > >>> parseprint("a = 1") > Traceback (most recent call last): > File "", line 1, in > NameError: name 'parseprint' is not defined > > actually running in bash subsystem of window > there is no parseprint to see which attributes for craft the goal in fact, i had myself only. i do not know others. of course dump my own window result. From jobmattcon at gmail.com Sun Jul 2 10:18:28 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Sun, 2 Jul 2017 07:18:28 -0700 (PDT) Subject: how to create this dependency table from ast? In-Reply-To: <52f6b419-c201-4bc2-97ae-d363f5cc5414@googlegroups.com> References: <71618d26-63a2-41be-846f-3b185abd6384@googlegroups.com> <52f6b419-c201-4bc2-97ae-d363f5cc5414@googlegroups.com> Message-ID: On Sunday, July 2, 2017 at 9:59:48 PM UTC+8, Ho Yeung Lee wrote: > On Sunday, July 2, 2017 at 9:53:50 PM UTC+8, Ho Yeung Lee wrote: > > On Sunday, July 2, 2017 at 9:32:36 PM UTC+8, ad... at python.org wrote: > > > admin at python.org: > > > > Hi, Ho! > > > > > > > > > it is crucial that you dump that fucking Windows of yours and become > > > real pythonic under Linux ! 
> > > > i do not understand what is difference in result if run in window and linux > > > > goal is to create a table > > > > graph = {'A': ['C'], > > 'B': ['C'], > > 'C': ['D'], > > 'C': ['E']} > > > > from > > > > a = 1 > > b = 1 > > c = a + b > > d = c > > e = c > > > > Python 2.7.6 (default, Oct 26 2016, 20:30:19) > > [GCC 4.8.4] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. > > >>> import ast > > >>> parseprint("a = 1") > > Traceback (most recent call last): > > File "", line 1, in > > NameError: name 'parseprint' is not defined > > > > actually running in bash subsystem of window > > there is no parseprint to see which attributes for craft the goal > > > in fact, i had myself only. i do not know others. of course dump my own window result. goal is to create dependency graph for variables From rantingrickjohnson at gmail.com Sun Jul 2 11:24:44 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sun, 2 Jul 2017 08:24:44 -0700 (PDT) Subject: Teaching the "range" function in Python 3 In-Reply-To: References: Message-ID: <4709a0bc-e43f-40a4-bec6-3facf0613999@googlegroups.com> On Thursday, June 29, 2017 at 9:58:23 PM UTC-5, Chris Angelico wrote: > On Fri, Jun 30, 2017 at 12:33 PM, Rick Johnson > > A better *FIRST* example would be something like this: > > > > def add(x, y): > > return x + y > > > > When teaching a student about functions, the first step is > > to help them understand *WHY* they need to use functions, > > and the second is to teach them how to define a function. > > In my simplistic example, the practical necessity of > > functions can be easily intuited by the student without > > the distractions... > > ... except that you've made your first function so trivial > that it doesn't need to be a function. How does that help > anyone understand why they need functions? The point, Christopher, is to showcase in the most simple form possible, that functions package a block of code for easy reuse. 
Actually this example showcases three major elements of functions: (1) That functions accept arguments. (in this case, positional arguments. Keyword arguments and default arguments can be introducted in an advanced lesson) (2)That functions perform a specific task within the "body" of the function. (No need to complicate the example by turning it into a generator, calculating a Fibonacci sequence, starting a new thread, or creating a class factory for your effing bot-net!!!) (3) That functions can return a value to the caller. Easy peasy. ABC, 123. Get it? > What's the point of writing "add(5, 7)" rather than "5 + > 7"? Does the acronym "DRY" mean anything to you? Does the concept of "code reuse" mean anything to you? Does the concept of creating a "general solution" as opposed to a "specific solution" mean anything to you? Of course, no programmer would require such a basic "add" function like this since Python numeric types have mathematical operations available (it would be as practical as training wheels on a tour de france racing bike!), but the point is to offer simple examples that anyone can understand. For example, even a primary school student understands how to add two numbers together to produce a result, and has also been exposed to variables in basic algebra: x + 3 = 10 (What is the value of x?) 7 + y = 10 (What is the value of y?) So any student with even a basic understanding of mathematics can intuit that x and y are simply placeholders for numeric values. > So you need something else around the outside to justify > even having a function at all. Why? So i can teach a noob to write spaghetti code? Okay, just for the Lulz factor, do you care to offer a code example? I'd love to watch your squirm your way out of this one! > In the example in the tutorial, the details ... > > > ... 
of (1) doc-strings, (2) tuple unpacking, (3) an inner > > loop structure, (4) implicitly writing to IO streams using > > the print function, (5) using advanced features of the > > print function, (6) more tuple unpacking (this time with a > > twist of lime), (7) and an implicit newline insertion > > using the print function > > don't need to be explained at all! They *just work*. You > also don't have to explain in detail how garbage collection > works, the way that Python's object and type models work, > or how the electrons in your CPU are kept from portalling > across in a quantum tunnel. None of that matters. So I > support the tutorial's example over yours. Of course you do! Because you don't care if new programmers are presented with poor examples. From your POV, if someone else is suffering, it's not your problem. > Chris "not-so-Angelic, oh?!" From rantingrickjohnson at gmail.com Sun Jul 2 11:39:28 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sun, 2 Jul 2017 08:39:28 -0700 (PDT) Subject: Teaching the "range" function in Python 3 In-Reply-To: References: Message-ID: On Saturday, July 1, 2017 at 12:48:39 AM UTC-5, Christian Gollwitzer wrote: > Am 30.06.17 um 04:33 schrieb Rick Johnson: > > And to further drive home the point, you can manually > > insert a list literal to prove this: > > > > >>> range(10) > > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > > >>> for value in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]: > > ... print(value) > > ... > > 0 > > 1 > > Now you have exactly missed the point that the OP was > asking about. In Python 2, yes, this works and it is the > way he has teached it to his students. 
> However, in Python 3:

Nah, I didn't miss the point, I just forgot to pass in a version number to my virtual python "fetch_answer" function -- which defaults to Python2.x

    def fetch_answer(question, pyver=2):
        return database[pyver].get(question, "Urmm???")

> >>> range(10)
> range(0, 10)
>
> This is not helpful to understand what range does [in
> Python>=3.0], and this is the original question.

Yeah, and thanks for underscoring the obvious. ;-)

PS: Okay! Okay! So I forgot to call list on the range! So sue me!!! ;-)

From nospam at mapson.xyz Sun Jul 2 13:45:39 2017
From: nospam at mapson.xyz (Mark-)
Date: Sun, 2 Jul 2017 17:45:39 +0000 (UTC)
Subject: Teaching
References:
Message-ID:

admin at python.org wrote:
> Irv Kalb:
> > I teach Python at two colleges in Silicon Valley.
>
>
> and I don't give a fuck about that.

Wow, some of you folks are so civilized.

From formisc at gmail.com Sun Jul 2 14:02:41 2017
From: formisc at gmail.com (Andrew Z)
Date: Sun, 2 Jul 2017 11:02:41 -0700 (PDT)
Subject: Proper architecture
Message-ID: <795caf9e-271d-4483-a11c-a83422e43b73@googlegroups.com>

Hello,
I'd appreciate your suggestions for a better approach to the following task.

I have 2 files (2 classes). One (ClassA) has all the logic related to the main workflow of the program. To the other (DB), I'd like to offload all operations with a DB (sql3 in this case).

I'm trying to pass the connection to the main class, but I'm having problems. One of them is that I can't pass the conn as a parameter to the function in one (ClassA.abc()), because I inherit it (function abc()).
I created a self.DBConnection field, but I'm not sure if I'm on the right path...
Code is gutted to highlight the problem.
Thank you

--- code -----

one.py:

    from .DB import *

    class ClassA(OtherObject):
        def __init__(self):
            self.DBConnection = sql3.Connection

        def abc(self, reqId: int):
            DB.writeTicks(self,self.DBConnection,reqId))

DB.py:

    import sqlite3 as sql3
    import sys
    from .tws import TWS
    from utils import current_fn_name

    class DB(object):
        db_location = ''
        # db_location = '../DB/pairs.db'

        def __init__(self, location='../DB/pairs.db'):
            db_location = location
            print(current_fn_name(),' self.db_location = {}'.format(db_location))
            try:
                with open(db_location) as file:
                    pass
            except IOError as e:
                print("Unable to locate the Db @ {}".format(db_location))

        def reqConnection(self):
            try:
                con = sql3.connect(self.db_location)
                con.text_factory = str
            except sql3.Error as e:
                print("Error %s:".format( e.args[0]))
                sys.exit(1)
            return con

        def write(self, con : sql3.Connection, tickId: int):
            con.execute( blah)

From alister.ware at ntlworld.com Sun Jul 2 14:15:11 2017
From: alister.ware at ntlworld.com (alister)
Date: Sun, 02 Jul 2017 18:15:11 GMT
Subject: Teaching
References:
Message-ID:

On Sun, 02 Jul 2017 17:45:39 +0000, Mark- wrote:

> admin at python.org wrote:
>
>> Irv Kalb:
>> > I teach Python at two colleges in Silicon Valley.
>>
>>
>> and I don't give a fuck about that.
>
> Wow, some of you folks are so civilized.

It's not normally like this. We usually get a much higher class of troll who at least look like they are trying to make sense.

I am not sure where this infant who thinks bad words are cool came from; hopefully he will get bored & grow up shortly.

--
Beware of the Turing Tar-pit in which everything is possible but nothing of interest is easy.

From breamoreboy at gmail.com Sun Jul 2 14:30:55 2017
From: breamoreboy at gmail.com (breamoreboy at gmail.com)
Date: Sun, 2 Jul 2017 11:30:55 -0700 (PDT)
Subject: how to create this dependency table from ast?
In-Reply-To:
References: <71618d26-63a2-41be-846f-3b185abd6384@googlegroups.com>
Message-ID:

On Sunday, July 2, 2017 at 2:32:36 PM UTC+1, ad...
at python.org wrote: > admin at python.org: > > Hi, Ho! > > > it is crucial that you dump that fucking Windows of yours and become > real pythonic under Linux ! Isn't this spammer, or is it spanner, cute? I'm rather upset that he's been duplicating my name and that of Chris Angelico on the dread Python google groups, so do I have to keep asking for this scumbag to be locked out? Of course I'm a very naughty boy myself, but accepting my autism doesn't fit in with with the crap code of conduct here I've got used to. I still can't reply on the main list, showing the dual standards that the moderators show. Why do I bother? Because I care about the community. Yours most upsettingly. Mark Lawrence. From cl at isbd.net Sun Jul 2 16:35:31 2017 From: cl at isbd.net (Chris Green) Date: Sun, 2 Jul 2017 21:35:31 +0100 Subject: Can we please dump google groups completely? References: <0ade795c-5d5d-44e2-96ba-6bdbb142914c@googlegroups.com> Message-ID: admin at python.org wrote: > > after I censored all your posts, the forum immediately became well What forum is that? -- Chris Green ? From marko at pacujo.net Sun Jul 2 16:45:56 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Sun, 02 Jul 2017 23:45:56 +0300 Subject: Teaching References: Message-ID: <87wp7qtvnf.fsf@elektro.pacujo.net> alister : > Beware of the Turing Tar-pit in which everything is possible but > nothing of interest is easy. I understand the pitfalls, but the alternatives are simply untenable. In particular, rule languages are the spawn of the devil. Marko From aaronngray.lists at gmail.com Sun Jul 2 17:33:51 2017 From: aaronngray.lists at gmail.com (Aaron Gray) Date: Sun, 2 Jul 2017 22:33:51 +0100 Subject: Problem getting unittest tests for existing project working Message-ID: I am trying to get distorm3's unittests working but to no avail. I am not really a Python programmer so was hoping someone in the know maybe able to fix this for me. 
Here's a GitHub issue I have created for the bug :- https://github.com/gdabah/distorm/issues/118 -- Aaron Gray Independent Open Source Software Engineer, Computer Language Researcher, Information Theorist, and amateur computer scientist. From rosuav at gmail.com Sun Jul 2 18:13:57 2017 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 3 Jul 2017 08:13:57 +1000 Subject: Can we please dump google groups completely? In-Reply-To: References: <0ade795c-5d5d-44e2-96ba-6bdbb142914c@googlegroups.com> Message-ID: On Mon, Jul 3, 2017 at 6:35 AM, Chris Green wrote: > admin at python.org wrote: >> >> after I censored all your posts, the forum immediately became well > > What forum is that? NOTE: Posts on Google Groups are showing up with a variety of forged email addresses, including "admin at python.org". Do not trust email addresses as credentials. ChrisA From cs at zip.com.au Sun Jul 2 20:14:23 2017 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 3 Jul 2017 10:14:23 +1000 Subject: Proper architecture In-Reply-To: <795caf9e-271d-4483-a11c-a83422e43b73@googlegroups.com> References: <795caf9e-271d-4483-a11c-a83422e43b73@googlegroups.com> Message-ID: <20170703001423.GA64678@cskk.homeip.net> On 02Jul2017 11:02, Andrew Z wrote: > I'd appreciate your suggestions for a better approach to the following task. > >I have 2 files ( 2 classes). One (ClassA) has all logic related to the main workflow of the program. Another (DB), I like to offload all operations with a DB ( sql3 in this case). > >I'm trying to pass the connection to the main class, but having problems. One of them, is i can't pass the conn as a parameter to the function in one (ClassA.abc()), because i inherit it ( function abc() ). >I created a self.DBConnection field, but i'm not sure if i'm on the right path... >Code is gutted to highlight the problem. Unfortunately you have gutted the "writeTicks" method, making it harder to see your intent. 
Your separation is ok, but normally one would try to entirely conceal the underlying db connection (the sqlite3 connection) from the main class. So you wouldn't "pass the connection to the main class".

Normally I would have your DB class represent an open (or openable, if you wanted to defer that) database connection. So your main class would go:

    def __init__(self, ...other args...):
        self.db = DB(location="blah.sqlite")

    def abc(self, reqId: int):
        self.db.writeTicks(reqId)

I wouldn't be passing in "self" (your ClassA instance) or self.DBConnection at all. You'd only pass "self" if the "DB" instance needed more information from ClassA; normally you'd just pass that information to writeTicks() along with reqId, so that the DB needs no special knowledge about ClassA.

I've also got a bunch of fine grained remarks about your code that you can take or leave as you see fit:

>one.py:
>from .DB import *

Try to avoid importing "*". It sucks all the names from "DB" into your own namespace. Arguably you only need the DB class itself - all the other functionality comes with it as methods on the class. So:

    from DB import DB

>class ClassA(OtherObject):
>    def __init__(self):
>        self.DBConnection = sql3.Connection

It isn't obvious why you need this. In my example above I'd just make a DB instance and save it as self.db; unless you're controlling different backends that would be all you need.

>    def abc(self, reqId: int):
>        DB.writeTicks(self,self.DBConnection,reqId))

Here you're calling the writeTicks method on the DB class itself. I wouldn't be making that a class method; I'd make it an instance method on a DB instance, so:

    self.db.writeTicks(reqId)

unless there's more to writeTicks (which you've left out).

>DB.py:

Try not to name modules the same as their classes - it leads to confusion. I'd call it "db.py" and make the earlier import:

    from db import DB

>import sqlite3 as sql3

This feels like a pointless abbreviation.

[...]
>class DB(object):
>    db_location = ''
>    # db_location = '../DB/pairs.db'

db_location appears to be a class default. These are normally treated as one would a "constant" in other languages. Stylistically, this normally means you'd write the name in upper case, eg:

    DEFAULT_DB_LOCATION = '../DB/pairs.db'

>    def __init__(self, location='../DB/pairs.db'):
>        db_location = location

And using that would normally look like this:

    def __init__(self, location=None):
        if location is None:
            location = self.DEFAULT_DB_LOCATION

>        print(current_fn_name(),' self.db_location = {}'.format(db_location))
>        try:
>            with open(db_location) as file:
>                pass
>        except IOError as e:
>            print("Unable to locate the Db @ {}".format(db_location))

I'd just use os.path.exists(db_location) myself, or outright make the db connection immediately.

Also, and this actually is important, error messages should go to the program's standard error output (or to some logging system). So your print would look like:

    print("Unable to locate the Db @ {}".format(db_location), file=sys.stderr)

Also, normal Python practice here would not be to issue an error message, but to raise an exception. That way the caller gets to see the problem, and also the caller cannot accidentally start other work in the false belief that the DB instance has been made successfully. So better would be:

    raise ValueError("Unable to locate the Db @ {}".format(db_location))

>    def reqConnection(self):
>        try:
>            con = sql3.connect(self.db_location)
>            con.text_factory = str
>        except sql3.Error as e:
>            print("Error %s:".format( e.args[0]))
>            sys.exit(1)

It is generally bad for a class method (or, usually, any function) to abort the program. Raise an exception; that way (a) the caller gets to see the actual cause of the problem and (b) the caller can decide to abort or try to recover and (c) if the caller does nothing the program will abort on its own, doing this for free.
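The remarks so far can be pulled together into a minimal sketch. It is an illustration under assumptions, not the original poster's code: the temporary paths and the caller-side handling are invented for the demo, while the class constant, the existence check and the choice of ValueError follow the suggestions above.

```python
import os
import sqlite3
import tempfile


class DB:
    """Sketch of a DB wrapper that reports problems by raising."""

    DEFAULT_DB_LOCATION = "../DB/pairs.db"

    def __init__(self, location=None):
        if location is None:
            location = self.DEFAULT_DB_LOCATION
        if not os.path.exists(location):
            # Raise rather than print, so the caller cannot carry on
            # believing the instance was set up successfully.
            raise ValueError("Unable to locate the Db @ {}".format(location))
        self.db_location = location

    def reqConnection(self):
        # No try/except here: any sqlite3.Error propagates to the caller,
        # who decides whether to abort or to recover.
        con = sqlite3.connect(self.db_location)
        con.text_factory = str
        return con


# Caller-side policy (hypothetical): a missing file is handled, not fatal.
missing = os.path.join(tempfile.gettempdir(), "no-such-dir-for-demo", "pairs.db")
try:
    DB(missing)
    outcome = "opened"
except ValueError as exc:
    outcome = "refused: {}".format(exc)

# An existing empty file is accepted; sqlite3 treats it as an empty database.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pairs.db")
    open(path, "w").close()
    con = DB(path).reqConnection()
    value = con.execute("SELECT 1").fetchone()[0]
    con.close()

print(outcome, value)
```

Because nothing inside the class calls sys.exit(), the decision about what a failure means stays with whoever constructs the DB instance.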
Effectively you have embedded "policy" inside your reqConnection method, which is generally unwelcome - it removes the ability for the caller to implement their own policy. And that is an architectural thing (where the policy lies).

>        return con

The easy way to raise the exception here is just to not try/except at all, thus:

    def reqConnection(self):
        return sql3.connect(self.db_location)

or if you really need that text_factory:

    def reqConnection(self):
        con = sql3.connect(self.db_location)
        con.text_factory = str
        return con

>    def write(self, con : sql3.Connection, tickId: int):
>        con.execute( blah)

However I'd make the connection a singleton attribute of the DB class. So I'd usually have __init__ make the connection immediately (which saves you having to "probe" the location):

    def __init__(self, ...):
        ...
        self.con = sql3.connect(self.db_location)

and then write() would go:

    def write(self, tickId: int):
        self.con.execute(blah)

and as you can see that _removes_ any need to pass the connection back to the caller - you don't need to expose a reqConnection method at all, or manage it in the caller. Instead, ClassA can just store the DB instance itself, and let DB look after all the specifics. That is exactly the kind of thing class encapsulation is meant to achieve: the caller (ClassA) can wash its hands of the mechanisms, which are not its problem.

Cheers,
Cameron Simpson

From formisc at gmail.com Sun Jul 2 21:26:30 2017
From: formisc at gmail.com (Andrew Zyman)
Date: Sun, 2 Jul 2017 21:26:30 -0400
Subject: Proper architecture
In-Reply-To: <20170703001423.GA64678@cskk.homeip.net>
References: <795caf9e-271d-4483-a11c-a83422e43b73@googlegroups.com> <20170703001423.GA64678@cskk.homeip.net>
Message-ID:

Cameron,
This is much more than I hoped for.
From quickly looking it over - most of your notes are perfectly on target.

Allow me some time to digest and reply.

Thank you very much!
On 2 Jul 2017 8:14 p.m., "Cameron Simpson" wrote: > On 02Jul2017 11:02, Andrew Z wrote: > >> I'd appreciate your suggestions for a better approach to the following >> task. >> >> I have 2 files ( 2 classes). One (ClassA) has all logic related to the >> main workflow of the program. Another (DB), I like to offload all >> operations with a DB ( sql3 in this case). >> >> I'm trying to pass the connection to the main class, but having problems. >> One of them, is i can't pass the conn as a parameter to the function in one >> (ClassA.abc()), because i inherit it ( function abc() ). >> I created a self.DBConnection field, but i'm not sure if i'm on the right >> path... >> Code is gutted to highlight the problem. >> > > Unfortunately you have gutted the "writeTicks" method, making it harder to > see your intent. > > You separation is ok, but normally one would try to entire conceal the > unerlying db connection (the sqlite3 connection) from the man class. So you > wouldn't "pass the connection to the main class". > > Normally I would have your DB class represent an open (or openable, if you > wanted to defer that) database connection. So your main class would go: > > def __init__(self, ...other args...): > self.db = DB(location="blah.sqlite") > > def abc(self, reqId: int): > self.db.writeTicks(reqId) > > I wouldn't be passing in "self" (your ClassA instance) or > self.DBconnection at all. You'd only pass "self" if the "DB" instance > needed more information from ClassA; normally you'd just pass that > information to writeTicks() along with reqId, so that the DB needs no > special knowledge about ClassA. > > I've also got a bunch of fine grained remarks about your code that you can > take or leave as you see fit: > > one.py: >> from .DB import * >> > > Try to avoid importing "*". It sucks all the names from "DB" into your own > namespace. Arguably you only need the DB class itself - all the other > functionality comes with it as methods on the class. 
So: > > from DB import DB > > class ClassA(OtherObject): >> def __init__(self): >> self.DBConnection = sql3.Connection >> > > It isn't obvious why you need this. In my example above I'd just make a DB > instance and save it as self.db; unless you're controlling different > backends that would be all you need. > > def abc(self, reqId: int): >> DB.writeTicks(self,self.DBConnection,reqId)) >> > > Here you're calling the writeTicks method on the DB class itself. I > wouldn't be making that a class method; I'd make it an instance method on a > DB instance, so: > > self.db.writeTicks(reqId) > > unless there's more to writeTicks (which you've left out). > > DB.py: >> > > Try not to name modules that same as their classes - it leads to > confusion. I'd call it "db.py" and make the earlier import: > > from db import DB > > import sqlite3 as sql3 >> > > This feels like an pointless abbreviation. > > [...] > >> class DB(object): >> db_location = '' >> # db_location = '../DB/pairs.db' >> > > db_location appears to be a class default. These are normally treats as > one would a "constant" in other languages. Stylisticly, this normally means > you'd write the name in upper case, eg: > > DEFAULT_DB_LOCATION = '../DB/pairs.db' > > def __init__(self, location='../DB/pairs.db'): >> db_location = location >> > > And using that would normally look like this: > > def __init__(self, location=None): > if location is None: > location = self.DEFAULT_DB_LOCATION > > print(current_fn_name(),' self.db_location = >> {}'.format(db_location)) >> try: >> with open(db_location) as file: >> pass >> except IOError as e: >> print("Unable to locate the Db @ >> {}".format(db_location)) >> > > I'd just os.path.exists(db_location) myself, or outright make the db > connection immediately. > > Also, and this actually is important, error messages should got the the > program's standard error output (or to some logging system). 
So your print > would look like: > > print("Unable to locate the Db @ {}".format(db_location), > file=sys.stderr) > > Also, normal Python practie here would not be to issue an error message, > but to raise an exception. That way the caller gets to see the problem, and > also the caller cannot accidentally start other work in the false belief > that the DB instance has been made successfully. So better would be: > > raise ValueError("Unable to locate the Db @ {}".format(db_location)) > > def reqConnection(self): >> try: >> con = sql3.connect(self.db_location) >> con.text_factory = str >> except sql3.Error as e: >> print("Error %s:".format( e.args[0])) >> sys.exit(1) >> > > It is generally bad for a class method (or, usually, any funtion) to abort > the program. Raise an exception; that way (a) the caller gets to see the > actual cause of the problem and (b) the caller can decide to abort or try > to recover and (c) if the caller does nothing the program will abort on its > own, doing this for free. > > Effectively you have embedded "polciy" inside your reqConnection method, > generally unwelcome - it removes the ability for the caller to implement > their own policy. And that is an architectural thing (where the policy > lies). > > return con >> > > The easy way to raise the exception here is just to not try/except at all, > thus: > > def reqConnection(self): > return sql3.connect(self.db_location) > > or if you really need that text_factory: > > def reqConnection(self): > con = sql3.connect(self.db_location) > con.text_factory = str > return con > > def write(self, con : sql3.Connection, tickId: int): >> con.execute( blah) >> > > However I'd make the connection a singleton attribute of the DB class. So > I'd usually have __init__ make the connection immediately (which saves you > having to "probe" the location: > > def __init__(self, ...): > ... 
>         self.con = sql3.connect(self.db_location)
>
> and then write() would go:
>
>     def write(self, tickId: int):
>         self.con.execute(blah)
>
> and as you can see that _removes_ any need to pass the connection back to
> the caller - you don't need to expose an reqConnection method at all, or
> manage it in the caller. Instead, ClassA can just store the DB instance
> itself, and let DB look after all the specifics. That is exactly the kind
> of thing class encapsulation is meant to achieve: the caller (Class A) can
> wash its hands of the mechanisms, which are not its problem.
>
> Cheers,
> Cameron Simpson
>

From steve+python at pearwood.info Sun Jul 2 23:57:45 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Mon, 03 Jul 2017 13:57:45 +1000
Subject: Teaching
References:
Message-ID: <5959c0ba$0$1595$c3e8da3$5496439d@news.astraweb.com>

On Mon, 3 Jul 2017 03:45 am, Mark- wrote:

> admin at python.org wrote:
>
>> Irv Kalb:
>> > I teach Python at two colleges in Silicon Valley.
>>
>>
>> and I don't give a fuck about that.
>
> Wow, some of you folks are so civilized.

It's one person, a troll who is sending abusive and racist messages using
forged sender addresses under other people's names. He thinks that makes him
cool and edgy. You know, like a five year old who has had one karate lesson
and then goes around karate chopping everyone and thinking he's Bruce Lee.

He also has a habit of forgetting which persona he is pretending to be, so
he'll send a message as Person X and then abuse Person X in that very message.
So he's not just a jerk, but a dim-witted jerk.

Now that I've said this, if you start seeing a bunch of abuse supposedly
coming from me (especially if it is badly written and childish), you know it's
coming from this troll.

Just killfile messages coming from NetFront and you should see a huge
improvement.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From __peter__ at web.de Mon Jul 3 03:26:19 2017
From: __peter__ at web.de (Peter Otten)
Date: Mon, 03 Jul 2017 09:26:19 +0200
Subject: Problem getting unittest tests for existing project working
References:
Message-ID:

Aaron Gray wrote:

> I am trying to get distorm3's unittests working but to no avail.
>
> I am not really a Python programmer so was hoping someone in the know
> maybe able to fix this for me.

Normally it doesn't work that way...

>
> Here's a GitHub issue I have created for the bug :-
>
> https://github.com/gdabah/distorm/issues/118

...though this time it does:

diff --git a/examples/tests/test_distorm3.py b/examples/tests/test_distorm3.py
index aec1d63..babaacc 100644
--- a/examples/tests/test_distorm3.py
+++ b/examples/tests/test_distorm3.py
@@ -43,11 +43,18 @@ def Assemble(text, mode):
         mode = "amd64"
     else:
         mode = "x86"
-    os.system("yasm.exe -m%s 1.asm" % mode)
+    os.system("yasm -m%s 1.asm" % mode)
     return open("1", "rb").read()
 
-class InstBin(unittest.TestCase):
+class NoTest(unittest.TestCase):
+    def __init__(self):
+        unittest.TestCase.__init__(self, "test_dummy")
+    def test_dummy(self):
+        self.fail("dummy")
+
+class InstBin(NoTest):
     def __init__(self, bin, mode):
+        NoTest.__init__(self)
         bin = bin.decode("hex")
         #fbin[mode].write(bin)
         self.insts = Decompose(0, bin, mode)
@@ -61,8 +68,9 @@ class InstBin(unittest.TestCase):
         self.assertNotEqual(self.inst.rawFlags, 65535)
         self.assertEqual(self.insts[instNo].mnemonic, mnemonic)
 
-class Inst(unittest.TestCase):
+class Inst(NoTest):
     def __init__(self, instText, mode, instNo, features):
+        NoTest.__init__(self)
         modeSize = [16, 32, 64][mode]
         bin = Assemble(instText, modeSize)
         #print map(lambda x: hex(ord(x)), bin)

Notes:

(1) This is a hack; the original code was probably written against an older
version of unittest and is abusing the framework to some extent. The dummy
tests I introduced are ugly but should do no harm.

(2) I tried this on linux only and thus changed 'yasm.exe' to 'yasm'
unconditionally.

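
Regarding note (2): instead of hard-coding one executable name, the assembler
could be looked up at run time so the same test file works on Windows and
Linux. The following is only a sketch under assumptions: the helper name
find_yasm() is hypothetical and not part of distorm, and it relies on
shutil.which(), which needs Python 3.3 or later (the patched test file itself
is Python 2 code).

```python
import shutil


def find_yasm(candidates=("yasm", "yasm.exe")):
    """Return the path of the first assembler found on PATH, or None.

    The candidate names are assumptions; adjust them to the local install.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None
```

The caller would then build the command line from the returned path, and could
skip the assembler-based tests when the result is None rather than failing.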
From robin at reportlab.com Mon Jul 3 05:22:40 2017 From: robin at reportlab.com (Robin Becker) Date: Mon, 3 Jul 2017 10:22:40 +0100 Subject: cgitb weirdness between apache and nginx Message-ID: <1f1108e1-acaa-91df-71e8-12463f70ea1e@chamonix.reportlab.co.uk> I have a simple cgi application and want to use cgitb as a catchall way of getting an error response when things go wrong I did development with nginx + fastcgi + fsgiwrap and the cgitb failure output appears to work fine. The end user wants to use apache 2.4 and there things go wrong. Instead of the cgitb traceback html I am seeing a generic apache server error page. After some serious debugging with strace I find that the cgitb is starting to write, but it appears that it always starts writing something containing the word spam eg --> --> this is from the function reset in cgitb. My application has not written anything at the point this appears and it seems that apache just considers this an erroneous output. In order to make apache behave it seems I must start stdout before an error has occurred. That appears silly when I don't know when an error will occur. Is there a way to get cgitb to check whether anything has been written to stdout when it starts up? Or is there some standard way to initialize stdout so that cgitb will not force apache to the wrong conclusion? -- Robin Becker From mal at europython.eu Mon Jul 3 07:37:11 2017 From: mal at europython.eu (M.-A. Lemburg) Date: Mon, 3 Jul 2017 13:37:11 +0200 Subject: EuroPython 2017: Day tickets available Message-ID: <8dccfbe2-c54f-e5d7-e39a-4085527cae3d@europython.eu> We have now opened ticket sales for day tickets to EuroPython 2017 from July 9-17 in Rimini. * EuroPython 2017 Day Tickets * https://ep2017.europython.eu/en/registration/ These day passes can be bought online and are valid for the day you pick up your badge. We have again tried to make these as affordable as possible for students, pupils and postdocs: * Student day ticket: EUR 55.00 incl. 
22% VAT (only available for pupils, students and postdoctoral researchers; please bring your student card or declaration from University, stating your affiliation, starting and end dates of your contract) * Personal day ticket: EUR 148.00 incl. 22% VAT (for people enjoying Python from home) * Business day ticket: EUR 215.00 excl. VAT, EUR 262.30 incl. 22% VAT (for people using Python to make a living) Full conference tickets (valid for all 8 days) at the on-desk rate are available as well, but we are no longer selling student tickets: * Personal full ticket: EUR 490.00 incl. 22% VAT (for people enjoying Python from home, including students, postdocs, etc.) * Business full ticket: EUR 720.00 excl. VAT, EUR 878.40 incl. 22% VAT (for people using Python to make a living) Please also remember to get your social event ticket for Thursday, July 13. This is not included in the above conference tickets: * EuroPython Social Event: EUR 25.00 incl. 10% VAT per person Please see our registration page for more details. Enjoy, -- EuroPython 2017 Team http://ep2017.europython.eu/ http://www.europython-society.org/ PS: Please forward or retweet to help us reach all interested parties: https://twitter.com/europython/status/881837737879449601 Thanks. From wanderer at dialup4less.com Mon Jul 3 09:30:05 2017 From: wanderer at dialup4less.com (Wanderer) Date: Mon, 3 Jul 2017 06:30:05 -0700 (PDT) Subject: Script to ban authors from Google Groups Message-ID: I use this script to ban authors from Google Groups. You need to create a banned authors text file with each author separated by a new line. For Mozilla you need to compile it to a pyc file, associate pyc files with Python and create a bookmark. You then use the bookmark to enter google groups web page. 

# remove banned author and authors with mostly caps

# to compile to pyc
#>>>import py_compile
#>>>py_compile.compile("file.py")

import urllib2
import webbrowser
import os
from bs4 import BeautifulSoup

PALEMOON = 'Mozilla/5.0 (Windows NT 6.1; WOW64) KHTML/4.11 Gecko/20130308 Firefox/33.0 (PaleMoon/25.2)'
WATERFOX = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:40.0) Gecko/20100101 Firefox/51.1.0 Waterfox/51.1.0'
USERAGENTBASE = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:40.0) Gecko/20100101 '
BROWSERPATH = 'C:\\"Program Files"\\Waterfox\\waterfox.exe'
FILENAME = 'C:\\PyStuff\\pygroup.htm'
WEBPAGE = "https://groups.google.com/forum/?_escaped_fragment_=forum/comp.lang.python%5B1-50%5D"
BANNED_AUTHORS_FILE = 'C:\\PyStuff\\bannedAuthors.txt'


def getUserAgentVersion():
    """ get the useragent version
    returns agentVersion
    -- user agent version in format Firefox/51.0.1 Waterfox/51.0.1
    """
    bvers = os.popen(BROWSERPATH + " -v").read()
    bversList = bvers.split()
    agentVersion = 'Firefox/' + bversList[2] + ' ' + bversList[1] + '/' + bversList[2]
    return agentVersion


def getwebpage(url):
    """ Open a webpage
    url -- the url to the webpage
    returns page -- the source for the webpage
    """
    user_agent = USERAGENTBASE + getUserAgentVersion()
    headers = { 'User-Agent' : user_agent }
    req = urllib2.Request(url, None, headers)
    response = urllib2.urlopen(req)
    page = response.read()
    return page


def getBannedAuthors():
    """ Convert the banned authors text file into a list
    returns bannedAuthors -- list of banned author strings
    """
    f = open(BANNED_AUTHORS_FILE, 'r')
    bannedAuthors = f.read().split('\n')
    f.close()
    return bannedAuthors


def removeBadAuthors(html_doc):
    """ Remove posts from google group by authors that are mostly caps
    or on the Banned List
    html_doc -- an html document
    """
    bannedAuthors = getBannedAuthors()
    print bannedAuthors

    soup = BeautifulSoup(html_doc)
    #print soup.prettify()
    post = soup.find("tr")

    while post is not None:
        author = post.find("td", "author")
        # Only read the author text once we know the cell exists.
        aname = author.get_text() if author is not None else ''
        if author is None:
            print "Author is None"
            oldpost = post
            post = oldpost.find_next_sibling('tr')
            oldpost.decompose()
        elif aname in bannedAuthors:
            print "Author is Banned"
            oldpost = post
            post = oldpost.find_next_sibling('tr')
            oldpost.decompose()
        else:
            print author
            numCaps = 1.0 * sum(1 for c in aname if c.isupper())
            ratio = numCaps/(1.0*len(aname))
            print ratio
            oldpost = post
            post = oldpost.find_next_sibling('tr')
            if ratio > 0.7:
                oldpost.decompose()
                print "BIG"
        if post is None:
            print "Post is NONE"

    f = open(FILENAME, 'w')
    f.write(soup.prettify().encode('ascii', 'ignore'))
    f.close()


def main():
    html_doc = getwebpage(WEBPAGE)
    removeBadAuthors(html_doc)
    webbrowser.open(FILENAME)
    print 'done'


if __name__ == "__main__":
    main()

From irmen.NOSPAM at xs4all.nl Mon Jul 3 13:19:47 2017
From: irmen.NOSPAM at xs4all.nl (Irmen de Jong)
Date: Mon, 3 Jul 2017 19:19:47 +0200
Subject: pythonhosted.org status?
In-Reply-To:
References: <5958b6ce$0$774$e4fe514c@news.xs4all.nl>
Message-ID: <595a7cb3$0$789$e4fe514c@news.xs4all.nl>

On 02/07/2017 11:27, breamoreboy at gmail.com wrote:
> On Sunday, July 2, 2017 at 10:03:34 AM UTC+1, Irmen de Jong wrote:
>> Hi,
>> I'm using pythonhosted.org to host the docs for various projects but it has either been
>> very slow or unavailable over the past week. Anyone else having the same problems?
>> Should I perhaps consider putting my docs on readthedocs.org instead?
>>
>> Irmen
>
> I get:-
>
> "Service Unavailable
>
> The service is temporarily unavailable. Please try again later."
>
> http://downforeveryoneorjustme.com says it's down.

Thanks, I forgot about that site. Quite handy for things like this.

I've decided to mirror my docs on readthedocs.org. An added bonus is their
automated documentation build system, which relieves me from having to create,
zip and upload the docs myself.

>
> Kindest regards.
>
> Mark Lawrence.
> Cheers Irmen From garyfallidis at gmail.com Mon Jul 3 15:24:06 2017 From: garyfallidis at gmail.com (Eleftherios Garyfallidis) Date: Mon, 03 Jul 2017 19:24:06 +0000 Subject: ANN: DIPY 0.12.0 release Message-ID: We are excited to announce a new public release of Diffusion Imaging in Python (DIPY). DIPY 0.12 (Tuesday, 26 June 2017) This release received contributions from 48 developers (the full release notes are at: http://nipy.org/dipy/release0.12.html) Highlights of this release include: - IVIM Simultaneous modeling of perfusion and diffusion. - MAPL, tissue microstructure estimation using Laplacian-regularized MAP-MRI. - DKI-based microstructural modelling. - Free water diffusion tensor imaging. - Denoising using Local PCA. - Streamline-based registration (SLR). - Fiber to bundle coherence (FBC) measures. - Bayesian MRF-based tissue classification. - New API for integrated user interfaces. - New hdf5 file (.pam5) for saving reconstruction results. - Interactive slicing of images, ODFs and peaks. - Updated API to support latest numpy versions. - New system for automatically generating command line interfaces. - Faster computation of cross correlation for image registration. To upgrade, run the following command in your terminal: pip install --upgrade dipy or conda install -c conda-forge dipy This version of DIPY depends on the latest version of nibabel (2.1.0). For any questions go to http://dipy.org, or send an e-mail to neuroimaging at python.org We also have an instant messaging service and chat room available at https://gitter.im/nipy/dipy On behalf of the DIPY developers, Eleftherios Garyfallidis, Ariel Rokem, Serge Koudoro http://dipy.org/developers.html From saxri89 at gmail.com Mon Jul 3 15:47:12 2017 From: saxri89 at gmail.com (Xristos Xristoou) Date: Mon, 3 Jul 2017 12:47:12 -0700 (PDT) Subject: python script is slowly after use multiprocessing Message-ID: i have create an image processing python function. my system have 4 cores + 4 threads. 
i want to use multiprocessing to speed up my function,but anytime to use multiprocessing packages my function is not faster and is 1 minute slowly. any idea why ?first time use multiprocessing packages. main function : if __name__ == '__main__': in_path="C:/Users/username/Desktop/in.tif" out_path="C:/Users/username/Desktop/out.tif" myfun(in_path, out_path) time=3.4 minutes with multiprocessing map : if __name__ == '__main__': p = Pool(processes=4) in_path="C:/Users/username/Desktop/in.tif" out_path="C:/Users/username/Desktop/out.tif" result = p.map(myfun(in_path,out_path)) time=4.4 minutes if __name__ == '__main__': pool = multiprocessing.Pool(4) in_path="C:/Users/username/Desktop/in.tif" out_path="C:/Users/username/Desktop/out.tif" pool.apply_async(myfun, args=(in_path,out_path,)) pool.close() pool.join() time=4.5 minutes From python at mrabarnett.plus.com Mon Jul 3 16:40:49 2017 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 3 Jul 2017 21:40:49 +0100 Subject: python script is slowly after use multiprocessing In-Reply-To: References: Message-ID: <1b1519e7-72de-dcdc-c7d9-c64239db3416@mrabarnett.plus.com> On 2017-07-03 20:47, Xristos Xristoou wrote: > i have create an image processing python function. > > my system have 4 cores + 4 threads. > > i want to use multiprocessing to speed up my function,but anytime to use multiprocessing packages my function is not faster and is 1 minute slowly. any idea why ?first time use multiprocessing packages. 
> > main function : > > if __name__ == '__main__': > in_path="C:/Users/username/Desktop/in.tif" > out_path="C:/Users/username/Desktop/out.tif" > myfun(in_path, out_path) > time=3.4 minutes > > with multiprocessing map : > > if __name__ == '__main__': > p = Pool(processes=4) > in_path="C:/Users/username/Desktop/in.tif" > out_path="C:/Users/username/Desktop/out.tif" > result = p.map(myfun(in_path,out_path)) > time=4.4 minutes > > if __name__ == '__main__': > pool = multiprocessing.Pool(4) > in_path="C:/Users/username/Desktop/in.tif" > out_path="C:/Users/username/Desktop/out.tif" > pool.apply_async(myfun, args=(in_path,out_path,)) > pool.close() > pool.join() > time=4.5 minutes > It looks like you have one function performing one task. It's not splitting the task into multiple parts that can be run in parallel on multiple cores. From ben.usenet at bsb.me.uk Mon Jul 3 16:44:35 2017 From: ben.usenet at bsb.me.uk (Ben Bacarisse) Date: Mon, 03 Jul 2017 21:44:35 +0100 Subject: python script is slowly after use multiprocessing References: Message-ID: <87h8yt5jyk.fsf@bsb.me.uk> Xristos Xristoou writes: > i have create an image processing python function. > > my system have 4 cores + 4 threads. > > i want to use multiprocessing to speed up my function,but anytime to > use multiprocessing packages my function is not faster and is 1 minute > slowly. any idea why ?first time use multiprocessing packages. > > main function : > > if __name__ == '__main__': > in_path="C:/Users/username/Desktop/in.tif" > out_path="C:/Users/username/Desktop/out.tif" > myfun(in_path, out_path) > time=3.4 minutes > > with multiprocessing map : > > if __name__ == '__main__': > p = Pool(processes=4) > in_path="C:/Users/username/Desktop/in.tif" > out_path="C:/Users/username/Desktop/out.tif" > result = p.map(myfun(in_path,out_path)) Don't you get an error telling you that map needs another argument? 

map is intended to be used like this:

    p.map(f, [1, 2, 3])

The function is applied to the elements drawn from the iterable second
argument. Each application runs in a separate process from the pool and the
results are combined together into a list result. So you could process
multiple files in parallel using this method but it can't run your function
on that one input file in parallel.

-- 
Ben.

From ofekmeister at gmail.com Mon Jul 3 22:36:14 2017
From: ofekmeister at gmail.com (ofekmeister at gmail.com)
Date: Mon, 3 Jul 2017 19:36:14 -0700 (PDT)
Subject: Privy: An easy, fast lib to password-protect your data
Message-ID: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com>

https://github.com/ofek/privy

From maylinge0903 at gmail.com Tue Jul 4 05:01:08 2017
From: maylinge0903 at gmail.com (Mayling ge)
Date: Tue, 4 Jul 2017 17:01:08 +0800
Subject: memory leak with re.match
Message-ID:

Hi,

My function handles a file line by line, as shown below. Multiple error
patterns are defined and each needs to be applied to every line. I use
multiprocessing.Pool to handle the file in blocks.

Memory usage increases to 2G for a 1G file, and stays at 2G even after the
file has been processed. The file is closed at the end.

If I comment out the call to re_pat.match, memory usage is normal and stays
under 100Mb.

Am I using re the wrong way? I cannot figure out a way to fix the memory
leak. And I googled.

def line_match(lines, errors):
    for error in errors:
        try:
            re_pat = re.compile(error['pattern'])
        except Exception:
            print_error
            continue
        for line in lines:
            m = re_pat.match(line)
            # other code to handle matched object

def process_large_file(fo):
    p = multiprocessing.Pool()
    while True:
        lines = list(itertools.islice(fo, line_per_proc))
        if not lines:
            break
        result = p.apply_async(line_match, args=(lines, errors))

Notes: I omit some code as I think the significant difference is with/without
re_pat.match(...)

Regards,

-Meiling

From mal at europython.eu Tue Jul 4 10:22:41 2017
From: mal at europython.eu (M.-A. Lemburg)
Date: Tue, 4 Jul 2017 16:22:41 +0200
Subject: EuroPython 2017: Free Intel Distribution for Python
Message-ID:

We are very pleased to have Intel as Diamond Sponsor for EuroPython 2017. You
can visit them at the most central booth in our exhibit area, the Sala della
Piazza, and take the opportunity to chat with their staff.

Please find below a hosted blog post from Intel that offers us an exciting
glimpse at the recently released, free Intel® Distribution for Python:

http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for

Enjoy,
--
EuroPython 2017 Team
http://ep2017.europython.eu/
http://www.europython-society.org/

From steve+python at pearwood.info Tue Jul 4 11:40:14 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Wed, 05 Jul 2017 01:40:14 +1000
Subject: Spammy spam spam spam spam
References: <9a6c065d-eca2-407b-8c0e-7d57e9feba65@googlegroups.com>
Message-ID: <595bb6e0$0$1600$c3e8da3$5496439d@news.astraweb.com>

On Tue, 4 Jul 2017 10:55 pm, Case Solution & Analysis wrote:

> Our e-mail address is CASESOLUTIONSCENTRE (AT) GMAIL (DOT) COM. Please replace
> (at) by @ and (dot) by .

Since we don't yet have a protocol for transmitting a punch to the face over
TCP/IP, would it be wrong of me to wish that some white knight hacker would
DDOS these spammy bastards until their supposed business goes broke?

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From mail at timgolden.me.uk Tue Jul 4 11:50:40 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 4 Jul 2017 16:50:40 +0100 Subject: Spammy spam spam spam spam In-Reply-To: <595bb6e0$0$1600$c3e8da3$5496439d@news.astraweb.com> References: <9a6c065d-eca2-407b-8c0e-7d57e9feba65@googlegroups.com> <595bb6e0$0$1600$c3e8da3$5496439d@news.astraweb.com> Message-ID: <5c4261bb-ed87-3c95-00dc-2a6c369b92b2@timgolden.me.uk> On 04/07/2017 16:40, Steve D'Aprano wrote: > On Tue, 4 Jul 2017 10:55 pm, Case Solution & Analysis wrote: > >> Our e-mail address is CASESOLUTIONSCENTRE (AT) GMAIL (DOT) COM. Please replace >> (at) by @ and (dot) by . > > Since we don't yet have a protocol for transmitting a punch to the face over > TCP/IP, is it be wrong of me to wish that some white knight hacker would DDOS > these spammy bastards until their supposed business goes broke? At risk of annoying you further... we've been filtering them from the mailing list for a while now. TJG From saxri89 at gmail.com Tue Jul 4 13:24:22 2017 From: saxri89 at gmail.com (Xristos Xristoou) Date: Tue, 4 Jul 2017 10:24:22 -0700 (PDT) Subject: python script is slowly after use multiprocessing In-Reply-To: References: Message-ID: @MRAB tell me your proposal for this ? @Ben Bacarisse i dont get some error,i have wrong map ? From python at mrabarnett.plus.com Tue Jul 4 14:12:14 2017 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 4 Jul 2017 19:12:14 +0100 Subject: python script is slowly after use multiprocessing In-Reply-To: References: Message-ID: On 2017-07-04 18:24, Xristos Xristoou wrote: > @MRAB tell me your proposal for this ? I don't have any suggestions because you haven't given any details about the function. > @Ben Bacarisse i dont get some error,i have wrong map ? > That code will call the function and then try to pass its result to .map, at which point it will complain about the missing argument. 
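
To make the point about .map concrete, here is a minimal sketch of how
Pool.map is normally called: it receives the function object itself plus an
iterable of inputs, and calls the function once per element in the worker
processes. The function process_image(), the helper build_jobs() and the file
names are placeholders invented for this example; they are not the original
poster's code, and nothing here reads or writes real image files.

```python
import multiprocessing


def process_image(paths):
    """Stand-in for the real image-processing function (hypothetical).

    Takes one (in_path, out_path) pair and returns a short description
    instead of touching any files.
    """
    in_path, out_path = paths
    return "{} -> {}".format(in_path, out_path)


def build_jobs(names):
    """Build one (in_path, out_path) pair per input name."""
    return [(name + ".tif", name + "_out.tif") for name in names]


if __name__ == "__main__":
    jobs = build_jobs(["a", "b", "c", "d"])
    with multiprocessing.Pool(processes=4) as pool:
        # Pass the function itself plus an iterable of arguments.
        # Writing pool.map(process_image(job)) would instead call the
        # function in the parent process before map is even reached.
        results = pool.map(process_image, jobs)
    for line in results:
        print(line)
```

A pool only helps when there are several independent pieces of work. With a
single input file, the work would first have to be split (for example into
tiles of the image) before map has anything to distribute; how to do that
depends on the function, which was not shown in the thread.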
From tjreedy at udel.edu Tue Jul 4 15:54:19 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 15:54:19 -0400 Subject: If you like Python, star Github python [forward from pydev] Message-ID: On pydev list, 2017-06-30 15:59 GMT+02:00 Victor Stinner : > GitHub has a showcase page of hosted programming languages: > https://github.com/showcases/programming-languages > Python is only #11 with 8,539 stars, behind PHP and Ruby! > Hey, you should "like" ("star"?) the CPython project if you like Python! > https://github.com/python/cpython/ > Click on "Star" at the top right. and >4 days later, we got +2,389 new stars, thank you! (8,539 => 10,928) >Python moved from the 11th place to the 9th, before Elixir and Julia. >Python is still behind Ruby (12,511) and PHP (12,318), but it's already much better than before! Someone else posted to reddit. -- Terry Jan Reedy From tjreedy at udel.edu Tue Jul 4 18:05:45 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 4 Jul 2017 18:05:45 -0400 Subject: EuroPython 2017: Free Intel Distribution for Python In-Reply-To: References: Message-ID: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> On 7/4/2017 10:22 AM, M.-A. Lemburg wrote: > We are very pleased to have Intel as Diamond Sponsor for EuroPython > 2017. You can visit them at the most central booth in our exhibit > area, the Sala della Piazza, and take the opportunity to chat with > their staff. > > Please find below a hosted blog post from Intel, that offers us an > exciting glimpse at the recently released free, Intel? Distribution > for Python: > > http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for I looked but did not find the most important thing. What version of Python? Also, if 3.6 rather than 2.7, do they plan to keep up to date? Free for how long? 90 days? Until further notice? Indefinitely? 
-- Terry Jan Reedy From rosuav at gmail.com Tue Jul 4 18:21:12 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Jul 2017 08:21:12 +1000 Subject: EuroPython 2017: Free Intel Distribution for Python In-Reply-To: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> Message-ID: On Wed, Jul 5, 2017 at 8:05 AM, Terry Reedy wrote: > On 7/4/2017 10:22 AM, M.-A. Lemburg wrote: >> >> We are very pleased to have Intel as Diamond Sponsor for EuroPython >> 2017. You can visit them at the most central booth in our exhibit >> area, the Sala della Piazza, and take the opportunity to chat with >> their staff. >> >> Please find below a hosted blog post from Intel, that offers us an >> exciting glimpse at the recently released free, Intel? Distribution >> for Python: >> >> >> http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for > > > I looked but did not find the most important thing. > What version of Python? > > Also, if 3.6 rather than 2.7, do they plan to keep up to date? > Free for how long? 90 days? Until further notice? Indefinitely? I think that page is basically just a teaser to say "come to our booth". They want people to ask those questions in person. Sucks for those of us who aren't within a thousand miles of EuroPython. ChrisA From cs at zip.com.au Tue Jul 4 18:46:08 2017 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 5 Jul 2017 08:46:08 +1000 Subject: memory leak with re.match In-Reply-To: References: Message-ID: <20170704224608.GA56706@cskk.homeip.net> On 04Jul2017 17:01, Mayling ge wrote: > My function is in the following way to handle file line by line. There are > multiple error patterns defined and need to apply to each line. I use > multiprocessing.Pool to handle the file in block. > > The memory usage increases to 2G for a 1G file. And stays in 2G even after > the file processing. File closed in the end. 
> > If I comment out the call to re_pat.match, memory usage is normal and > keeps under 100Mb. [...] > > def line_match(lines, errors) > for error in errors: > try: > re_pat = re.compile(error['pattern']) > except Exception: > print_error > continue > for line in lines: > m = re_pat.match(line) > # other code to handle matched object [...] > Notes: I omit some code as I think the significant difference is > with/without re_pat.match(...) Hmm. Does the handling code (omitted) keep the line or match object in memory? If leaving out the "m = re_pat.match(line)" triggers the leak, and presuming that line itself doesn't leak, then I would start to suspect the handling code is not letting go of the match object "m" or of the line (which is probably attached to the match object "m" to support things like m.group() and so forth). So you might need to show us the handling code. Cheers, Cameron Simpson From greg.ewing at canterbury.ac.nz Tue Jul 4 18:46:09 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 05 Jul 2017 10:46:09 +1200 Subject: EuroPython 2017: Free Intel Distribution for Python In-Reply-To: References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> Message-ID: Chris Angelico wrote: > On 7/4/2017 10:22 AM, M.-A. Lemburg wrote: >> I looked but did not find the most important thing. >> What version of Python? > > I think that page is basically just a teaser to say "come to our > booth". Google found me a (sort of) download page: https://software.seek.intel.com/python-distribution I say "sort of" because you have to register and agree to let them spam you before they'll give you anything. It does say: Operating systems: Windows* 7 or later, macOS, and Linux Python* versions: 2.7.X, 3.5.X, 3.6 It doesn't say whether source is included. 
--
Greg

From python at mrabarnett.plus.com Tue Jul 4 18:49:41 2017
From: python at mrabarnett.plus.com (MRAB)
Date: Tue, 4 Jul 2017 23:49:41 +0100
Subject: EuroPython 2017: Free Intel Distribution for Python
In-Reply-To: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu>
References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu>
Message-ID: <3b19ae5b-cff1-1285-648a-06d93d285a43@mrabarnett.plus.com>

On 2017-07-04 23:05, Terry Reedy wrote:
> On 7/4/2017 10:22 AM, M.-A. Lemburg wrote:
>> We are very pleased to have Intel as Diamond Sponsor for EuroPython
>> 2017. You can visit them at the most central booth in our exhibit
>> area, the Sala della Piazza, and take the opportunity to chat with
>> their staff.
>>
>> Please find below a hosted blog post from Intel, that offers us an
>> exciting glimpse at the recently released free, Intel® Distribution
>> for Python:
>>
>> http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for
>
> I looked but did not find the most important thing.
> What version of Python?
>
From a brief search it appears to be Python 2.7 and Python 3.5.

> Also, if 3.6 rather than 2.7, do they plan to keep up to date?
> Free for how long? 90 days? Until further notice? Indefinitely?
>

From ofekmeister at gmail.com Tue Jul 4 20:39:17 2017
From: ofekmeister at gmail.com (ofekmeister at gmail.com)
Date: Tue, 4 Jul 2017 17:39:17 -0700 (PDT)
Subject: Privy: An easy, fast lib to password-protect your data
In-Reply-To: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com>
References: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com>
Message-ID: <3d9353da-01b0-4de5-b16a-d067d20f9487@googlegroups.com>

On Monday, July 3, 2017 at 10:36:39 PM UTC-4, ofekm...
at gmail.com wrote:
> https://github.com/ofek/privy

Bump b/c spam

From flebber.crue at gmail.com Wed Jul 5 02:50:42 2017
From: flebber.crue at gmail.com (Sayth Renshaw)
Date: Tue, 4 Jul 2017 23:50:42 -0700 (PDT)
Subject: Generator - Provide parameters to url - requests
Message-ID: <328f42bb-ba23-4199-9f3a-9ec1829bc448@googlegroups.com>

Hi

I am struggling to figure out how I can create a generator to provide values
to my url. My url needs to insert the year month and day in the url not as
params to the url.


import json
import requests
import datetime

# using this I can create a list of dates for the first 210 days of this year.

base = datetime.datetime(2017,1,1)

def datesRange(max, min):
    date_list = (base - datetime.timedelta(days=x) for x in range(max,min))
    DAY = date_list.day
    MONTH = date_list.month
    YEAR = date_list.year
    yield DAY, MONTH, YEAR

dateValues = datesRange(-210,0)

def create_url(day, month, year):
    https://api.tatts.com/sales/vmax/web/data/racing/{0}/{1}/{2}/sr/full".format(YEAR,MONTH,DAY)
    return url

Then I want to insert them in this url one at a time from the generator

try:
    r = requests.get(INSERT_URL_HERE)
    if r.status_code == requests.codes.ok then:
        # open a file for writing using url paramters
        with open(SR + DAY + MONTH + YEAR + '.json', 'w') as f:
            # Do stuff from here not relevant to question.

I have just gotten lost.

Is there an easier way to go about this?

Cheers

Sayth

From maylinge0903 at gmail.com Wed Jul 5 03:04:52 2017
From: maylinge0903 at gmail.com (Mayling ge)
Date: Wed, 5 Jul 2017 15:04:52 +0800
Subject: memory leak with re.match
In-Reply-To: <20170704224608.GA56706@cskk.homeip.net>
References:
Message-ID:

Thanks. I actually commented out all the handling code. The loop ends with
the re_pat.match call and nothing follows it.

Sent from Mail Master

On 07/05/2017 08:31, Cameron Simpson wrote:

On 04Jul2017 17:01, Mayling ge wrote:
> My function is in the following way to handle file line by line.
There are
> multiple error patterns defined and need to apply to each line. I use
> multiprocessing.Pool to handle the file in block.
>
> The memory usage increases to 2G for a 1G file. And stays in 2G even after
> the file processing. File closed in the end.
>
> If I comment out the call to re_pat.match, memory usage is normal and
> keeps under 100Mb.
[...]
>
> def line_match(lines, errors)
>     for error in errors:
>         try:
>             re_pat = re.compile(error['pattern'])
>         except Exception:
>             print_error
>             continue
>         for line in lines:
>             m = re_pat.match(line)
>             # other code to handle matched object
[...]
> Notes: I omit some code as I think the significant difference is
> with/without re_pat.match(...)

Hmm. Does the handling code (omitted) keep the line or match object in memory?

If leaving out the "m = re_pat.match(line)" makes the leak go away, and
presuming that line itself doesn't leak, then I would start to suspect the
handling code is not letting go of the match object "m" or of the line (which
is probably attached to the match object "m" to support things like m.group()
and so forth).

So you might need to show us the handling code.

Cheers,
Cameron Simpson

From kwpolska at gmail.com Wed Jul 5 03:27:15 2017
From: kwpolska at gmail.com (Chris Warrick)
Date: Wed, 5 Jul 2017 09:27:15 +0200
Subject: Privy: An easy, fast lib to password-protect your data
In-Reply-To: <3d9353da-01b0-4de5-b16a-d067d20f9487@googlegroups.com>
References: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com>
 <3d9353da-01b0-4de5-b16a-d067d20f9487@googlegroups.com>
Message-ID:

On 5 July 2017 at 02:39, wrote:
> On Monday, July 3, 2017 at 10:36:39 PM UTC-4, ofekm... at gmail.com wrote:
>> https://github.com/ofek/privy
>
> Bump b/c spam
> --
> https://mail.python.org/mailman/listinfo/python-list

The person spamming right now would be you. You just posted a link,
without any explanations, any marketing blurbs, nothing.
Why would I use your tool instead of something established, that has
been properly audited - say, PGP for example? How do I know your
one-man project has no security holes, backdoors, or other
vulnerabilities? How do I know that the encryption method chosen by
you is sound? If there is no leaked data?

And I really dislike the description of your project:

> Privy is a small and fast utility for password-protecting secret
> data such as API keys, cryptocurrency wallets, or seeds for digital
> signatures.

What does "password-protecting" mean? Why is this not "encrypting"?
How do you expect this to work with API keys?

--
Chris Warrick
PGP: 5EAAEA16

From __peter__ at web.de Wed Jul 5 03:28:48 2017
From: __peter__ at web.de (Peter Otten)
Date: Wed, 05 Jul 2017 09:28:48 +0200
Subject: Generator - Provide parameters to url - requests
References: <328f42bb-ba23-4199-9f3a-9ec1829bc448@googlegroups.com>
Message-ID:

Sayth Renshaw wrote:

> Hi
>
> I am struggling to figure out how I can create a generator to provide
> values to my url. My url needs to insert the year month and day in the url
> not as params to the url.
>
>
> import json
> import requests
> import datetime
>
> # using this I can create a list of dates for the first 210 days of this
> # year.
>
> base = datetime.datetime(2017,1,1)
>
> def datesRange(max, min):
>     date_list = (base - datetime.timedelta(days=x) for x in
>     range(max,min))
>     DAY = date_list.day
>     MONTH = date_list.month
>     YEAR = date_list.year
>     yield DAY, MONTH, YEAR

A single yield usually doesn't make sense -- you need one loop to generate
the data

ONE_DAY = datetime.timedelta(days=1)

def dates(first, numdays):
    # generate datetime objects for extra clarity
    # note there are no implicit arguments like `base` in your code
    for _ in range(numdays):
        yield first
        first += ONE_DAY

...
and another one to consume it

def create_url(date):
    return "https://example.com/{0.year}/{0.month}/{0.day}/".format(
        date
    )

def create_filename(date):
    # use fixed widths for month and day to avoid ambiguous
    # filenames, e. g. is "2017111.json" jan-11 or nov-1?
    return "{0.year}{0.month:02}{0.day:02}.json".format(date)

FIRST = datetime.datetime(2017, 1, 1)

for date in dates(FIRST, numdays=210):
    url = create_url(date)
    r = requests.get(url)
    filename = create_filename(date)
    with open(filename, "w") as f:
        ...

> dateValues = datesRange(-210,0)
>
> def create_url(day, month, year):
>     https://api.tatts.com/sales/vmax/web/data/racing/{0}/{1}/{2}/sr/full".format(YEAR,MONTH,DAY)
>     return url
>
> Then I want to insert them in this url one at a time from the generator
>
> try:
>     r = requests.get(INSERT_URL_HERE)
>     if r.status_code == requests.codes.ok then:
>         # open a file for writing using url paramters
>         with open(SR + DAY + MONTH + YEAR + '.json', 'w') as f:
>             # Do stuff from here not relevant to question.
>
> I have just gotten lost.
>
> Is there an easier way to go about this?
>
> Cheers
>
> Sayth

From frank at chagford.com Wed Jul 5 03:50:23 2017
From: frank at chagford.com (Frank Millman)
Date: Wed, 5 Jul 2017 09:50:23 +0200
Subject: Generator - Provide parameters to url - requests
In-Reply-To: <328f42bb-ba23-4199-9f3a-9ec1829bc448@googlegroups.com>
References: <328f42bb-ba23-4199-9f3a-9ec1829bc448@googlegroups.com>
Message-ID:

"Sayth Renshaw" wrote in message
news:328f42bb-ba23-4199-9f3a-9ec1829bc448 at googlegroups.com...
>
> Hi
>
> I am struggling to figure out how I can create a generator to provide
> values to my url. My url needs to insert the year month and day in the url
> not as params to the url.
>
> import json
> import requests
> import datetime
>
> # using this I can create a list of dates for the first 210 days of this
> year.
>
> base = datetime.datetime(2017,1,1)
>
> def datesRange(max, min):
>     date_list = (base - datetime.timedelta(days=x) for x in
> range(max,min))
>     DAY = date_list.day
>     MONTH = date_list.month
>     YEAR = date_list.year
>     yield DAY, MONTH, YEAR
>
> dateValues = datesRange(-210,0)

Are you sure that this works?

The easiest way to test it is -

>>> list(datesRange(-210, 0))

If I try this I get AttributeError: 'generator' object has no attribute 'day'

I would write it like this -

def datesRange(max, min):
    for day in range(max, min):
        date = base - datetime.timedelta(days=day)
        yield date.day, date.month, date.year

Actually, I have just read Peter's response, and his version is much better,
but this one is closer to your original code.

>
> def create_url(day, month, year):
>
> https://api.tatts.com/sales/vmax/web/data/racing/{0}/{1}/{2}/sr/full".format(YEAR,MONTH,DAY)
>     return url
>
> Then I want to insert them in this url one at a time from the generator
>

To do this, you need some kind of loop to iterate over your generator -

for day, month, year in datesRange(-210, 0):
    # do something

Does this help?

Frank Millman

From sjeik_appie at hotmail.com Wed Jul 5 03:52:12 2017
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Wed, 5 Jul 2017 07:52:12 +0000
Subject: memory leak with re.match
In-Reply-To:
References:
Message-ID:

From: Python-list on behalf of Mayling ge
Sent: Tuesday, July 4, 2017 9:01 AM
To: python-list
Subject: memory leak with re.match

   Hi,

   My function is in the following way to handle file line by line. There are
   multiple error patterns defined and need to apply to each line. I use
   multiprocessing.Pool to handle the file in block.

   The memory usage increases to 2G for a 1G file. And stays in 2G even after
   the file processing. File closed in the end.

   If I comment out the call to re_pat.match, memory usage is normal and
   keeps under 100Mb.

   am I using re in a wrong way? I cannot figure out a way to fix the memory
   leak.
   And I googled .

   def line_match(lines, errors)

           lines = list(itertools.islice(fo, line_per_proc))

===> do you really need to listify the iterator?

           if not lines:
               break
           result = p.apply_async(line_match, args=(errors, lines))

===> the signature of line_match is (lines, errors), in args you do (errors, lines)

From pozzugno at gmail.com Wed Jul 5 03:56:53 2017
From: pozzugno at gmail.com (pozz)
Date: Wed, 5 Jul 2017 09:56:53 +0200
Subject: Python threading and sharing variables
Message-ID:

I'd like to launch *and control* a long thread. I want to print the
progress of the long thread in the main thread. It's a GUI script, here
it's a console script only to simplify.

import threading
import time

class MyClass:
    def start(self):
        self.max = 5
        self.pause = 1
        t = threading.Thread(target=self.thread)
        t.start()
        i = -1
        while self.cnt != self.max - 1:
            if i != self.cnt:
                print("{:d}".format(self.cnt))
                i = self.cnt
        print("Finished")

    def thread(self):
        for i in range(self.max):
            self.cnt = i
            time.sleep(self.pause)

c = MyClass()
c.start()

It seems it works, but I'm not sure it is the correct way to share the
variable self.cnt. It is only written in the long thread and only read
in the main thread.
Could a single Python instruction be interrupted (in this case, self.cnt
= i)? Should I use a locking mechanism when reading/writing?

What about if the variable is more complex, for example a list or
dictionary? Even in this case, is it safe to avoid locking on a shared
variable if the operation on the variable is performed in a single
Python instruction?

From pozzugno at gmail.com Wed Jul 5 04:05:14 2017
From: pozzugno at gmail.com (pozz)
Date: Wed, 5 Jul 2017 10:05:14 +0200
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

Il 05/07/2017 09:56, pozz ha scritto:
> [...]
> It seems it works, but I'm not sure it is the correct way to share the
> variable self.cnt.
It is only written in the long thread and only read
> in the main thread.
> Could a single Python instruction be interrupted (in this case, self.cnt
> = i)? Should I use a locking mechanism when reading/writing?
>
> What about if the variable is more complex, for example a list or
> dictionary? Even in this case, is it safe to avoid locking on a shared
> variable if the operation on the variable is performed in a single
> Python instruction?

Ok, maybe this atomic behaviour depends on the Python implementation, so
it's better to avoid relying on atomicity and use a lock to access
shared variables from different running threads.

However in my simple case one thread writes the variable and the other
reads it. In this case is it safe to avoid locks?

From tomuxiong at gmx.com Wed Jul 5 04:20:16 2017
From: tomuxiong at gmx.com (Thomas Nyberg)
Date: Wed, 5 Jul 2017 10:20:16 +0200
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID: <5d6daade-35f1-d773-a493-aebc8e3a9a22@gmx.com>

On 07/05/2017 09:56 AM, pozz wrote:
> It seems it works, but I'm not sure it is the correct way to share the
> variable self.cnt. It is only written in the long thread and only read
> in the main thread.
> Could a single Python instruction be interrupted (in this case, self.cnt
> = i)? Should I use a locking mechanism when reading/writing?
>
> What about if the variable is more complex, for example a list or
> dictionary? Even in this case, is it safe to avoid locking on a shared
> variable if the operation on the variable is performed in a single
> Python instruction?

I think it would be clearer if you used a queue.
Here's an example of a simplified version showing how the communication
might work:

test.py
---------------------------
from threading import Thread
from queue import Queue
from time import sleep

def main():
    q = Queue()
    t = Thread(target=worker, args=(q,))
    t.start()

    while True:
        status = q.get()
        if status < 0:
            break
        print(status)
    t.join()

def worker(q, limit=5):
    for i in range(limit):
        sleep(1) # Simulate some work
        q.put(i)
    q.put(-1) # Some sort of value to indicate being finished

main()
--------------------------------

$ python3 test.py
0
1
2
3
4

Not sure if this helps, but I personally find it clearer than the shared
class variable method you're using.

Cheers,
Thomas

From mal at europython.eu Wed Jul 5 04:23:46 2017
From: mal at europython.eu (M.-A. Lemburg)
Date: Wed, 5 Jul 2017 10:23:46 +0200
Subject: EuroPython 2017: Free Intel Distribution for Python
In-Reply-To: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu>
References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu>
Message-ID:

On 05.07.2017 00:05, Terry Reedy wrote:
> On 7/4/2017 10:22 AM, M.-A. Lemburg wrote:
>> We are very pleased to have Intel as Diamond Sponsor for EuroPython
>> 2017. You can visit them at the most central booth in our exhibit
>> area, the Sala della Piazza, and take the opportunity to chat with
>> their staff.
>>
>> Please find below a hosted blog post from Intel, that offers us an
>> exciting glimpse at the recently released free, Intel® Distribution
>> for Python:
>>
>> http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for
>>
>
> I looked but did not find the most important thing.
> What version of Python?
>
> Also, if 3.6 rather than 2.7, do they plan to keep up to date?
> Free for how long? 90 days? Until further notice? Indefinitely?

I'm sure they will be happy to answer all those questions
at their booth :-)

More details are available here, if you can't attend EuroPython:

https://software.intel.com/en-us/distribution-for-python

(look for e.g.
"key specifications")

--
Marc-Andre Lemburg
EuroPython Society Chair
http://www.europython-society.org/
http://www.malemburg.com/

From tomuxiong at gmx.com Wed Jul 5 04:26:09 2017
From: tomuxiong at gmx.com (Thomas Nyberg)
Date: Wed, 5 Jul 2017 10:26:09 +0200
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

On 07/05/2017 10:05 AM, pozz wrote:
>
> Ok, maybe this atomic behaviour depends on the Python implementation, so
> it's better to avoid relying on atomicity and use a lock to access
> shared variables from different running thread.
>
> However in my simple case one thread writes the variable and the other
> reads it. In this case is it safe to avoid locks?

I think the general rule would be that no, it's not safe to skip the
locks. It's true that with cpython, your method shouldn't run into
problems, but that's just a quirk of how you're using it. I look at it from
another perspective: if it's true that no locks actually are necessary,
then why are you using the shared variables in the first place? In this
case, the information needs to be sent from the worker thread to the
main thread, but you don't need any other threads to see it. This
only really requires a single "channel" (e.g. a queue) and not for the
variable to be further exposed.

Also personally I just find it much clearer. I was very confused about what
your code was doing and basically needed to step through it to understand.

Cheers,
Thomas

From rosuav at gmail.com Wed Jul 5 04:26:28 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 5 Jul 2017 18:26:28 +1000
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

On Wed, Jul 5, 2017 at 5:56 PM, pozz wrote:
> It seems it works, but I'm not sure it is the correct way to share the
> variable self.cnt. It is only written in the long thread and only read in
> the main thread.
> Could a single Python instruction be interrupted (in this case, self.cnt =
> i)?
Should I use a locking mechanism when reading/writing?
>
> What about if the variable is more complex, for example a list or
> dictionary? Even in this case, is it safe to avoid locking on a shared
> variable if the operation on the variable is performed in a single Python
> instruction?

You can be confident that a single assignment will happen atomically.
Even if "self.cnt = i" requires multiple instructions to perform
(which it probably doesn't), there's still going to be some moment
before the change has happened at all, and then some moment when the
change has completely happened, and you won't get a context switch in
between. This is NOT the case if you try to do an increment (eg
"self.cnt += 1"), but for what you're doing here, it should be fine.

That said, though, you may still find that a queue is better, as per
Thomas's suggestion.

ChrisA

From tomuxiong at gmx.com Wed Jul 5 04:28:53 2017
From: tomuxiong at gmx.com (Thomas Nyberg)
Date: Wed, 5 Jul 2017 10:28:53 +0200
Subject: Python threading and sharing variables
In-Reply-To: <5d6daade-35f1-d773-a493-aebc8e3a9a22@gmx.com>
References: <5d6daade-35f1-d773-a493-aebc8e3a9a22@gmx.com>
Message-ID: <2957e72f-4940-49dd-897b-ede2748f1ca9@gmx.com>

On 07/05/2017 10:20 AM, Thomas Nyberg wrote:
> [...snip...]

Btw I forgot to mention that you'd probably want to use q.get_nowait()
instead of q.get() in my code example if you don't want the main thread
to block (which is what I think you want to avoid, judging from your code
example).

https://docs.python.org/3.7/library/queue.html#queue.Queue.get_nowait

Cheers,
Thomas

From maylinge0903 at gmail.com Wed Jul 5 04:36:04 2017
From: maylinge0903 at gmail.com (Mayling ge)
Date: Wed, 5 Jul 2017 16:36:04 +0800
Subject: memory leak with re.match
In-Reply-To:
References:
Message-ID: <2442BE14-8E5C-4C52-BDE7-9F0F4F25F447@gmail.com>

Sorry. The code here is just to describe the issue and is just pseudo
code, please forgive any typos. I list out lines because I need line
context.
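
For readers following the re.match thread, here is a minimal runnable sketch of the pattern being discussed: compile each error pattern once, hand the lines over in blocks, and return small tuples rather than match objects. It is not a diagnosis of the reported memory growth. The helper names `compile_patterns` and `blocks`, the sample patterns, and the sample lines are assumptions made for illustration, and multiprocessing is left out so the example stays self-contained.

```python
import itertools
import re


def compile_patterns(errors):
    """Compile each pattern once; skip entries that are not valid regexes."""
    compiled = []
    for error in errors:
        try:
            compiled.append(re.compile(error["pattern"]))
        except re.error:
            continue
    return compiled


def line_match(lines, patterns):
    """Return (line_index, pattern_index) pairs instead of match objects.

    A match object keeps a reference to the string it matched (m.string),
    so returning plain tuples lets each line be released after use.
    """
    hits = []
    for line_no, line in enumerate(lines):
        for pat_no, pattern in enumerate(patterns):
            if pattern.match(line):
                hits.append((line_no, pat_no))
    return hits


def blocks(iterable, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        block = list(itertools.islice(iterator, size))
        if not block:
            return
        yield block


# Hypothetical sample data; the second pattern is deliberately invalid.
errors = [{"pattern": r"ERROR"}, {"pattern": r"(unclosed"}, {"pattern": r"WARN"}]
patterns = compile_patterns(errors)
sample = ["ERROR disk full", "ok", "WARN low memory", "ERROR again"]

# Arguments are passed in the order the signature expects: (lines, patterns).
results = [line_match(block, patterns) for block in blocks(sample, 2)]
print(results)
```

If the blocks were handed to a process pool instead, the same argument order would apply to the `args` tuple, which is the mismatch pointed out earlier in the thread.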
Sent from Mail Master

On 07/05/2017 15:52, Albert-Jan Roskam wrote:

From: Python-list on behalf of Mayling ge
Sent: Tuesday, July 4, 2017 9:01 AM
To: python-list
Subject: memory leak with re.match

Hi,

My function is in the following way to handle file line by line. There are
multiple error patterns defined and need to apply to each line. I use
multiprocessing.Pool to handle the file in block.

The memory usage increases to 2G for a 1G file. And stays in 2G even after
the file processing. File closed in the end.

If I comment out the call to re_pat.match, memory usage is normal and
keeps under 100Mb.

am I using re in a wrong way? I cannot figure out a way to fix the memory
leak. And I googled .

def line_match(lines, errors)

        lines = list(itertools.islice(fo, line_per_proc))

===> do you really need to listify the iterator?

        if not lines:
            break
        result = p.apply_async(line_match, args=(errors, lines))

===> the signature of line_match is (lines, errors), in args you do (errors, lines)

From rosuav at gmail.com Wed Jul 5 04:40:10 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 5 Jul 2017 18:40:10 +1000
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

On Wed, Jul 5, 2017 at 6:26 PM, Thomas Nyberg wrote:
> I think the general rule would be that no it's not safe to skip the
> locks. It's true that with cpython, your method shouldn't run into
> problems, but that's just a quirk of how you're using it. I look at from
> another perspective, if it's true that no locks actually are necessary,
> then why are you using the shared variables in the first place. In this
> case, the information needs to be send from the worker thread to the
> main thread, but you don't need for any other threads to see it. This
> only really requires a single "channel" (e.g. a queue) and not for the
> variable to further exposed.

What would the lock surround?
The purpose of a lock is to make an
atomic unit out of something that otherwise wouldn't be, but a single
store operation is already atomic. So you could do something like:

with lock():
    self.cnt = self.cnt + 1

and then multiple threads could safely increment the counter, which
they otherwise couldn't (note that "self.cnt += 1" might be safe, but
probably wouldn't be, so I use the longhand); but even with that form of
increment, it's only necessary if there are multiple writers. A
pub-sub model (one writer, any number of readers) only needs locks if
it's possible for the write itself to be half done, which can't happen
with a single operation.

I'm basing my information primarily on CPython here, where context
switches are protected by the language interpreter and the GIL. But
other Python implementations have similar guarantees. If there's any
Python implementation in which a simple assignment can cause problems,
I'd call that a bug to be fixed - especially since, at the CPU level,
you can generally rely on a pointer-sized memory store being atomic.

But I do agree with the recommendation of the queue. That does make
things clearer.

ChrisA

From tomuxiong at gmx.com Wed Jul 5 04:57:24 2017
From: tomuxiong at gmx.com (Thomas Nyberg)
Date: Wed, 5 Jul 2017 10:57:24 +0200
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

On 07/05/2017 10:40 AM, Chris Angelico wrote:
> What would the lock surround?

Sorry yes I agree with you that no lock is needed in this method. I was
a bit confused by the code and probably was thinking something like "a
+= 1" in my head (even though that's not what he was doing...). Thanks
for clearing that up!

Regardless, depending upon what he means by this, he'll probably need
some form of synchronization at some point:

> What about if the variable is more complex, for example a list or dictionary?
> Even in this case, is it safe to avoid locking on a shared variable if the
> operation on the variable is performed in a single Python instruction?

Cheers,
Thomas

From flebber.crue at gmail.com Wed Jul 5 04:59:38 2017
From: flebber.crue at gmail.com (Sayth Renshaw)
Date: Wed, 5 Jul 2017 01:59:38 -0700 (PDT)
Subject: Generator - Provide parameters to url - requests
In-Reply-To:
References: <328f42bb-ba23-4199-9f3a-9ec1829bc448@googlegroups.com>
Message-ID: <4e069528-7b68-469c-8e8e-f8439c5bd515@googlegroups.com>

Thanks.

I left "base" out as I was trying to remove as much unneeded code from the
example as possible. I had defined it as

base = datetime.datetime(2017,1,1)

Reading your code, this sounds too simple :-).

def dates(first, numdays):
    # generate datetime objects for extra clarity
    # note there are no implicit arguments like `base` in your code
    for _ in range(numdays):
        yield first
        first += ONE_DAY

Thanks

Sayth

From rosuav at gmail.com Wed Jul 5 05:07:56 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 5 Jul 2017 19:07:56 +1000
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

On Wed, Jul 5, 2017 at 6:57 PM, Thomas Nyberg wrote:
> On 07/05/2017 10:40 AM, Chris Angelico wrote:
>> What would the lock surround?
>
> Sorry yes I agree with you that no lock is needed in this method. I was
> a bit confused by the code and probably was thinking something like "a
> += 1" in my head (even though that's not what he was doing...). Thanks
> for clearing that up!
>
> Regardless, depending upon what he means by this, he'll probably need
> some form of syncronization at some point:

It all depends on the meaning of "and control" from the OP.
The example
code is all about getting status, which can always be done atomically
(if you need a complex object, you simply construct a new one every
time, atomically place it in the shared location, and on reading,
always take a reference atomically before digging deeper into it), but
other types of command-and-control would need more flexibility.

As a general rule, I try to avoid using locks *per se* in my code.
It's usually clearer to have clear communication channels, generally
unidirectional. Locks are reserved for the weird cases where you're
doing something really unusual.

ChrisA

From mal at europython.eu Wed Jul 5 05:37:28 2017
From: mal at europython.eu (M.-A. Lemburg)
Date: Wed, 5 Jul 2017 11:37:28 +0200
Subject: EuroPython 2017: Beginners’ Day workshop revived
Message-ID:

Our Beginners’ Day host Harry Percival cannot attend EuroPython due to
personal reasons, but thanks to our brilliant community, we have
managed to find trainers who are willing to help out and run the
workshop:

* Ilian Iliev
* Juan Manuel Santos
* Petr Viktorin
* Lasse Schuirmann
* Michał Bultrowicz

A big thanks for the quick offers of help.

So once more, we’re pleased to present the...

                  * Beginners’ Day Workshop *

      https://ep2017.europython.eu/en/events/beginners-day/

We will have a Beginners’ Day workshop, on Sunday, July 9th, from
10:00 until 17:00, at the Palacongressi di Rimini (Via della Fiera 23,
Rimini), the same location as the main conference.

The session will be presented in English (although a few of the
coaches do speak other languages as well).

Please bring your laptop, as a large part of the day will be devoted
to learning Python on your own PC.

For more information and the session list, please see the Beginners’
Day workshop page on our website:

https://ep2017.europython.eu/en/events/beginners-day/

Enjoy,
--
EuroPython 2017 Team
http://ep2017.europython.eu/
http://www.europython-society.org/

PS: Please forward or retweet to help us reach all interested parties:
https://twitter.com/europython/status/882532425636753408
Thanks.

From __peter__ at web.de Wed Jul 5 05:42:33 2017
From: __peter__ at web.de (Peter Otten)
Date: Wed, 05 Jul 2017 11:42:33 +0200
Subject: memory leak with re.match
References: <2442BE14-8E5C-4C52-BDE7-9F0F4F25F447@gmail.com>
Message-ID:

Mayling ge wrote:

> Sorry. The code here is just to describe the issue and is just pseudo
> code,

That is the problem with your post. It's too vague for us to make sense of
it.

Can you provide a minimal example that shows what you think is a "memory
leak"? Then we can either help you avoid storing extra stuff or confirm an
actual leak and help you prepare a bug report.

From __peter__ at web.de Wed Jul 5 06:13:53 2017
From: __peter__ at web.de (Peter Otten)
Date: Wed, 05 Jul 2017 12:13:53 +0200
Subject: Generator - Provide parameters to url - requests
References: <328f42bb-ba23-4199-9f3a-9ec1829bc448@googlegroups.com>
 <4e069528-7b68-469c-8e8e-f8439c5bd515@googlegroups.com>
Message-ID:

Sayth Renshaw wrote:

> Thanks.
>
> I left "base" out as I was trying to remove as much unneeded code from
> example as possible. I had defined it as
>
> base = datetime.datetime(2017,1,1)

You actually did provide that line in your post.

> Reading your code this sounds too simple :-).
>
> def dates(first, numdays):
>     # generate datetime objects for extra clarity
>     # note there are no implicit arguments like `base` in your code
>     for _ in range(numdays):
>         yield first
>         first += ONE_DAY
>
> Thanks

You could write the above generator

ONE_DAY = datetime.timedelta(days=1)
base = datetime.datetime(2017, 1, 1)

def dates(numdays):
    date = base
    for _ in range(numdays):
        yield date
        date += ONE_DAY

but this is bad design.
You get the most predictable output when you
write functions in such a way that the result only depends on the function's
arguments. Such functions are called "pure", and are much easier to reason
about and to cover with unit tests.

As Python does not provide constants this is sometimes a judgment call:
While ONE_DAY will never be changed, base is likely to be changed to

base = datetime.datetime(2018, 1, 1)

next year. Therefore it should be an argument. If you were to change the
generator to support varying intervals the signature should be changed, too,
to

def dates(first, numdays, daystep=1):
    # ...

or similar.

From rhodri at kynesim.co.uk Wed Jul 5 06:40:58 2017
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Wed, 5 Jul 2017 11:40:58 +0100
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID: <50e55943-3d71-616e-b861-ddd1f83629b0@kynesim.co.uk>

On 05/07/17 09:26, Chris Angelico wrote:
> On Wed, Jul 5, 2017 at 5:56 PM, pozz wrote:
>> It seems it works, but I'm not sure it is the correct way to share the
>> variable self.cnt. It is only written in the long thread and only read in
>> the main thread.
>> Could a single Python instruction be interrupted (in this case, self.cnt =
>> i)? Should I use a locking mechanism when reading/writing?
>>
>> What about if the variable is more complex, for example a list or
>> dictionary? Even in this case, is it safe to avoid locking on a shared
>> variable if the operation on the variable is performed in a single Python
>> instruction?
>
> You can be confident that a single assignment will happen atomically.
> Even if "self.cnt = i" requires multiple instructions to perform
> (which it probably doesn't), there's still going to be some moment
> before the change has happened at all, and then some moment when the
> change has completely happened, and you won't get a context switch in
> between.

Is there a definition of what is or isn't atomic behaviour anywhere?
As an embedded C programmer I definitely wouldn't assume that a high-level assignment (in all its ref-counting glory) would be atomic without some hint of proof :-) -- Rhodri James *-* Kynesim Ltd From blahBlah at blah.org Wed Jul 5 07:31:31 2017 From: blahBlah at blah.org (Sam Chats) Date: Wed, 5 Jul 2017 11:31:31 +0000 (UTC) Subject: School Management System in Python Message-ID: Feel free to comment on my high school project. I really enjoyed building it and it is the biggest project I've developed so far (in terms of lines of code). All you need to do is to run the S-Koo-L.py script. I've built more eye-catchy things with less code (see my other repos), but this does a lot more than perhaps all of them, while having a command line REPL interface. One neat thing I wanted to mention is that this project has absolutely no third-party dependencies. https://github.com/schedutron/S-Koo-L Sam Chats From tomuxiong at gmx.com Wed Jul 5 07:48:46 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 5 Jul 2017 13:48:46 +0200 Subject: School Management System in Python In-Reply-To: References: Message-ID: <0f382fe6-46af-f2c8-8378-40f717b3d023@gmx.com> On 07/05/2017 01:31 PM, Sam Chats wrote: > Feel free to comment on my high school project. I really enjoyed building it and it is the biggest project I've developed so far > (in terms of lines of code). All you need to do is to run the S-Koo-L.py script. > > I've built more eye-catchy things with less code (see my other repos), but this does a lot more than perhaps all of them, > while having a command line REPL interface. One neat thing I wanted to mention is that this project has absolutely no > third-party dependencies. > > https://github.com/schedutron/S-Koo-L > > Sam Chats > Just a few comments: 1. You probably shouldn't add *.pyc files like that to your source control since they're binary files. 2. I don't think you should be dumping pickeled objects to the log file. It's much nicer if you just write text to it. 
For example, you're pickling a date object to the log file, but you could instead just write the string corresponding to it. 3. Personally I think you should probably stop using pickle files entirely here. It seems like you could just as easily store them in a text readable format and then load the data (say names) in line by line. You might want to look into the csv module. 4. You probably want some documentation walking through what this should be doing. Cheers, Thomas From binaryboy010 at gmail.com Wed Jul 5 08:03:53 2017 From: binaryboy010 at gmail.com (Sam Chats) Date: Wed, 5 Jul 2017 12:03:53 +0000 (UTC) Subject: School Management System in Python References: Message-ID: Check message Sorry for this message. Sam From blahBlah at blah.org Wed Jul 5 08:14:16 2017 From: blahBlah at blah.org (Sam Chats) Date: Wed, 5 Jul 2017 12:14:16 +0000 (UTC) Subject: School Management System in Python References: Message-ID: Thanks for your suggestions. I would've not used pickle had I been aware about other tools while developing this. I was thinking about migrating to sqlite3. How about that? And yes, I need more comprehanesive documentation. Will work on that soon. Thanks, Sam Chats From __peter__ at web.de Wed Jul 5 08:14:45 2017 From: __peter__ at web.de (Peter Otten) Date: Wed, 05 Jul 2017 14:14:45 +0200 Subject: Python threading and sharing variables References: Message-ID: Chris Angelico wrote: > You can be confident that a single assignment will happen atomically. > Even if "self.cnt = i" requires multiple instructions to perform For name binding cnt = i maybe, but self.cnt = i can execute arbitrary Python code (think __setattr__()). With threads I'd rather play it safe. From blahBlah at blah.org Wed Jul 5 08:39:15 2017 From: blahBlah at blah.org (YOUR_NAME_HERE) Date: Wed, 5 Jul 2017 12:39:15 +0000 (UTC) Subject: School Management System in Python References: Message-ID: Thanks for your suggestions. 
I would've not used pickle had I been aware about other tools while developing this. I was thinking about migrating to sqlite3. How about that? And yes, I need more comprehanesive documentation. Will work on that soon. Thanks, Sam Chats From tomuxiong at gmx.com Wed Jul 5 08:49:07 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 5 Jul 2017 14:49:07 +0200 Subject: School Management System in Python In-Reply-To: References: Message-ID: <8fbe8c6b-c479-b741-3beb-7f8f5330d1fa@gmx.com> On 07/05/2017 02:14 PM, Sam Chats wrote: > Thanks for your suggestions. I would've not used pickle had I been aware about other tools while developing this. > I was thinking about migrating to sqlite3. How about that? And yes, I need more comprehanesive documentation. > Will work on that soon. > > Thanks, > Sam Chats > Personally I prefer text formats until I have some need to switch. That way I can look at files directly instead of needing to unpickle or to load up sqlite or whatever. It just seems like overkill when it's unnecessary. Depending upon how you are updating data, using sqlite or some database might make sense, but if you're just reading in or writing out entire text files, then I'd just recommend skipping sqlite and instead just writing to the files directly. Cheers, Thomas From mail at timgolden.me.uk Wed Jul 5 08:56:36 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Wed, 5 Jul 2017 13:56:36 +0100 Subject: School Management System in Python In-Reply-To: <8fbe8c6b-c479-b741-3beb-7f8f5330d1fa@gmx.com> References: <8fbe8c6b-c479-b741-3beb-7f8f5330d1fa@gmx.com> Message-ID: <030f3b39-768a-d000-87f0-3e33bc881d7c@timgolden.me.uk> On 05/07/2017 13:49, Thomas Nyberg wrote: > On 07/05/2017 02:14 PM, Sam Chats wrote: >> Thanks for your suggestions. I would've not used pickle had I been aware about other tools while developing this. >> I was thinking about migrating to sqlite3. How about that? And yes, I need more comprehanesive documentation. >> Will work on that soon. 
>> >> Thanks, >> Sam Chats >> > Personally I prefer text formats until I have some need to switch. That > way I can look at files directly instead of needing to unpickle or to > load up sqlite or whatever. It just seems like overkill when it's > unnecessary. Depending upon how you are updating data, using sqlite or > some database might make sense, but if you're just reading in or writing > out entire text files, then I'd just recommend skipping sqlite and > instead just writing to the files directly. There's been some discussion recently on the Computing At School forums here in the UK where at least one teacher explained that they taught pickle in the way it's being used here essentially because it's really simple: you just through your object at .dump/.load and it's done. Otherwise you have to roll your own serialisation of some sort. Which might be simple but is yet another thing to introduce into an already busy curriculum. TJG From blahBlah at blah.org Wed Jul 5 09:02:36 2017 From: blahBlah at blah.org (YOUR_NAME_HERE) Date: Wed, 5 Jul 2017 13:02:36 +0000 (UTC) Subject: School Management System in Python References: Message-ID: I can use either tsv or csv. Which one would be better? From rosuav at gmail.com Wed Jul 5 09:03:52 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Jul 2017 23:03:52 +1000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 10:14 PM, Peter Otten <__peter__ at web.de> wrote: > Chris Angelico wrote: > >> You can be confident that a single assignment will happen atomically. >> Even if "self.cnt = i" requires multiple instructions to perform > > For name binding > > cnt = i > > maybe, but > > self.cnt = i > > can execute arbitrary Python code (think __setattr__()). With threads I'd > rather play it safe. Sure, it _could_ execute arbitrary code, but the most likely case is that at its core, it's still going to execute a single assignment operation. 
And if it doesn't, then that's the place where you'd need the lock. ChrisA From rosuav at gmail.com Wed Jul 5 09:09:48 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Jul 2017 23:09:48 +1000 Subject: Python threading and sharing variables In-Reply-To: <50e55943-3d71-616e-b861-ddd1f83629b0@kynesim.co.uk> References: <50e55943-3d71-616e-b861-ddd1f83629b0@kynesim.co.uk> Message-ID: On Wed, Jul 5, 2017 at 8:40 PM, Rhodri James wrote: > On 05/07/17 09:26, Chris Angelico wrote: >> >> On Wed, Jul 5, 2017 at 5:56 PM, pozz wrote: >>> >>> It seems it works, but I'm not sure it is the correct way to share the >>> variable self.cnt. It is only written in the long thread and only read in >>> the main thread. >>> Could a single Python instruction be interrupted (in this case, self.cnt >>> = >>> i)? Should I use a locking mechanism when reading/writing? >>> >>> What about if the variable is more complex, for example a list or >>> dictionary? Even in this case, is it safe to avoid locking on a shared >>> variable if the operation on the variable is performed in a single Python >>> instruction? >> >> >> You can be confident that a single assignment will happen atomically. >> Even if "self.cnt = i" requires multiple instructions to perform >> (which it probably doesn't), there's still going to be some moment >> before the change has happened at all, and then some moment when the >> change has completely happened, and you won't get a context switch in >> between. > > > Is there a definition of what is or isn't atomic behaviour anywhere? As an > embedded C programmer I definitely wouldn't assume that a high-level > assignment (in all its ref-counting glory) would be atomic without some hint > of proof :-) In CPython, yes, because thread switching always takes place between Python bytecode instructions. 
In other Pythons, I don't know about actual guarantees, but if it's possible for a simple assignment to NOT be atomic, it would lead to internal corruption (eg messing with garbage collection and/or memory pointers), so it would be the job of the interpreter, not the Python code itself - because there'd be no way to reliably do *anything* without some sort of lock... including acquiring a lock. So it *has* to be solved at a lower level. ChrisA From auto at advisor.org Wed Jul 5 09:18:38 2017 From: auto at advisor.org (YOUR_NAME_HERE) Date: Wed, 5 Jul 2017 13:18:38 +0000 (UTC) Subject: School Management System in Python References: Message-ID: On Wed, 5 Jul 2017 13:02:36 +0000 (UTC) YOUR_NAME_HERE wrote: > I can use either tsv or csv. Which one would be better? Some people complain that tsv has problems, so maybe csv would be the way to go. From tomuxiong at gmx.com Wed Jul 5 09:28:51 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 5 Jul 2017 15:28:51 +0200 Subject: School Management System in Python In-Reply-To: References: Message-ID: <3b464032-460e-1baf-8ba0-bcf9473f2638@gmx.com> On 07/05/2017 03:18 PM, YOUR_NAME_HERE wrote: > On Wed, 5 Jul 2017 13:02:36 +0000 (UTC) YOUR_NAME_HERE wrote: >> I can use either tsv or csv. Which one would be better? > > > Some people complain that tsv has problems, so maybe csv would be the way to go. > I almost always use csv personally, but it's a preference. I'm not sure what the problems are you're refering to, but I guess that points to using commas as well. 
Either way, it's not hard to switch between the two:

import csv

# Using regular commas
with open('outfile.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(range(5))

# Using tabs
with open('outfile.tsv', 'w') as outfile:
    writer = csv.writer(outfile, delimiter='\t')
    writer.writerow(range(5))

Cheers,
Thomas

From blahBlah at blah.org Wed Jul 5 09:30:07 2017
From: blahBlah at blah.org (YOUR_NAME_HERE)
Date: Wed, 5 Jul 2017 13:30:07 +0000 (UTC)
Subject: School Management System in Python
References:
Message-ID:

On Wed, 5 Jul 2017 15:28:51 +0200, Thomas Nyberg wrote:
> On 07/05/2017 03:18 PM, YOUR_NAME_HERE wrote:
> > On Wed, 5 Jul 2017 13:02:36 +0000 (UTC) YOUR_NAME_HERE wrote:
> >> I can use either tsv or csv. Which one would be better?
> >
> >
> > Some people complain that tsv has problems, so maybe csv would be the way to go.
> >
> I almost always use csv personally, but it's a preference. I'm not sure
> what the problems are you're referring to, but I guess that points to
> using commas as well. Either way, it's not hard to switch between the two:
>
> import csv
>
> # Using regular commas
> with open('outfile.csv', 'w') as outfile:
>     writer = csv.writer(outfile)
>     writer.writerow(range(5))
>
> # Using tabs
> with open('outfile.tsv', 'w') as outfile:
>     writer = csv.writer(outfile, delimiter='\t')
>     writer.writerow(range(5))
>
> Cheers,
> Thomas

Hey that was simple enough! Thanks for the code! I was also considering the use of JSON. Which one would be better?
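A minimal sketch of the comma/tab switch as a full round trip, assuming Python 3. The sample rows, the field names and the round_trip helper are invented for illustration and are not taken from the S-Koo-L project. It passes newline='' to open(), which the csv module documentation recommends for files handed to csv.reader and csv.writer, and it shows that a field containing the delimiter itself still reads back intact because the writer quotes it.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    ["name", "job", "note"],
    ["Ada", "teacher", "likes tabs\tand, commas"],
    ["Bob", "clerk", ""],
]


def round_trip(path, delimiter):
    """Write rows with the given delimiter, then read them back."""
    # newline='' lets the csv module control line endings itself.
    with open(path, "w", newline="") as outfile:
        csv.writer(outfile, delimiter=delimiter).writerows(rows)
    with open(path, newline="") as infile:
        return list(csv.reader(infile, delimiter=delimiter))


with tempfile.TemporaryDirectory() as tmp:
    from_csv = round_trip(os.path.join(tmp, "outfile.csv"), ",")
    from_tsv = round_trip(os.path.join(tmp, "outfile.tsv"), "\t")

# Both variants should give back exactly what was written.
print(from_csv == rows, from_tsv == rows)
```

Everything comes back as strings, so numeric columns would need converting after reading; that is one practical difference from pickle.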
From tomuxiong at gmx.com Wed Jul 5 09:31:57 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 5 Jul 2017 15:31:57 +0200 Subject: School Management System in Python In-Reply-To: <030f3b39-768a-d000-87f0-3e33bc881d7c@timgolden.me.uk> References: <8fbe8c6b-c479-b741-3beb-7f8f5330d1fa@gmx.com> <030f3b39-768a-d000-87f0-3e33bc881d7c@timgolden.me.uk> Message-ID: <17c8b8f6-379e-aa45-922c-34b5eadad798@gmx.com> On 07/05/2017 02:56 PM, Tim Golden wrote: > There's been some discussion recently on the Computing At School forums > here in the UK where at least one teacher explained that they taught > pickle in the way it's being used here essentially because it's really > simple: you just through your object at .dump/.load and it's done. > Otherwise you have to roll your own serialisation of some sort. Which > might be simple but is yet another thing to introduce into an already > busy curriculum. > > TJG > That's certainly good reasoning that I hadn't thought of. Makes a bit more sense as to why I see it used like this so often. And it certainly does make sense to not get bogged down in every different detail when beginning so I see why one would teach this way. In fact I feel like it could have been a good idea in the different times I've taught people this sort of thing... Thanks for the perspective! Cheers, Thomas From saurabh.chaturvedi63 at gmail.com Wed Jul 5 09:34:26 2017 From: saurabh.chaturvedi63 at gmail.com (Sam Chats) Date: Wed, 5 Jul 2017 06:34:26 -0700 (PDT) Subject: School Management System in Python In-Reply-To: References: <3b464032-460e-1baf-8ba0-bcf9473f2638@gmx.com> Message-ID: On Wednesday, July 5, 2017 at 6:56:06 PM UTC+5:30, Thomas Nyberg wrote: > On 07/05/2017 03:18 PM, YOUR_NAME_HERE wrote: > > On Wed, 5 Jul 2017 13:02:36 +0000 (UTC) YOUR_NAME_HERE wrote: > >> I can use either tsv or csv. Which one would be better? > > > > > > Some people complain that tsv has problems, so maybe csv would be the way to go. 
> > > I almost always use csv personally, but it's a preference. I'm not sure > what the problems are you're refering to, but I guess that points to > using commas as well. Either way, it's not hard to switch between the two: > > import csv > > # Using regular commas > with open('outfile.csv', 'w') as outfile: > writer = csv.writer(outfile) > writer.writerow(range(5)) > > # Using tabs > with open('outfile.tsv', 'w') as outfile: > writer = csv.writer(outfile, delimiter='\t') > writer.writerow(range(5)) > > Cheers, > Thomas Just curious, is it better, performance wise, to read from a text file (css or tsv) compared to reading from a binary pickle file? From christopher_reimer at icloud.com Wed Jul 5 09:48:31 2017 From: christopher_reimer at icloud.com (Christopher Reimer) Date: Wed, 05 Jul 2017 06:48:31 -0700 Subject: School Management System in Python In-Reply-To: References: <3b464032-460e-1baf-8ba0-bcf9473f2638@gmx.com> Message-ID: On Jul 5, 2017, at 6:34 AM, Sam Chats wrote: > Just curious, is it better, performance wise, to read from a text file (css or tsv) compared to reading from a binary pickle file? I prefer CSV because I can load the file into Microsoft Excel and do a quick search. Chris R. From eryksun at gmail.com Wed Jul 5 09:50:41 2017 From: eryksun at gmail.com (eryk sun) Date: Wed, 5 Jul 2017 13:50:41 +0000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 12:14 PM, Peter Otten <__peter__ at web.de> wrote: > Chris Angelico wrote: > >> You can be confident that a single assignment will happen atomically. >> Even if "self.cnt = i" requires multiple instructions to perform > > For name binding > > cnt = i > > maybe, but > > self.cnt = i > > can execute arbitrary Python code (think __setattr__()). With threads I'd > rather play it safe. Computed properties that require setting multiple values are a problem that may require locking. 
Assignment of a single variable in an unoptimized namespace isn't completely immune to this -- in principle. Think __setitem__, __getitem__, __hash__, and __eq__. For example: >>> exec('cnt = 42; cnt = 43; cnt', NoisyNS()) __setitem__('cnt', 42) __hash__('cnt') __setitem__('cnt', 43) __hash__('cnt') __eq__('cnt', 'cnt') __getitem__('cnt') __eq__('cnt', 'cnt') It's reasonable to assume a namespace uses a built-in dict and str keys (names) -- or at least types that don't do anything unusual that introduces concurrency problems. From grant.b.edwards at gmail.com Wed Jul 5 09:51:49 2017 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Wed, 5 Jul 2017 13:51:49 +0000 (UTC) Subject: EuroPython 2017: Free Intel Distribution for Python References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> <3b19ae5b-cff1-1285-648a-06d93d285a43@mrabarnett.plus.com> Message-ID: On 2017-07-04, MRAB wrote: > On 2017-07-04 23:05, Terry Reedy wrote: >> On 7/4/2017 10:22 AM, M.-A. Lemburg wrote: >> >>> http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for >> >> I looked but did not find the most important thing. >> What version of Python? >> > From a brief search it appears to be Python 2.7 and Python 3.5. Just click on the link on the europython blog page. There's all sorts of info (release notes, package lists, etc.): https://software.intel.com/en-us/distribution-for-python -- Grant Edwards grant.b.edwards Yow! 
Of course, you at UNDERSTAND about the PLAIDS gmail.com in the SPIN CYCLE -- From blahBlah at blah.org Wed Jul 5 09:58:09 2017 From: blahBlah at blah.org (YOUR_NAME_HERE) Date: Wed, 5 Jul 2017 13:58:09 +0000 (UTC) Subject: School Management System in Python References: Message-ID: On Wed, 5 Jul 2017 06:34:26 -0700 (PDT), Sam Chats wrote: > On Wednesday, July 5, 2017 at 6:56:06 PM UTC+5:30, Thomas Nyberg wrote: > > On 07/05/2017 03:18 PM, YOUR_NAME_HERE wrote: > > > On Wed, 5 Jul 2017 13:02:36 +0000 (UTC) YOUR_NAME_HERE wrote: > > >> I can use either tsv or csv. Which one would be better? > > > > > > > > > Some people complain that tsv has problems, so maybe csv would be the way to go. > > > > > I almost always use csv personally, but it's a preference. I'm not sure > > what the problems are you're refering to, but I guess that points to > > using commas as well. Either way, it's not hard to switch between the two: > > > > import csv > > > > # Using regular commas > > with open('outfile.csv', 'w') as outfile: > > writer = csv.writer(outfile) > > writer.writerow(range(5)) > > > > # Using tabs > > with open('outfile.tsv', 'w') as outfile: > > writer = csv.writer(outfile, delimiter=' ') > > writer.writerow(range(5)) > > > > Cheers, > > Thomas > > Just curious, is it better, performance wise, to read from a text file (css or tsv) compared to reading from a binary pickle file? From rosuav at gmail.com Wed Jul 5 10:03:20 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Jul 2017 00:03:20 +1000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 11:50 PM, eryk sun wrote: > Assignment of a single variable in an unoptimized namespace isn't > completely immune to this -- in principle. Think __setitem__, > __getitem__, __hash__, and __eq__. 
For example: > > >>> exec('cnt = 42; cnt = 43; cnt', NoisyNS()) > __setitem__('cnt', 42) > __hash__('cnt') > __setitem__('cnt', 43) > __hash__('cnt') > __eq__('cnt', 'cnt') > __getitem__('cnt') > __eq__('cnt', 'cnt') > > It's reasonable to assume a namespace uses a built-in dict and str > keys (names) -- or at least types that don't do anything unusual that > introduces concurrency problems. This doesn't show a potential concurrency problem. Calculating a hash on "cnt" is independent of other threads; the actual work of __setitem__ isn't visible in this view. There certainly are places where a context switch could cause problems (eg resizing the dict), but they're the dict's problem, not your Python program's - because there's no way you could acquire a lock without working with these same issues. This, btw, is why CPython has the GIL - it's the most efficient way to solve all these problems. Removing the GIL means putting explicit locks around these kinds of risk points, and explicit locks are slow, so one lock (the GIL) is way faster than many locks. ChrisA From mail at timgolden.me.uk Wed Jul 5 10:13:38 2017 From: mail at timgolden.me.uk (Tim Golden) Date: Wed, 5 Jul 2017 15:13:38 +0100 Subject: Meta: double posts [Was: School Management System in Python] In-Reply-To: References: Message-ID: Are you posting both to python-list at python.org and to comp.lang.python -- and under different names? If you are, please use one or the other: they mirror both ways, and we're seeing double posts which the gateway thinks are different because of a different sending address. Thanks TJG From saurabh.chaturvedi63 at gmail.com Wed Jul 5 10:22:45 2017 From: saurabh.chaturvedi63 at gmail.com (saurabh chaturvedi) Date: Wed, 05 Jul 2017 14:22:45 +0000 Subject: Meta: double posts [Was: School Management System in Python] In-Reply-To: References: Message-ID: No, I am not. I'm not posting to Python-list at python.org. However, I'm using my own NNTP client I built as an exercise. 
Maybe it has a bug. I'll work on it. Thanks! Sam On Wed, Jul 5, 2017 at 7:49 PM, Tim Golden wrote: > Are you posting both to python-list at python.org and to comp.lang.python > -- and under different names? > > If you are, please use one or the other: they mirror both ways, and > we're seeing double posts which the gateway thinks are different because > of a different sending address. > > Thanks > > TJG > From blahBlah at blah.org Wed Jul 5 10:37:55 2017 From: blahBlah at blah.org (Sam Chats) Date: Wed, 5 Jul 2017 14:37:55 +0000 (UTC) Subject: How to write raw strings to Python Message-ID: I want to write, say, 'hello\tworld' as-is to a file, but doing f.write('hello\tworld') makes the file look like: hello world How can I fix this? Thanks in advance. Sam From eryksun at gmail.com Wed Jul 5 10:39:46 2017 From: eryksun at gmail.com (eryk sun) Date: Wed, 5 Jul 2017 14:39:46 +0000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 2:03 PM, Chris Angelico wrote: > On Wed, Jul 5, 2017 at 11:50 PM, eryk sun wrote: >> Assignment of a single variable in an unoptimized namespace isn't >> completely immune to this -- in principle. Think __setitem__, >> __getitem__, __hash__, and __eq__. For example: >> >> >>> exec('cnt = 42; cnt = 43; cnt', NoisyNS()) >> __setitem__('cnt', 42) >> __hash__('cnt') >> __setitem__('cnt', 43) >> __hash__('cnt') >> __eq__('cnt', 'cnt') >> __getitem__('cnt') >> __eq__('cnt', 'cnt') >> >> It's reasonable to assume a namespace uses a built-in dict and str >> keys (names) -- or at least types that don't do anything unusual that >> introduces concurrency problems. > > This doesn't show a potential concurrency problem. Calculating a hash > on "cnt" is independent of other threads; the actual work of > __setitem__ isn't visible in this view. 
There certainly are places > where a context switch could cause problems (eg resizing the dict), > but they're the dict's problem, not your Python program's - because > there's no way you could acquire a lock without working with these > same issues. The work in the above special methods is arbitrary bytecode that could do anything, and there's nothing to prevent a context switch here. The GIL provides no protection here. In principle it could be a problem, but in practice it is not. From tomuxiong at gmx.com Wed Jul 5 10:43:36 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 5 Jul 2017 16:43:36 +0200 Subject: School Management System in Python In-Reply-To: References: <3b464032-460e-1baf-8ba0-bcf9473f2638@gmx.com> Message-ID: <61ef19d8-e18b-8fab-3520-f717016dfee8@gmx.com> On 07/05/2017 03:34 PM, Sam Chats wrote: > Just curious, is it better, performance wise, to read from a text file (css or tsv) compared to reading from a binary pickle file? > I honestly don't know. You should probably measure it if you're wondering. However, I don't think it's worth thinking about that at this stage. It's always best to optimize for cognitive overhead (i.e. to make code simpler and easier to understand) before making the code more complicated through optimizations. Cheers, Thomas From tomuxiong at gmx.com Wed Jul 5 10:45:52 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 5 Jul 2017 16:45:52 +0200 Subject: School Management System in Python In-Reply-To: References: Message-ID: On 07/05/2017 03:30 PM, YOUR_NAME_HERE wrote: > Hey that was simple enough! Thanks for the code! I was also considering the use of JSON. Which one would be better? > If you have hierarchical data best described by dicts/lists (in the python meaning), then json isn't a bad approach. But if you just have flat data (like name, job, address, etc.) where you just have one row per person, then something like a csv is probably easier. The best choice depends on your specific use case. 
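To make the flat-versus-hierarchical distinction concrete, here is a minimal sketch, assuming Python 3. The records, the school name and the field names are made up for illustration. Flat, one-row-per-person data maps directly onto CSV columns, while nested dicts and lists go through json unchanged.

```python
import csv
import io
import json

# Hypothetical records; names and fields are made up for illustration.
flat_rows = [
    {"name": "Ada", "job": "teacher", "address": "1 Main St"},
    {"name": "Bob", "job": "clerk", "address": "2 Side St"},
]
nested = {
    "school": "Example High",
    "classes": [
        {"grade": 9, "students": ["Ada", "Bob"]},
        {"grade": 10, "students": []},
    ],
}

# Flat data: one row per person maps directly onto CSV columns.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "job", "address"])
writer.writeheader()
writer.writerows(flat_rows)
buffer.seek(0)
flat_back = [dict(row) for row in csv.DictReader(buffer)]

# Nested data: dicts and lists survive a JSON round trip unchanged.
nested_back = json.loads(json.dumps(nested, indent=2))

print(flat_back == flat_rows, nested_back == nested)
```

An in-memory io.StringIO stands in for a real file here; with a file on disk the same calls apply.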
Cheers,
Thomas

From stephen_tucker at sil.org Wed Jul 5 10:47:49 2017
From: stephen_tucker at sil.org (Stephen Tucker)
Date: Wed, 5 Jul 2017 15:47:49 +0100
Subject: How to write raw strings to Python
In-Reply-To:
References:
Message-ID:

Sam,

You use

r'hello\tworld'

The r in front of the string stands for raw and it is intended to switch off the normal escape function of a backslash. It works fine so long as the string doesn't end with a backslash. If you end the string with a backslash, as in

r'hello\tworld\'

you get an error message.

Stephen.

On Wed, Jul 5, 2017 at 3:37 PM, Sam Chats wrote:
> I want to write, say, 'hello\tworld' as-is to a file, but doing
> f.write('hello\tworld') makes the file
> look like:
> hello world
>
> How can I fix this? Thanks in advance.
>
> Sam
> --
> https://mail.python.org/mailman/listinfo/python-list
>

From stephen_tucker at sil.org Wed Jul 5 10:51:50 2017
From: stephen_tucker at sil.org (Stephen Tucker)
Date: Wed, 5 Jul 2017 15:51:50 +0100
Subject: How to write raw strings to Python
In-Reply-To:
References:
Message-ID:

Sam,

You use

f.write(r'hello\tworld')

The r in front of the string stands for raw and is intended to switch off the escape function of the backslash in the string. It works fine so long as the string doesn't end with a backslash, as in

f.write(r'hello\tworld\')

If you try this, you get an error message.

Stephen.

On Wed, Jul 5, 2017 at 3:37 PM, Sam Chats wrote:
> I want to write, say, 'hello\tworld' as-is to a file, but doing
> f.write('hello\tworld') makes the file
> look like:
> hello world
>
> How can I fix this? Thanks in advance.
> > Sam > -- > https://mail.python.org/mailman/listinfo/python-list > From saurabh.chaturvedi63 at gmail.com Wed Jul 5 11:09:10 2017 From: saurabh.chaturvedi63 at gmail.com (Sam Chats) Date: Wed, 5 Jul 2017 08:09:10 -0700 (PDT) Subject: How to write raw strings to Python In-Reply-To: References: Message-ID: On Wednesday, July 5, 2017 at 8:22:13 PM UTC+5:30, Stephen Tucker wrote: > Sam, > > You use > > f.write(r'hello\tworld') > > The r in front of the string stands for raw and is intended to switch off > the escape function of the backslash in the string. It works fine so long > as the string doesn't end with a backslash, as in > > f.write('hello\tworld\') > > If you try this, you get an error message. > > Stephen. > > > > > Virus-free. > www.avast.com > > <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> > > On Wed, Jul 5, 2017 at 3:37 PM, Sam Chats wrote: > > > I want to write, say, 'hello\tworld' as-is to a file, but doing > > f.write('hello\tworld') makes the file > > look like: > > hello world > > > > How can I fix this? Thanks in advance. > > > > Sam > > -- > > https://mail.python.org/mailman/listinfo/python-list > > Thanks, but I've tried something similar. Actually, I want to convert a string which I receive from a NNTP server to a raw string. So if I try something like: raw = r"%s" % string_from_server It doesn't work. Regards, Sam From tjol at tjol.eu Wed Jul 5 11:20:06 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Wed, 5 Jul 2017 17:20:06 +0200 Subject: How to write raw strings to Python In-Reply-To: References: Message-ID: On 2017-07-05 17:09, Sam Chats wrote: > > Thanks, but I've tried something similar. Actually, I want to convert a string which I receive from a NNTP server to a raw string. 
So if I try something like:
> raw = r"%s" % string_from_server
>

You may be looking for repr()

-- Thomas

From grant.b.edwards at gmail.com Wed Jul 5 11:38:41 2017
From: grant.b.edwards at gmail.com (Grant Edwards)
Date: Wed, 5 Jul 2017 15:38:41 +0000 (UTC)
Subject: How to write raw strings to Python
References:
Message-ID:

On 2017-07-05, Sam Chats wrote:

> I want to write, say, 'hello\tworld' as-is to a file, but doing
> f.write('hello\tworld') makes the file look like:
[...]
> How can I fix this?

That depends on what you mean by "as-is".

Seriously.

Do you want the single quotes in the file? Do you want the backslash and 't' character in the file?

When you post a question like this it helps immensely to provide an example of the output you desire.

--
Grant Edwards               grant.b.edwards        Yow! Is it 1974? What's
                                  at               for SUPPER? Can I spend
                              gmail.com            my COLLEGE FUND in one
                                                   wild afternoon??

From rosuav at gmail.com Wed Jul 5 12:04:51 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Thu, 6 Jul 2017 02:04:51 +1000
Subject: Python threading and sharing variables
In-Reply-To:
References:
Message-ID:

On Thu, Jul 6, 2017 at 12:39 AM, eryk sun wrote:
>> This doesn't show a potential concurrency problem. Calculating a hash
>> on "cnt" is independent of other threads; the actual work of
>> __setitem__ isn't visible in this view. There certainly are places
>> where a context switch could cause problems (eg resizing the dict),
>> but they're the dict's problem, not your Python program's - because
>> there's no way you could acquire a lock without working with these
>> same issues.
>
> The work in the above special methods is arbitrary bytecode that could
> do anything, and there's nothing to prevent a context switch here. The
> GIL provides no protection here. In principle it could be a problem,
> but in practice it is not.

But what could it do?
Most likely, it's going to end up mutating a dict (the core type), so unless the __setitem__ is itself maintaining complex state that needs a lock, all you've done is move the problem around, and the same solutions work. ChrisA From gordon at panix.com Wed Jul 5 12:05:20 2017 From: gordon at panix.com (John Gordon) Date: Wed, 5 Jul 2017 16:05:20 +0000 (UTC) Subject: how to add new tuple as key in dictionary? References: <344c23f8-f440-4c23-a5e6-9f93b0145e02@googlegroups.com> Message-ID: In <344c23f8-f440-4c23-a5e6-9f93b0145e02 at googlegroups.com> Ho Yeung Lee writes: > I find that list can not be key in dictionary > then find tuple can be as key > but when I add new tuple as key , got error in python 2.7 > groupkey = {(0,0): []} > groupkey[tuple([0,3])] = groupkey[tuple([0,3])] + [[0,1]] The right-hand side of your expression is rightly complaining that groupkey[(0,3)] doesn't exist. Would you expect to say a = a + 1 When a doesn't exist? Your code tries to do much the same thing. -- John Gordon A is for Amy, who fell down the stairs gordon at panix.com B is for Basil, assaulted by bears -- Edward Gorey, "The Gashlycrumb Tinies" From eryksun at gmail.com Wed Jul 5 12:24:32 2017 From: eryksun at gmail.com (eryk sun) Date: Wed, 5 Jul 2017 16:24:32 +0000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 4:04 PM, Chris Angelico wrote: > On Thu, Jul 6, 2017 at 12:39 AM, eryk sun wrote: >>> This doesn't show a potential concurrency problem. Calculating a hash >>> on "cnt" is independent of other threads; the actual work of >>> __setitem__ isn't visible in this view. There certainly are places >>> where a context switch could cause problems (eg resizing the dict), >>> but they're the dict's problem, not your Python program's - because >>> there's no way you could acquire a lock without working with these >>> same issues. 
>> >> The work in the above special methods is arbitrary bytecode that could >> do anything, and there's nothing to prevent a context switch here. The >> GIL provides no protection here. In principle it could be a problem, >> but in practice it is not. > > But what could it do? Most likely, it's going to end up mutating a > dict (the core type), so unless the __setitem__ is itself maintaining > complex state that needs a lock, all you've done is move the problem > around, and the same solutions work. That was my point. A namespace mapping could override __setitem__ and __getitem__ to implement a name as something like a computed property that's based on multiple values. Then if __setitem__ gets interrupted in the middle of updating this set of values, another thread that gets the computed 'property' will see a bad state. The GIL doesn't help. It would need locking to make accessing the 'property' work as an atomic operation, just like the case with regular properties. Again, I have never seen anything like this in practice. From jorge.conrado at cptec.inpe.br Wed Jul 5 12:34:26 2017 From: jorge.conrado at cptec.inpe.br (jorge.conrado at cptec.inpe.br) Date: Wed, 05 Jul 2017 13:34:26 -0300 Subject: get value from list using widget Message-ID: <352a90939e253dd9900b1cb6f85ea797@cptec.inpe.br> Hi, I would like know dow can I select and get the value from a list of values uisng widgets. Thanks, Conrado From rosuav at gmail.com Wed Jul 5 13:06:03 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Jul 2017 03:06:03 +1000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Thu, Jul 6, 2017 at 2:24 AM, eryk sun wrote: >> But what could it do? Most likely, it's going to end up mutating a >> dict (the core type), so unless the __setitem__ is itself maintaining >> complex state that needs a lock, all you've done is move the problem >> around, and the same solutions work. > > That was my point. 
A namespace mapping could override __setitem__ and > __getitem__ to implement a name as something like a computed property > that's based on multiple values. Then if __setitem__ gets interrupted > in the middle of updating this set of values, another thread that gets > the computed 'property' will see a bad state. The GIL doesn't help. It > would need locking to make accessing the 'property' work as an atomic > operation, just like the case with regular properties. Again, I have > never seen anything like this in practice. Sure it could. And if it does, it's *it's* responsibility to use locks - not the caller's. From the caller's point of view, it's still a single operation, and should remain so. ChrisA From saurabh.chaturvedi63 at gmail.com Wed Jul 5 13:19:31 2017 From: saurabh.chaturvedi63 at gmail.com (Sam Chats) Date: Wed, 5 Jul 2017 10:19:31 -0700 (PDT) Subject: How to write raw strings to Python In-Reply-To: References: Message-ID: <56e58d3e-9f6a-49bf-a018-b66941a20a02@googlegroups.com> On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote: > On 2017-07-05, Sam Chats wrote: > > > I want to write, say, 'hello\tworld' as-is to a file, but doing > > f.write('hello\tworld') makes the file look like: > [...] > > How can I fix this? > > That depends on what you mean by "as-is". > > Seriously. > > Do you want the single quotes in the file? Do you want the backslash > and 't' character in the file? > > When you post a question like this it helps immensely to provide an > example of the output you desire. > > -- > Grant Edwards grant.b.edwards Yow! Is it 1974? What's > at for SUPPER? Can I spend > gmail.com my COLLEGE FUND in one > wild afternoon?? I would add to add the following couple lines to a file: for i in range(5): print('Hello\tWorld') Consider the leading whitespace to be a tab. 
Thanks, Saurabh From gordon at panix.com Wed Jul 5 13:26:01 2017 From: gordon at panix.com (John Gordon) Date: Wed, 5 Jul 2017 17:26:01 +0000 (UTC) Subject: get value from list using widget References: <352a90939e253dd9900b1cb6f85ea797@cptec.inpe.br> Message-ID: In jorge.conrado at cptec.inpe.br writes: > Hi, > I would like know dow can I select and get the value from a list of > values uisng widgets. You haven't given us nearly enough detail to answer your question. What do you mean by "widget"? Do you mean HTML input elements such as radio buttons or drop-down lists? Or do you mean custom GUI widgets such as you might create using the Tkinter package? What do you mean by "select and get the value"? Do you mean that you want to present a widget to a user, that allows the user to select a value? Or do you actually mean that *you* want to select a value in your code? I could go on, but you get my point. We need lots more information before we can even begin to help you. -- John Gordon A is for Amy, who fell down the stairs gordon at panix.com B is for Basil, assaulted by bears -- Edward Gorey, "The Gashlycrumb Tinies" From jussi.piitulainen at helsinki.fi Wed Jul 5 13:37:38 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Wed, 05 Jul 2017 20:37:38 +0300 Subject: How to write raw strings to Python References: <56e58d3e-9f6a-49bf-a018-b66941a20a02@googlegroups.com> Message-ID: Sam Chats writes: > On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote: >> On 2017-07-05, Sam Chats wrote: >> >> > I want to write, say, 'hello\tworld' as-is to a file, but doing >> > f.write('hello\tworld') makes the file look like: >> [...] >> > How can I fix this? >> >> That depends on what you mean by "as-is". >> >> Seriously. >> >> Do you want the single quotes in the file? Do you want the backslash >> and 't' character in the file? >> >> When you post a question like this it helps immensely to provide an >> example of the output you desire. 
>
> I would add to add the following couple lines to a file:
>
> for i in range(5):
>     print('Hello\tWorld')
>
> Consider the leading whitespace to be a tab.

import sys

lines = r'''
for line in range(5):
    print('hello\tworld')
'''

print(lines.strip())

sys.stdout.write(lines.strip())
sys.stdout.write('\n')

From blahBlah at blah.org Wed Jul 5 13:43:26 2017
From: blahBlah at blah.org (Binary Boy)
Date: Wed, 5 Jul 2017 17:43:26 +0000 (UTC)
Subject: School Management System in Python
References:
Message-ID:

On Wed, 5 Jul 2017 15:28:51 +0200, Thomas Nyberg wrote:
> On 07/05/2017 03:18 PM, YOUR_NAME_HERE wrote:
> > On Wed, 5 Jul 2017 13:02:36 +0000 (UTC) YOUR_NAME_HERE wrote:
> >> I can use either tsv or csv. Which one would be better?
> >
> >
> > Some people complain that tsv has problems, so maybe csv would be the way to go.
> >
> I almost always use csv personally, but it's a preference. I'm not sure
> what the problems are you're refering to, but I guess that points to
> using commas as well. Either way, it's not hard to switch between the two:
>
> import csv
>
> # Using regular commas
> with open('outfile.csv', 'w') as outfile:
>     writer = csv.writer(outfile)
>     writer.writerow(range(5))
>
> # Using tabs
> with open('outfile.tsv', 'w') as outfile:
>     writer = csv.writer(outfile, delimiter='\t')
>     writer.writerow(range(5))
>
> Cheers,
> Thomas

This will prove useful to me. Thanks!

From blahBlah at blah.org Wed Jul 5 13:55:14 2017
From: blahBlah at blah.org (Binary Boy)
Date: Wed, 5 Jul 2017 17:55:14 +0000 (UTC)
Subject: How to write raw strings to Python
References:
Message-ID:

On Wed, 05 Jul 2017 20:37:38 +0300, Jussi Piitulainen wrote:
> Sam Chats writes:
>
> > On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote:
> >> On 2017-07-05, Sam Chats wrote:
> >>
> >> > I want to write, say, 'hello\tworld' as-is to a file, but doing
> >> > f.write('hello\tworld') makes the file look like:
> >> [...]
> >> > How can I fix this?
> >> > >> That depends on what you mean by "as-is". > >> > >> Seriously. > >> > >> Do you want the single quotes in the file? Do you want the backslash > >> and 't' character in the file? > >> > >> When you post a question like this it helps immensely to provide an > >> example of the output you desire. > > > > I would add to add the following couple lines to a file: > > > > for i in range(5): > > print('Hello\tWorld') > > > > Consider the leading whitespace to be a tab. > > import sys > > lines = r''' > for line in range(5): > print('hello\tworld') > ''' > > print(lines.strip()) > > sys.stdout.write(lines.strip()) > sys.stdout.write('\n') Thanks! But will this work if I already have a string through a string variable, rather than using it directly linke you did (by declaring the lines variable)? And, will this work while writing to files? Sam From eryksun at gmail.com Wed Jul 5 14:08:40 2017 From: eryksun at gmail.com (eryk sun) Date: Wed, 5 Jul 2017 18:08:40 +0000 Subject: Python threading and sharing variables In-Reply-To: References: Message-ID: On Wed, Jul 5, 2017 at 5:06 PM, Chris Angelico wrote: > On Thu, Jul 6, 2017 at 2:24 AM, eryk sun wrote: >>> But what could it do? Most likely, it's going to end up mutating a >>> dict (the core type), so unless the __setitem__ is itself maintaining >>> complex state that needs a lock, all you've done is move the problem >>> around, and the same solutions work. >> >> That was my point. A namespace mapping could override __setitem__ and >> __getitem__ to implement a name as something like a computed property >> that's based on multiple values. Then if __setitem__ gets interrupted >> in the middle of updating this set of values, another thread that gets >> the computed 'property' will see a bad state. The GIL doesn't help. It >> would need locking to make accessing the 'property' work as an atomic >> operation, just like the case with regular properties. Again, I have >> never seen anything like this in practice. 
> > Sure it could. And if it does, it's *it's* responsibility to use locks > - not the caller's. From the caller's point of view, it's still a > single operation, and should remain so. I feel it's necessary to emphasize that there's nothing inherent in the CPython bytecode operations for storing and loading a name that guarantees atomicity. The namespace has to provide the guarantee that backs up a statement like "[y]ou can be confident that a single assignment will happen atomically". From jussi.piitulainen at helsinki.fi Wed Jul 5 14:32:29 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Wed, 05 Jul 2017 21:32:29 +0300 Subject: How to write raw strings to Python References: Message-ID: Binary Boy writes: > On Wed, 05 Jul 2017 20:37:38 +0300, Jussi Piitulainen wrote: >> Sam Chats writes: >> >> > On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote: >> >> On 2017-07-05, Sam Chats wrote: >> >> >> >> > I want to write, say, 'hello\tworld' as-is to a file, but doing >> >> > f.write('hello\tworld') makes the file look like: >> >> [...] >> >> > How can I fix this? >> >> >> >> That depends on what you mean by "as-is". >> >> >> >> Seriously. >> >> >> >> Do you want the single quotes in the file? Do you want the backslash >> >> and 't' character in the file? >> >> >> >> When you post a question like this it helps immensely to provide an >> >> example of the output you desire. >> > >> > I would add to add the following couple lines to a file: >> > >> > for i in range(5): >> > print('Hello\tWorld') >> > >> > Consider the leading whitespace to be a tab. >> >> import sys >> >> lines = r''' >> for line in range(5): >> print('hello\tworld') >> ''' >> >> print(lines.strip()) >> >> sys.stdout.write(lines.strip()) >> sys.stdout.write('\n') > > Thanks! But will this work if I already have a string through a string > variable, rather than using it directly linke you did (by declaring > the lines variable)? 
And, will this work while writing to files? Yes, it will work the same. Writing does not interpret the contents of the string. Try it - replace sys.stdout above with your file object. If you see a different result in your actual program, your string may be different than you think. Investigate that. From formisc at gmail.com Wed Jul 5 15:15:40 2017 From: formisc at gmail.com (Andrew Zyman) Date: Wed, 5 Jul 2017 15:15:40 -0400 Subject: Proper architecture In-Reply-To: <20170703001423.GA64678@cskk.homeip.net> References: <795caf9e-271d-4483-a11c-a83422e43b73@googlegroups.com> <20170703001423.GA64678@cskk.homeip.net> Message-ID: Cameron, took me some time to get to this. Waling down your comments: >...Normally I would have your DB class represent an open (or openable, if you wanted to defer that) database connection. So your main class would go: > def __init__(self, ...other args...): > self.db = DB(location="blah.sqlite") that was my intention with (classA) self.DBConnection = sql3.Connection What is puzzling me is what type should i assign to the self.DB _if_ i were to make it (self.db) a class field? I do not plan to have various DB connections, just one connection. The rest of your remarks have been accepted :) Thank you very much! On Sun, Jul 2, 2017 at 8:14 PM, Cameron Simpson wrote: > On 02Jul2017 11:02, Andrew Z wrote: > >> I'd appreciate your suggestions for a better approach to the following >> task. >> >> I have 2 files ( 2 classes). One (ClassA) has all logic related to the >> main workflow of the program. Another (DB), I like to offload all >> operations with a DB ( sql3 in this case). >> >> I'm trying to pass the connection to the main class, but having problems. >> One of them, is i can't pass the conn as a parameter to the function in one >> (ClassA.abc()), because i inherit it ( function abc() ). >> I created a self.DBConnection field, but i'm not sure if i'm on the right >> path... >> Code is gutted to highlight the problem. 
>> > > Unfortunately you have gutted the "writeTicks" method, making it harder to > see your intent. > > You separation is ok, but normally one would try to entire conceal the > unerlying db connection (the sqlite3 connection) from the man class. So you > wouldn't "pass the connection to the main class". > > Normally I would have your DB class represent an open (or openable, if you > wanted to defer that) database connection. So your main class would go: > > def __init__(self, ...other args...): > self.db = DB(location="blah.sqlite") > > def abc(self, reqId: int): > self.db.writeTicks(reqId) > > I wouldn't be passing in "self" (your ClassA instance) or > self.DBconnection at all. You'd only pass "self" if the "DB" instance > needed more information from ClassA; normally you'd just pass that > information to writeTicks() along with reqId, so that the DB needs no > special knowledge about ClassA. > > I've also got a bunch of fine grained remarks about your code that you can > take or leave as you see fit: > > one.py: >> from .DB import * >> > > Try to avoid importing "*". It sucks all the names from "DB" into your own > namespace. Arguably you only need the DB class itself - all the other > functionality comes with it as methods on the class. So: > > from DB import DB > > class ClassA(OtherObject): >> def __init__(self): >> self.DBConnection = sql3.Connection >> > > It isn't obvious why you need this. In my example above I'd just make a DB > instance and save it as self.db; unless you're controlling different > backends that would be all you need. > > def abc(self, reqId: int): >> DB.writeTicks(self,self.DBConnection,reqId)) >> > > Here you're calling the writeTicks method on the DB class itself. I > wouldn't be making that a class method; I'd make it an instance method on a > DB instance, so: > > self.db.writeTicks(reqId) > > unless there's more to writeTicks (which you've left out). 
> > DB.py: >> > > Try not to name modules that same as their classes - it leads to > confusion. I'd call it "db.py" and make the earlier import: > > from db import DB > > import sqlite3 as sql3 >> > > This feels like an pointless abbreviation. > > [...] > >> class DB(object): >> db_location = '' >> # db_location = '../DB/pairs.db' >> > > db_location appears to be a class default. These are normally treats as > one would a "constant" in other languages. Stylisticly, this normally means > you'd write the name in upper case, eg: > > DEFAULT_DB_LOCATION = '../DB/pairs.db' > > def __init__(self, location='../DB/pairs.db'): >> db_location = location >> > > And using that would normally look like this: > > def __init__(self, location=None): > if location is None: > location = self.DEFAULT_DB_LOCATION > > print(current_fn_name(),' self.db_location = >> {}'.format(db_location)) >> try: >> with open(db_location) as file: >> pass >> except IOError as e: >> print("Unable to locate the Db @ >> {}".format(db_location)) >> > > I'd just os.path.exists(db_location) myself, or outright make the db > connection immediately. > > Also, and this actually is important, error messages should got the the > program's standard error output (or to some logging system). So your print > would look like: > > print("Unable to locate the Db @ {}".format(db_location), > file=sys.stderr) > > Also, normal Python practie here would not be to issue an error message, > but to raise an exception. That way the caller gets to see the problem, and > also the caller cannot accidentally start other work in the false belief > that the DB instance has been made successfully. 
So better would be: > > raise ValueError("Unable to locate the Db @ {}".format(db_location)) > > def reqConnection(self): >> try: >> con = sql3.connect(self.db_location) >> con.text_factory = str >> except sql3.Error as e: >> print("Error %s:".format( e.args[0])) >> sys.exit(1) >> > > It is generally bad for a class method (or, usually, any funtion) to abort > the program. Raise an exception; that way (a) the caller gets to see the > actual cause of the problem and (b) the caller can decide to abort or try > to recover and (c) if the caller does nothing the program will abort on its > own, doing this for free. > > Effectively you have embedded "polciy" inside your reqConnection method, > generally unwelcome - it removes the ability for the caller to implement > their own policy. And that is an architectural thing (where the policy > lies). > > return con >> > > The easy way to raise the exception here is just to not try/except at all, > thus: > > def reqConnection(self): > return sql3.connect(self.db_location) > > or if you really need that text_factory: > > def reqConnection(self): > con = sql3.connect(self.db_location) > con.text_factory = str > return con > > def write(self, con : sql3.Connection, tickId: int): >> con.execute( blah) >> > > However I'd make the connection a singleton attribute of the DB class. So > I'd usually have __init__ make the connection immediately (which saves you > having to "probe" the location: > > def __init__(self, ...): > ... > self.con = sql3.connect(self.db_location) > > and then write() would go: > > def write(self, tickId: int): > self.con.execute(blah) > > and as you can see that _removes_ any need to pass the connection back to > the caller - you don't need to expose an reqConnection method at all, or > manage it in the caller. Instead, ClassA can just store the DB instance > itself, and let DB look after all the specifics. 
That is exactly the kind > of thing class encapsulation is meant to achieve: the caller (Class A) can > wash its hands of the mechanisms, which are not its problem. > > Cheers, > Cameron Simpson > From tjreedy at udel.edu Wed Jul 5 17:08:50 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 5 Jul 2017 17:08:50 -0400 Subject: EuroPython 2017: Free Intel Distribution for Python In-Reply-To: References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> <3b19ae5b-cff1-1285-648a-06d93d285a43@mrabarnett.plus.com> Message-ID: On 7/5/2017 9:51 AM, Grant Edwards wrote: > On 2017-07-04, MRAB wrote: >> On 2017-07-04 23:05, Terry Reedy wrote: >>> On 7/4/2017 10:22 AM, M.-A. Lemburg wrote: >>> >>>> http://blog.europython.eu/post/162590522362/europython-2017-free-intel-distribution-for >>> >>> I looked but did not find the most important thing. >>> What version of Python? And it is really is not there *on that page*. >> From a brief search it appears to be Python 2.7 and Python 3.5. > > Just click on the link on the europython blog page. What link? The first screen has 7 links to EuroPython and no obvious links to Intel. Oh, the headline is a cleverly disguised link that does not look like a link until one mouses over it. (And one a couple of screenfulls farther on.) A writer who was trying to be informative rather than write a come-on-in like a used-car salesman would have given such essential information right up front. A 2.7 compiler has 0 interest for me, so I scanned the page for numbers. 
--
Terry Jan Reedy

From pavol.lisy at gmail.com Wed Jul 5 17:11:15 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Wed, 5 Jul 2017 23:11:15 +0200
Subject: How to write raw strings to Python
In-Reply-To:
References:
Message-ID:

On 7/5/17, Binary Boy wrote:
> On Wed, 05 Jul 2017 20:37:38 +0300, Jussi Piitulainen wrote:
>> Sam Chats writes:
>>
>> > On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote:
>> >> On 2017-07-05, Sam Chats wrote:
>> >>
>> >> > I want to write, say, 'hello\tworld' as-is to a file, but doing
>> >> > f.write('hello\tworld') makes the file look like:
>> >> [...]
>> >> > How can I fix this?
>> >>
>> >> That depends on what you mean by "as-is".
>> >>
>> >> Seriously.
>> >>
>> >> Do you want the single quotes in the file? Do you want the backslash
>> >> and 't' character in the file?
>> >>
>> >> When you post a question like this it helps immensely to provide an
>> >> example of the output you desire.
>> >
>> > I would add to add the following couple lines to a file:
>> >
>> > for i in range(5):
>> >     print('Hello\tWorld')
>> >
>> > Consider the leading whitespace to be a tab.
>>
>> import sys
>>
>> lines = r'''
>> for line in range(5):
>>     print('hello\tworld')
>> '''
>>
>> print(lines.strip())
>>
>> sys.stdout.write(lines.strip())
>> sys.stdout.write('\n')
>
> Thanks! But will this work if I already have a string through a string
> variable, rather than using it directly linke you did (by declaring the
> lines variable)?
> And, will this work while writing to files?
>
> Sam

If I understand you correctly, then no.

>>> a = '%s' % 'a\tb'  # we have a string with a tab (similar to what we expect from the NNTP server?)
>>> print(a)  # this is not what you want in the file
a       b
>>> print(repr(a))  # maybe this is the conversion you need
'a\tb'
>>> print(repr(a)[1:-1])  # or maybe this
a\tb

From tjreedy at udel.edu Wed Jul 5 17:13:46 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 5 Jul 2017 17:13:46 -0400
Subject: get value from list using widget
In-Reply-To: <352a90939e253dd9900b1cb6f85ea797@cptec.inpe.br>
References: <352a90939e253dd9900b1cb6f85ea797@cptec.inpe.br>
Message-ID:

On 7/5/2017 12:34 PM, jorge.conrado at cptec.inpe.br wrote:

> I would like know dow can I select and get the value from a list of
> values uisng widgets.

One way is to learn tkinter and then learn to use the Listbox widget.
The doc references a couple of decent tutorial web sites. Stackoverflow
has many good tkinter examples (in the answers, not the questions ;-).

--
Terry Jan Reedy

From grant.b.edwards at gmail.com Wed Jul 5 17:51:11 2017
From: grant.b.edwards at gmail.com (Grant Edwards)
Date: Wed, 5 Jul 2017 21:51:11 +0000 (UTC)
Subject: EuroPython 2017: Free Intel Distribution for Python
References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> <3b19ae5b-cff1-1285-648a-06d93d285a43@mrabarnett.plus.com>
Message-ID:

On 2017-07-05, Terry Reedy wrote:
> On 7/5/2017 9:51 AM, Grant Edwards wrote:
>
>> Just click on the link on the europython blog page.
>
> What link? The first screen has 7 links to EuroPython and no obvious
> links to Intel.
>
> Oh, the headline is a cleverly disguised link that does not look like a
> link until one mouses over it. (And one a couple of screenfulls farther
> on.)

Yup. It's annoying. That trick of hiding links has become quite
fashionable -- I don't know why. Somebody must think it's cute or
clever. Mostly, it's just annoying. :)

--
Grant Edwards               grant.b.edwards        Yow! My mind is making
                                  at               ashtrays in Dayton ...
                              gmail.com

From greg.ewing at canterbury.ac.nz Wed Jul 5 19:47:20 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Thu, 06 Jul 2017 11:47:20 +1200
Subject: School Management System in Python
In-Reply-To:
References: <3b464032-460e-1baf-8ba0-bcf9473f2638@gmx.com>
Message-ID:

> On 07/05/2017 03:18 PM, YOUR_NAME_HERE wrote:
>
>>Some people complain that tsv has problems, so maybe csv would be the way to go.

The main downside to tsv is that it can be hard to deal with
in a text editor -- the difference between tabs and spaces
is not visually obvious.

The only reason I can think of to want to use tsv instead
of csv is that you can sometimes get away without having
to quote things that would need quoting in csv. But that's
not an issue in Python, since the csv module takes care of
all of that for you.

--
Greg

From greg.ewing at canterbury.ac.nz Wed Jul 5 19:56:30 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Thu, 06 Jul 2017 11:56:30 +1200
Subject: EuroPython 2017: Free Intel Distribution for Python
In-Reply-To:
References: <61d2cd6d-5481-dd01-378e-6bf80405450b@udel.edu> <3b19ae5b-cff1-1285-648a-06d93d285a43@mrabarnett.plus.com>
Message-ID:

Grant Edwards wrote:
> That trick of hiding links has become quite
> fashionable -- I don't know why.

Probably the result of graphic arts people who think that
appearance is everything and don't really understand the
web.

--
Greg

From python.list at tim.thechases.com Wed Jul 5 20:24:30 2017
From: python.list at tim.thechases.com (Tim Chase)
Date: Wed, 5 Jul 2017 19:24:30 -0500
Subject: School Management System in Python
In-Reply-To:
References: <3b464032-460e-1baf-8ba0-bcf9473f2638@gmx.com>
Message-ID: <20170705192430.2268b3d4@bigbox.christie.dr>

On 2017-07-06 11:47, Gregory Ewing wrote:
> The only reason I can think of to want to use tsv instead
> of csv is that you can sometimes get away without having
> to quote things that would need quoting in csv.
> But that's not an issue in Python, since the csv module takes care of
> all of that for you.

I work with thousands of CSV/TSV data files from dozens-to-hundreds
of sources (clients and service providers) and have never encountered
a 0x09-as-data needing to be escaped. So my big reason for
preference is that people say "TSV" and I can work with it without a
second thought.

On the other hand, with "CSV", sometimes it's comma-delimited as it
says on the tin. But sometimes it's pipe or semi-colon delimited
while still carrying the ".csv" extension. And sometimes a subset of
values are quoted. Sometimes all the values are quoted. Sometimes
numeric values are quoted to distinguish between
numeric-looking-string and numeric-value. Sometimes escaping is done
with backslashes before the quote-as-value character. Sometimes
escaping is done with doubling-up the quoting-character. Sometimes
CR(0x0D) and/or NL(0x0A) characters are allowed within quoted values;
sometimes they're invalid. Usually fields are quoted with
double-quotes; but sometimes they're single-quoted values. Or
sometimes they're either, depending on the data (much like Python's
REPL prints string representations).

And while, yes, Python's csv module handles most of these with no
issues thanks to the "dialects" concept, I still have to determine
the dialect -- sometimes by sniffing, sometimes by customer/vendor
specification -- but it's not nearly as trivial as

with open("file.txt", "rb") as fp:
    for row in csv.DictReader(fp, delimiter='\t'):
        process(row)

because there's the intermediate muddling of dialect determination or
specification.

And that said, I have a particular longing for a world in which
people actually used the US/RS/GS/FS (Unit/Record/Group/File
separators; AKA 0x1f-0x1c) as defined in ASCII for exactly this
purpose. Sigh.
:-)

-tkc

From ofekmeister at gmail.com Thu Jul 6 02:53:35 2017
From: ofekmeister at gmail.com (ofekmeister at gmail.com)
Date: Wed, 5 Jul 2017 23:53:35 -0700 (PDT)
Subject: Privy: An easy, fast lib to password-protect your data
In-Reply-To:
References: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com> <3d9353da-01b0-4de5-b16a-d067d20f9487@googlegroups.com>
Message-ID:

> The person spamming right now would be you. You just posted a link,
> without any explanations, any marketing blurbs, nothing.

I've explained everything as succinctly as I can in the readme. Pasting
bits of it here would not benefit anyone.

> Why would I use your tool instead of something established, that has
> been properly audited -- say, PGP for example?

Did you read the page? PGP and Privy are used for different things. A key
manager could, though, use Privy to store private keys.

> How do I know your one-man project has no security holes, backdoors,
> or other vulnerabilities? How do I know that the encryption method
> chosen by you is sound? If there is no leaked data?

Privy is a thin wrapper around Cryptography's (OpenSSL) Fernet interface
https://github.com/pyca/cryptography/blob/master/src/cryptography/fernet.py
and https://github.com/hynek/argon2_cffi which is simply a binding to
https://github.com/p-h-c/phc-winner-argon2

Privy itself is really just 40 SLOC
https://github.com/ofek/privy/blob/a3d4bdb24464ad85606c1ab5e78c58ae489b0569/privy/core.py#L42-L82

> And I really dislike the description of your project ...
> What does "password-protecting" mean? Why is this not "encrypting"?

This is encryption, but specifically by means of a password. This paradigm
is often tricky to get correct.
https://security.stackexchange.com/questions/88984/encrypting-with-passwords-encryption-of-key-vs-data

> How do you expect this to work with API keys?

Encrypted keys would likely be stored in a DB somehow.
Check out https://github.com/fugue/credstash

From wissme at free.fr Thu Jul 6 03:08:28 2017
From: wissme at free.fr (Dan Wissme)
Date: Thu, 6 Jul 2017 09:08:28 +0200
Subject: About the implementation of del in Python 3
Message-ID: <595de1eb$0$4818$426a74cc@news.free.fr>

I thought that del L[i] would slide L[i+1:] one place to the left,
filling the hole, but:

>>> L
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
>>> id(L)
4321967496
>>> id(L[5]) # address of 50 ?
4297625504
>>> del L[2]
>>> id(L[4]) # new address of 50 ?
4297625504
>>> id(L)
4321967496

So the element 50 is still at the same memory location.
What does del L[i] do exactly, and what is its complexity? O(1) or O(n)?
Thanks,

dan

From tjreedy at udel.edu Thu Jul 6 03:29:42 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 6 Jul 2017 03:29:42 -0400
Subject: About the implementation of del in Python 3
In-Reply-To: <595de1eb$0$4818$426a74cc@news.free.fr>
References: <595de1eb$0$4818$426a74cc@news.free.fr>
Message-ID:

On 7/6/2017 3:08 AM, Dan Wissme wrote:
> I thought that del L[i] would slide L[i+1:] one place to the left,
> filling the hole, but:
>
> >>> L
> [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
> >>> id(L)
> 4321967496
> >>> id(L[5]) # address of 50 ?
> 4297625504
> >>> del L[2]
> >>> id(L[4]) # new address of 50 ?
> 4297625504
> >>> id(L)
> 4321967496

> So the element 50 is still at the same memory location.
> What does del L[i] do exactly, and what is its complexity? O(1) or O(n)?

A list is an array of references to objects that exist outside of the
list. Del deleted a reference, not any of the objects.

--
Terry Jan Reedy

From jussi.piitulainen at helsinki.fi Thu Jul 6 03:35:27 2017
From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen)
Date: Thu, 06 Jul 2017 10:35:27 +0300
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr>
Message-ID:

Dan Wissme writes:

> I thought that del L[i] would slide L[i+1:] one place to the left,
> filling the hole, but:
>
>>>> L
> [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
>>>> id(L)
> 4321967496
>>>> id(L[5]) # address of 50 ?
> 4297625504
>>>> del L[2]
>>>> id(L[4]) # new address of 50 ?
> 4297625504
>>>> id(L)
> 4321967496
>
> So the element 50 is still at the same memory location.
> What does del L[i] do exactly, and what is its complexity? O(1) or O(n)?

id identifies the object that is stored at that index, not the location.
Locations are not objects.

Consider [L[5], L[5]] where the same object is stored in two different
places. At the implementation level there is some kind of reference in
the internal representation of the list to the representation of the
object somewhere else in memory. At the language level, the object
simply is stored in two places, and that's nothing unusual. Storing or
fetching or passing or returning objects around does not make copies.

Incidentally, let no one point out that ids are not memory addresses.
It says in the interactive help that they are (Python 3.4.0):

Help on built-in function id in module builtins:

id(...)
    id(object) -> integer

    Return the identity of an object. This is guaranteed to be unique
    among simultaneously existing objects. (Hint: it's the object's
    memory address.)
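
The point about identity versus location can be checked with nothing but built-ins. This is only a small sketch: the names fifty, pair, same_in_pair and still_same are illustrative and do not come from the thread. It uses the `is` operator, which compares identities, to show that one object can be reachable from several list slots and that del drops a slot without copying any object.

```python
L = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

fifty = L[5]          # a second reference to the object stored at index 5
pair = [L[5], L[5]]   # the same object, now reachable from two more slots

# `is` compares identities (what id() reports), not positions in a container.
same_in_pair = pair[0] is pair[1]

del L[2]              # removes the slot that referred to 20; no object is copied

# The object that used to sit at index 5 now sits at index 4,
# and it is still the very same object.
still_same = L[4] is fifty

print(same_in_pair, still_same, len(L))
```

Both comparisons hold because the list only ever stored references; the position of a reference inside the list changed, the object it refers to did not.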
From rosuav at gmail.com Thu Jul 6 04:15:19 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 6 Jul 2017 18:15:19 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> Message-ID: On Thu, Jul 6, 2017 at 5:35 PM, Jussi Piitulainen wrote: > Incidentally, let no one point out that ids are not memory addresses. > It says in the interactive help that they are (Python 3.4.0): > > Help on built-in function id in module builtins: > > id(...) > id(object) -> integer > > Return the identity of an object. This is guaranteed to be unique > among simultaneously existing objects. (Hint: it's the object's > memory address.) Sorry, not the case. Help on built-in function id in module builtins: >>> help(id) id(obj, /) Return the identity of an object. This is guaranteed to be unique among simultaneously existing objects. (CPython uses the object's memory address.) >>> help(id) Help on built-in function id in module __builtin__: id(...) >>>> help(id) Help on built-in function id in module __builtin__: id(...) Return the identity of an object: id(x) == id(y) if and only if x is y. The interactive help does not say that in any version newer than the 3.4 that you tested. The function does not return an address, it returns an identity. ChrisA From wissme at free.fr Thu Jul 6 04:51:12 2017 From: wissme at free.fr (Dan Wissme) Date: Thu, 6 Jul 2017 10:51:12 +0200 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> Message-ID: On 06/07/2017 at 09:29, Terry Reedy wrote: > On 7/6/2017 3:08 AM, Dan Wissme wrote: >> I thought that del L[i] would slide L[i+1:] one place to the left, >> filling the hole, but : >> >> >>> L >> [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100] >> >>> id(L) >> 4321967496 >> >>> id(L[5]) # address of 50 ? >> 4297625504 >> >>> del L[2] >> >>> id(L[4]) # new address of 50 ?
>> 4297625504 >> >>> id(L) >> 4321967496 > >> So the element 50 is still at the same memory location. >> What del L[i] do exactly, and what is its complexity ? O(1) or O(n) ? > > A list is an array of references to objects that exist outside of the > list. Del deleted a reference, not any of the objects. So what 'del L[i]' do exactly in memory ? Same as L.pop(i) ? with complexity O(n-i) ? dan From jussi.piitulainen at helsinki.fi Thu Jul 6 05:05:46 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Thu, 06 Jul 2017 12:05:46 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> Message-ID: Chris Angelico writes: > On Thu, Jul 6, 2017 at 5:35 PM, Jussi Piitulainen > wrote: >> Incidentally, let no one point out that ids are not memory addresses. >> It says in the interactive help that they are (Python 3.4.0): >> >> Help on built-in function id in module builtins: >> >> id(...) >> id(object) -> integer >> >> Return the identity of an object. This is guaranteed to be unique >> among simultaneously existing objects. (Hint: it's the object's >> memory address.) > > Sorry, not the case. > > > Help on built-in function id in module builtins: > >>>> help(id) > id(obj, /) > Return the identity of an object. > > This is guaranteed to be unique among simultaneously existing objects. > (CPython uses the object's memory address.) > >>>> help(id) > Help on built-in function id in module __builtin__: > > id(...) > >>>>> help(id) > Help on built-in function id in module __builtin__: > > id(...) > Return the identity of an object: id(x) == id(y) if and only if x is y. > > > The interactive help does not say that in any version newer than the > 3.4 that you tested. The function does not return an address, it > returns an identity. Excellent. I'm happy to withdraw the prohibition. 
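Editorial sketch of the help text quoted above ("id(x) == id(y) if and only if x is y"), which holds for objects that exist at the same time. The names `x`, `y` and `z` are illustrative only. Whether the integer happens to be a memory address is an implementation detail and is not checked here.

```python
x = [1, 2, 3]
y = x              # another name for the same object
z = [1, 2, 3]      # an equal but separate object, alive at the same time

assert isinstance(id(x), int)         # an identity is reported as an integer
assert id(x) == id(y) and x is y      # same object, same id
assert id(x) != id(z) and x is not z  # distinct live objects, distinct ids
assert x == z                         # equality is a separate question
```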
From marko at pacujo.net Thu Jul 6 05:24:41 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 12:24:41 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> Message-ID: <87r2xt2a06.fsf@elektro.pacujo.net> Chris Angelico : > On Thu, Jul 6, 2017 at 5:35 PM, Jussi Piitulainen > wrote: >> Incidentally, let no one point out that ids are not memory addresses. >> It says in the interactive help that they are (Python 3.4.0): >> [...] > > Sorry, not the case. > [...] > > id(...) > Return the identity of an object: id(x) == id(y) if and only if x is y. > > > The interactive help does not say that in any version newer than the > 3.4 that you tested. The function does not return an address, it > returns an identity. While talking about addresses might or might not be constructive, let me just point out that there is no outwardly visible distinction between "address" or "identity".
Equally well, we could replace those words with: serial number fixed asset tag social security number fermionic quantum state face fingerprint cryptographic hash Ignoring the word that is used to talk about object identity, it would be nice to have a precise formal definition for it. For example, I know that any sound implementation of Python would guarantee: >>> def f(a): return a ... >>> a = object() >>> a is f(a) True But how do I know it? Marko From steve+python at pearwood.info Thu Jul 6 09:22:08 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Thu, 06 Jul 2017 23:22:08 +1000 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> Message-ID: <595e3981$0$1600$c3e8da3$5496439d@news.astraweb.com> On Thu, 6 Jul 2017 06:51 pm, Dan Wissme wrote: > So what 'del L[i]' do exactly in memory ? Same as L.pop(i) ? with > complexity O(n-i) ? It depends on what L is and what the value of i is. If L is a list, and i is the last index of the list, then deleting it is quick. If i is 0, then Python has to copy the entire list over by one slot, which will probably be O(n) slow. If you care about the performance of this, then you probably shouldn't use a list. Consider using collections.deque, which is optimized to be fast to insert or delete items at both ends. (But it is slower to do so in the middle.) By the way, the big O algorithmic complexity is not a language promise, so it is possible that it could change in the future, or in some other implementation. I think the only *guarantee* is that: - appending to a list is amortised O(1); - item retrieval from a list is O(1). Anything else is implementation dependent and could (but probably won't) change. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
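Editorial sketch comparing the two containers mentioned in the message above. It assumes a CPython-like implementation, where deleting the first element of a list shifts every remaining reference while `collections.deque` removes from either end without shifting. The helper names `drop_front_list` and `drop_front_deque` are made up for illustration, and the timings printed will vary by machine, so no numbers are claimed.

```python
from collections import deque
from timeit import timeit


def drop_front_list(n, k):
    """Delete k items from the front of a list of n items."""
    items = list(range(n))
    for _ in range(k):
        del items[0]        # remaining references are shifted left each time
    return items


def drop_front_deque(n, k):
    """Delete k items from the front of a deque of n items."""
    items = deque(range(n))
    for _ in range(k):
        items.popleft()     # removal at the end of a deque needs no shifting
    return list(items)


if __name__ == "__main__":
    n, k = 50_000, 5_000
    print("list :", timeit(lambda: drop_front_list(n, k), number=3))
    print("deque:", timeit(lambda: drop_front_deque(n, k), number=3))
```

Both helpers return the same remaining items; only the cost of getting there differs. Deleting from the end of a list (del items[-1]) avoids the shifting as well.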
From steve+python at pearwood.info Thu Jul 6 10:10:45 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 07 Jul 2017 00:10:45 +1000 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> On Thu, 6 Jul 2017 07:24 pm, Marko Rauhamaa wrote: > While talking about addresses might or might not be constructive, let me > just point out that there is no outwardly visible distinction between > "address" or "identity". Er, yes there is. Address refers to a position in space. Identity refers to the state or quality of being identical (i.e. the same). My identity remains the same as I travel from one address to another. > Equally well, we could replace those words with: > > serial number > > fixed asset tag > > social security number Those are good ID numbers. > fermionic quantum state I don't think that is, since two electrons (fermions) in different atoms can be in the same state. > face Identical twins would have something to say about that. > fingerprint In practice, the uniqueness of fingerprints is problematic, but I'll grant that at least in principle they would make good IDs. > cryptographic hash By the pigeonhole principle, not actually unique. Only probably unique, provided the number of objects given IDs is significantly smaller than the total number of hashes. E.g. if you have 2**128+1 objects using a 128-bit hash, then there must be two distinct objects with the same hash. > Ignoring the word that is used to talk about object identity, it would > be nice to have a precise formal definition for it. For example, I know > that any sound implementation of Python would guarantee: > > >>> def f(a): return a > ... > >>> a = object() > >>> a is f(a) > True > > But how do I know it? Which part is unclear? The fact that f(a) returns a, or the fact that `a is a` is true? 
First part is implied by Python's execution model, and the second by the definition of the `is` operator. I'm genuinely unsure what part of this you think needs a precise formal definition. (That's even putting aside that it may not be possible to give a precise formal definition of "identity". See, for example, "The Axe of my Grandfather" paradox.) -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. From jussi.piitulainen at helsinki.fi Thu Jul 6 10:29:42 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Thu, 06 Jul 2017 17:29:42 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: Marko Rauhamaa writes: > While talking about addresses might or might not be constructive, let > me just point out that there is no outwardly visible distinction > between "address" or "identity". With a generational or otherwise compacting garbage collector there would be. I believe that to be a valid implementation strategy. Or you are using "address" in some abstract sense so that the "address" does not change when the internal representation of the object is moved to another location. > Ignoring the word that is used to talk about object identity, it would > be nice to have a precise formal definition for it. For example, I > know that any sound implementation of Python would guarantee: > > >>> def f(a): return a > ... > >>> a = object() > >>> a is f(a) > True > > But how do I know it? For me it's enough to know that it's the object itself that is passed around as an argument, as a returned value, as a stored value, as a value of a variable. This is the basic fact that lets me understand the behaviour and performance of programs.
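Editorial sketch of the claim in the message above that the object itself is what gets passed, returned, stored and bound. It reuses the function `f` from the quoted session; the names `box` and `alias` are introduced here for illustration.

```python
def f(a):
    return a                  # returns the very object it was given


a = object()
assert f(a) is a              # passing and returning do not copy

box = {"key": a}              # storing does not copy either
assert box["key"] is a

alias = a                     # neither does binding another name
assert alias is a
```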
From rosuav at gmail.com Thu Jul 6 10:34:29 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 00:34:29 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <87r2xt2a06.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: On Thu, Jul 6, 2017 at 7:24 PM, Marko Rauhamaa wrote: > While talking about addresses might or might not be constructive, let me > just point out that there is no outwardly visible distinction between > "address" or "identity". > > Equally well, we could replace those words with: > > serial number > > fixed asset tag > > social security number > > fermionic quantum state > > face > > fingerprint > > cryptographic hash Not so. An address is a place where you can look for something. If you go to 51 Franklin Street, Boston, MA, USA, you should find the offices of the Free Software Foundation. That wouldn't be the case if you go to the FSF's serial number or fermionic quantum state. And a cryptographic hash is a function of something's value, not its identity; two identical strings have the same hash, even if they are unique objects. > Ignoring the word that is used to talk about object identity, it would > be nice to have a precise formal definition for it. For example, I know > that any sound implementation of Python would guarantee: > > >>> def f(a): return a > ... > >>> a = object() > >>> a is f(a) > True > > But how do I know it? The formal definition is that objects have identities, and that assignment (including function parameters and return values) gives you a reference to the same object. "A person just walked into the revolving door and came back out again." "Is it the same person?" "I don't know. What's the definition of identity?" Of course it's the same person. You don't need to identify that person by a social security number in order to say "the SAME PERSON came back out". You identify him/her by... identity. 
ChrisA From python at mrabarnett.plus.com Thu Jul 6 10:56:13 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 6 Jul 2017 15:56:13 +0100 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: On 2017-07-06 15:29, Jussi Piitulainen wrote: > Marko Rauhamaa writes: > >> While talking about addresses might or might not be constructive, let >> me just point out that there is no outwardly visible distinction >> between "address" or "identity". > > With a generational or otherwise compacting garbage collector there > would be. I believe that to be a valid implementation strategy. > > Or you are using "address" in some abstract sense so that the "address" > does not change when the internal representation of the object is moved > to another location. > >> Ignoring the word that is used to talk about object identity, it would >> be nice to have a precise formal definition for it. For example, I >> know that any sound implementation of Python would guarantee: >> >> >>> def f(a): return a >> ... >> >>> a = object() >> >>> a is f(a) >> True >> >> But how do I know it? > > For me it's enough to know that it's the object itself that is passed > around as an argument, as a returned value, as a stored value, as a > value of a variable. This is the basic fact that lets me understand the > behaviour and performance of programs. > Perhaps you should be thinking of it as passing around the end of a piece of string, the other end being tied to the object itself. 
:-) From marko at pacujo.net Thu Jul 6 10:59:00 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 17:59:00 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: <878tk11uiz.fsf@elektro.pacujo.net> Jussi Piitulainen : > Marko Rauhamaa writes: > >> While talking about addresses might or might not be constructive, let >> me just point out that there is no outwardly visible distinction >> between "address" or "identity". > > With a generational or otherwise compacting garbage collector there > would be. I believe that to be a valid implementation strategy. > > Or you are using "address" in some abstract sense so that the "address" > does not change when the internal representation of the object is moved > to another location. "Address" is just a word. In fact, I don't think there is any definition in the Python data model that makes use of the term. Personally, when talking about Python, I would regard "address" as an endearing synonym for "identity". >> Ignoring the word that is used to talk about object identity, it would >> be nice to have a precise formal definition for it. For example, I >> know that any sound implementation of Python would guarantee: >> >> >>> def f(a): return a >> ... >> >>> a = object() >> >>> a is f(a) >> True >> >> But how do I know it? > > For me it's enough to know that it's the object itself that is passed > around as an argument, as a returned value, as a stored value, as a > value of a variable. This is the basic fact that lets me understand > the behaviour and performance of programs. That "definition" is very circular. You haven't yet defined what is "object itself". The word "self", in particular, looks like yet another synonym of "identity". Anyway, it would be nice to have an explicit statement in the language definition that says that passing an argument and returning a value preserve the identity.
Marko From marko at pacujo.net Thu Jul 6 11:21:39 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 18:21:39 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> Message-ID: <874lup1th8.fsf@elektro.pacujo.net> Steve D'Aprano : > On Thu, 6 Jul 2017 07:24 pm, Marko Rauhamaa wrote: > >> While talking about addresses might or might not be constructive, let >> me just point out that there is no outwardly visible distinction >> between "address" or "identity". > > Er, yes there is. Address refers to a position in space. Identity > refers to the state or quality of being identical (i.e. the same). My > identity remains the same as I travel from one address to another. That sounds metaphysical. What I'm looking for is snippets of Python code that illustrate the difference. That's how you can illustrate the difference between the "==" and "is" operators: >>> ["a"] is ["a"] False >>> ["a"] == ["a"] True > Those are good ID numbers. > >> fermionic quantum state > > I don't think that is, since two electrons (fermions) in different > atoms can be in the same state. Beside the topic but (unlike the bosons) every fermion in the universe differs in at least one parameter from all the rest. In your case, they belong to different atoms. >> Ignoring the word that is used to talk about object identity, it would >> be nice to have a precise formal definition for it. For example, I know >> that any sound implementation of Python would guarantee: >> >> >>> def f(a): return a >> ... >> >>> a = object() >> >>> a is f(a) >> True >> >> But how do I know it? > > Which part is unclear? The fact that f(a) returns a, or the fact that > `a is a` is true? In fact, a is a would be a *great* start for a formal definition/requirement of the "is" operator, although you'd have to generalize it to b is b c is c etc as well. 
Unfortunately, when I try it, I get: >>> a is a Traceback (most recent call last): File "", line 1, in NameError: name 'a' is not defined Actually, getting the wording right in these kinds of definitions is surprisingly tricky. > First part is implied by Python's execution model, [Citation needed] > and the second by the definition of the `is` operator. [Citation needed] > I'm genuinely unsure what part of this you think needs a precise > formal definition. > > (That's even putting aside that it may not be possible to give a > precise formal definition of "identity". See, for example, "The Axe of > my Grandfather" paradox.) There are many ways to define identity: 1. Map Python's data model to that of another programming language, for example C. This technique is very common and useful. No wonder the word "address" keeps popping up. 2. List a number of formal requirements: any implementation that complies with the requirements is a valid implementation of the language. Marko From jussi.piitulainen at helsinki.fi Thu Jul 6 11:37:34 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Thu, 06 Jul 2017 18:37:34 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: MRAB writes: > On 2017-07-06 15:29, Jussi Piitulainen wrote: >> Marko Rauhamaa writes: >> >>> While talking about addresses might or might not be constructive, >>> let me just point out that there is no outwardly visible distinction >>> between "address" or "identity". >> >> With a generational or otherwise compacting garbage collector there >> would be. I believe that to be a valid implementation strategy. >> >> Or you are using "address" in some abstract sense so that the >> "address" does not change when the internal representation of the >> object is moved to another location. 
>> >>> Ignoring the word that is used to talk about object identity, it >>> would be nice to have a precise formal definition for it. For >>> example, I know that any sound implementation of Python would >>> guarantee: >>> >>> >>> def f(a): return a >>> ... >>> >>> a = object() >>> >>> a is f(a) >>> True >>> >>> But how do I know it? >> >> For me it's enough to know that it's the object itself that is passed >> around as an argument, as a returned value, as a stored value, as a >> value of a variable. This is the basic fact that lets me understand >> the behaviour and performance of programs. >> > Perhaps you should be thinking of it as passing around the end of a > piece of string, the other end being tied to the object itself. :-) I don't find that helpful, and I don't find myself in need of such help. Most of the time that piece of string is (those pieces of string are) just a distraction to me. They get in the way. So I *don't*. From marko at pacujo.net Thu Jul 6 11:41:41 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 18:41:41 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: <87zichzi6i.fsf@elektro.pacujo.net> Chris Angelico : > The formal definition is that objects have identities, and that > assignment (including function parameters and return values) gives you > a reference to the same object. My example didn't contain a single assignment, but a variation of your statement would make a good part in a definition of identity. > "A person just walked into the revolving door and came back out > again." "Is it the same person?" "I don't know. What's the definition > of identity?" > > Of course it's the same person. You don't need to identify that person > by a social security number in order to say "the SAME PERSON came back > out". You identify him/her by... identity. 
Here's how identity is dealt with in First-Order Logic: In other words, identity is mapped to the "sameness" in a domain of discourse. In Second-Order Logic, you can define identity directly: ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y)) Programming languages are different beasts, of course, but "objects" and "identity" are such important foundational topics that you'd expect a bit more than hand-waving when defining the data model. As a good example of the style I'm looking for, take a look at: Marko From rosuav at gmail.com Thu Jul 6 11:51:18 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 01:51:18 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 12:56 AM, MRAB wrote: > Perhaps you should be thinking of it as passing around the end of a piece of > string, the other end being tied to the object itself. :-) You mean like Elbonian currency? http://dilbert.com/strip/2008-09-15 ChrisA From jussi.piitulainen at helsinki.fi Thu Jul 6 11:57:30 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Thu, 06 Jul 2017 18:57:30 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <878tk11uiz.fsf@elektro.pacujo.net> Message-ID: Marko Rauhamaa writes: > Jussi Piitulainen: > >> For me it's enough to know that it's the object itself that is passed >> around as an argument, as a returned value, as a stored value, as a >> value of a variable. This is the basic fact that lets me understand >> the behaviour and performance of programs. > > That "definition" is very circular. You haven't yet defined what is > "object itself". The word "self", in particular, looks like yet > another synonym of "identity". Yes, I regard the identity of an object as the most *basic* thing.
> Anyway, it would be nice to have an explicit statement in the language > definition that says that passing an argument and returning a value > preserve the identity. Isn't there? I think it's at least very strongly implied. From rosuav at gmail.com Thu Jul 6 12:13:54 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 02:13:54 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <878tk11uiz.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <878tk11uiz.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 12:59 AM, Marko Rauhamaa wrote: >> Or you are using "address" in some abstract sense so that the "address" >> does not change when the internal representation of the object is moved >> to another location. > > "Address" is just a word. In fact, I don't think there is any definition > in the Python data model that makes use of the term. > > Personally, when talking about Python, I would regard "address" as an > endearing synonym for "identity". Why? The word has a perfectly good meaning, and it's based on "addressability". That is, you can take the address and use it to locate the thing in question. I have an email address; you can use that email address to send information to me. When you do, you'll get in touch with the mail server by sending packets to its IP address, and the internet's routing rules will get them where they need to go. If you want to ship me physical items, you'll need to know my street address, which tells you where in Australia you can find my letter box. None of this has anything to do with my identity; I've had a number of addresses of each type, and if you use an outdated one, you won't get to me. Objects have identities in Python even if they haven't yet been assigned ID numbers. It's hard to probe this (impossible in CPython), but you can see the effects of it by messing around in Jython. 
ChrisA From rosuav at gmail.com Thu Jul 6 12:27:21 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 02:27:21 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <874lup1th8.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 1:21 AM, Marko Rauhamaa wrote: > Steve D'Aprano : > >> On Thu, 6 Jul 2017 07:24 pm, Marko Rauhamaa wrote: >> >>> While talking about addresses might or might not be constructive, let >>> me just point out that there is no outwardly visible distinction >>> between "address" or "identity". >> >> Er, yes there is. Address refers to a position in space. Identity >> refers to the state or quality of being identical (i.e. the same). My >> identity remains the same as I travel from one address to another. > > That sounds metaphysical. > > What I'm looking for is snippets of Python code that illustrate the > difference. > > That's how you can illustrate the difference between the "==" and "is" > operators: > > >>> ["a"] is ["a"] > False > >>> ["a"] == ["a"] > True When you have an address, you can use that to locate the thing. In C, that's pointer dereferencing. If I give you the id of a Python object, can you locate that object and find out something about it? If you can't, it's not an address. And you have to make this work in Python (the language), not just CPython (the interpreter). I'll consider an answer satisfactory if it runs on the Big Four - CPython, Jython, IronPython, and PyPy - and mainly, you need to show that it works in Jython, because that's the most different. >> Which part is unclear? The fact that f(a) returns a, or the fact that >> `a is a` is true? 
> > In fact, > > a is a > > would be a *great* start for a formal definition/requirement of the "is" > operator, although you'd have to generalize it to > > b is b > c is c > > etc as well. Unfortunately, when I try it, I get: > > >>> a is a > Traceback (most recent call last): > File "", line 1, in > NameError: name 'a' is not defined And oh how terrible, Python doesn't define any other operators on unassigned names either. The 'is' operator is looking at OBJECT identity, so you need to have an OBJECT. If you don't understand that part of Python's object model, I recommend learning from Ned Batchelder: https://nedbatchelder.com/text/names1.html >> First part is implied by Python's execution model, > > [Citation needed] https://docs.python.org/3/reference/executionmodel.html >> and the second by the definition of the `is` operator. > > [Citation needed] https://docs.python.org/3/reference/expressions.html#is-not > There are many ways to define identity: > > 1. Map Python's data model to that of another programming language, for > example C. This technique is very common and useful. No wonder the > word "address" keeps popping up. Very common, yes, but not so useful. Python's data model is not the same as C's, and it's easier to explain without diving into the lower level details. Objects exist. You can ask a child about object identity and you'll get sane responses. * "If I put the ball behind my back, is it still a ball?" * "If I put the ball inside this box, is it the same ball?" * "I dip this ball in water; it is now wet. Is it still the same ball?" No mention of pointers. No mention of C. > 2. List a number of formal requirements: any implementation that > complies with the requirements is a valid implementation of the > language. You want to go down that path? Okay. Start reading https://docs.python.org/3/reference/ and let me know when you're done. You may notice that it was easily able to supply the dolphins you needed above. 
Just don't try explaining to a novice programmer that way. It's a waste of everyone's time. ChrisA From ian.g.kelly at gmail.com Thu Jul 6 12:28:27 2017 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Thu, 6 Jul 2017 10:28:27 -0600 Subject: About the implementation of del in Python 3 In-Reply-To: <87zichzi6i.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> Message-ID: On Thu, Jul 6, 2017 at 9:41 AM, Marko Rauhamaa wrote: > As a good example of the style I'm looking for, take a look at: > > Java reference types have basically the same concept of identity as Python objects, so I dug around to find what definition Java uses. This is what I came up with: """ There may be many references to the same object. Most objects have state, stored in the fields of objects that are instances of classes or in the variables that are the components of an array object. If two variables contain references to the same object, the state of the object can be modified using one variable's reference to the object, and then the altered state can be observed through the reference in the other variable. """ Also, under the reference equality operator: """ At run time, the result of == is true if the operand values are both null or both refer to the same object or array; otherwise, the result is false. The result of != is false if the operand values are both null or both refer to the same object or array; otherwise, the result is true. """ If that language were used for Python, would it suffice for you? 
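Editorial sketch of how the Java wording quoted above reads when applied to Python: if two variables refer to the same object, a change of state made through one is observable through the other, whereas an equal but separate object is unaffected. The names `first`, `second` and `copy` are illustrative only.

```python
first = ["a"]
second = first              # both names refer to the same list object
second.append("b")          # state changed through one reference...
assert first == ["a", "b"]  # ...is observed through the other
assert first is second

copy = list(first)          # a new object with equal contents
copy.append("c")
assert first == ["a", "b"]  # the original is not affected
assert copy is not first
```

This is the same distinction as the `==` versus `is` session earlier in the thread, seen through mutation instead of comparison.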
From ofekmeister at gmail.com Thu Jul 6 12:53:45 2017 From: ofekmeister at gmail.com (ofekmeister at gmail.com) Date: Thu, 6 Jul 2017 09:53:45 -0700 (PDT) Subject: Privy: An easy, fast lib to password-protect your data In-Reply-To: References: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com> <3d9353da-01b0-4de5-b16a-d067d20f9487@googlegroups.com> Message-ID: Do you better understand what Privy is for now? If so, is there anything in particular you think could be made more clear in the docs? From marko at pacujo.net Thu Jul 6 13:05:18 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 20:05:18 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> Message-ID: <87k23lpkc1.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 7, 2017 at 1:21 AM, Marko Rauhamaa wrote: >> What I'm looking for is snippets of Python code that illustrate the >> difference. >> >> That's how you can illustrate the difference between the "==" and "is" >> operators: >> >> >>> ["a"] is ["a"] >> False >> >>> ["a"] == ["a"] >> True > > When you have an address, you can use that to locate the thing. You are pulling that out of your hat (or a dictionary). Python doesn't define the concept at all (yet it manages to locate every useful thing there is). > In C, that's pointer dereferencing. If I give you the id of a Python > object, can you locate that object and find out something about it? If > you can't, it's not an address. > > And you have to make this work in Python (the language), not just > CPython (the interpreter). I'll consider an answer satisfactory if it > runs on the Big Four - CPython, Jython, IronPython, and PyPy - and > mainly, you need to show that it works in Jython, because that's the > most different. I don't follow you. A code example, please. 
>> Unfortunately, when I try it, I get: >> >> >>> a is a >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> NameError: name 'a' is not defined > > And oh how terrible, Python doesn't define any other operators on > unassigned names either. The 'is' operator is looking at OBJECT > identity, so you need to have an OBJECT. If you don't understand that > part of Python's object model, I recommend learning from Ned > Batchelder: It's enough to point to the Language Reference if you have a link. >>> First part is implied by Python's execution model, >> >> [Citation needed] > > https://docs.python.org/3/reference/executionmodel.html Sorry, which sentence? >>> and the second by the definition of the `is` operator. >> >> [Citation needed] > > https://docs.python.org/3/reference/expressions.html#is-not That one in its entirety: The operators "is" and "is not" test for object identity: x is y is true if and only if x and y are the same object. Object identity is determined using the id() function. x is not y yields the inverse truth value. That simply defines the identity with whatever is returned by the id() function. The id() function, in turn, is defined to be: Return the "identity" of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value. which doesn't clarify anything. > Just don't try explaining to a novice programmer that way. It's a > waste of everyone's time. I believe the concept of an object is among the more difficult things for novice programmers to get.
Marko From rosuav at gmail.com Thu Jul 6 13:23:56 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 03:23:56 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <87k23lpkc1.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <87k23lpkc1.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 3:05 AM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 7, 2017 at 1:21 AM, Marko Rauhamaa wrote: >>> What I'm looking for is snippets of Python code that illustrate the >>> difference. >>> >>> That's how you can illustrate the difference between the "==" and "is" >>> operators: >>> >>> >>> ["a"] is ["a"] >>> False >>> >>> ["a"] == ["a"] >>> True >> >> When you have an address, you can use that to locate the thing. > > You are pulling that out of your hat (or a dictionary). Python doesn't > define the concept at all (yet it manages to locate every useful thing > there is). That's right. Python does not use the term "address" as part of its object model. So when you ask me to define it, all I can do is look outside of Python. There might be a reason for this. Like, maybe, that Python doesn't use addresses in its object model. >> In C, that's pointer dereferencing. If I give you the id of a Python >> object, can you locate that object and find out something about it? If >> you can't, it's not an address. >> >> And you have to make this work in Python (the language), not just >> CPython (the interpreter). I'll consider an answer satisfactory if it >> runs on the Big Four - CPython, Jython, IronPython, and PyPy - and >> mainly, you need to show that it works in Jython, because that's the >> most different. > > I don't follow you. A code example, please. If I have the address of something in C, I can dereference that and get at the data therein. Can you do that in Python? 
If you want to claim that object identity is a form of address, you should be able to show that it can be used as one. >>>> First part is implied by Python's execution model, >>> >>> [Citation needed] >> >> https://docs.python.org/3/reference/executionmodel.html > > Sorry, which sentence? I think you're probably missing some fundamentals of how Python works, so have a read of the whole page. >> https://docs.python.org/3/reference/expressions.html#is-not > > That one in its entirety: > > The operators "is" and "is not" test for object identity: x is y is > true if and only if x and y are the same object. Object identity is > determined using the id() function. x is not y yields the inverse > truth value. > > That simply defines the identity with whatever is returned by the id() > function. No; it defines object identity as "are the same object", relying on our external comprehension of identity. It then says that you can coalesce the identity to an integer using the id() function. > The id() function, in turn, is defined to be: > > Return the "identity" of an object. This is an integer which is > guaranteed to be unique and constant for this object during its > lifetime. Two objects with non-overlapping lifetimes may have the > same id() value. > > which doesn't clarify anything. If you start with object identity, the id() function makes sense given the above definition. You cannot, however, use the id() function to define object identity, since id values can be reused. >> Just don't try explaining to a novice programmer that way. It's a >> waste of everyone's time. > > I believe the concept of an object is among the more difficult things > for novice programmers to get. Maybe, for certain technical definitions of "object" (eg in Python, everything is an object, but in JavaScript, an object is a mapping type, as distinct from an array, except that an array is an object too, and... yeah); but the concept of a "thing" is fairly well understood.
I have explained Python's object model to people using physical things on my desk, and never had anyone go "but what's the address of that thing". People grok Python's objects just fine when you relate them to real-world objects. I can explain merge sort using a deck of cards and my desk, I can explain the stack using the same tools, and I can even explain stuff using vanilla-flavoured sugar - because Python parallels our world beautifully. (So do many other languages. I've explained JS to people the same way. And I've never explained JS by starting with C.) ChrisA From rhodri at kynesim.co.uk Thu Jul 6 13:26:18 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 6 Jul 2017 18:26:18 +0100 Subject: Privy: An easy, fast lib to password-protect your data In-Reply-To: References: <9babf2d5-5bfe-4096-bc6b-55f6e22590bb@googlegroups.com> <3d9353da-01b0-4de5-b16a-d067d20f9487@googlegroups.com> Message-ID: <61dde30b-d5d8-61b3-d107-322b880dea76@kynesim.co.uk> On 06/07/17 17:53, ofekmeister at gmail.com wrote: > Do you better understand what Privy is for now? If so, is there anything in particular you think could be made more clear in the docs? > I think the point is that you failed to include any context in your advert. An unadorned link in a post will gather interest only from the sort of people who haven't yet collected enough viruses to learn not to click on untrusted links. While these may be the people who need your product, they probably won't use it. The rest of us remain ignorant but safe. Then you went and bumped it, which is just plain rude. 
-- Rhodri James *-* Kynesim Ltd From marko at pacujo.net Thu Jul 6 13:38:49 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 20:38:49 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> Message-ID: <87fue9pis6.fsf@elektro.pacujo.net> Ian Kelly : > On Thu, Jul 6, 2017 at 9:41 AM, Marko Rauhamaa wrote: >> As a good example of the style I'm looking for, take a look at: >> >> > > Java reference types have basically the same concept of identity as > Python objects, so I dug around to find what definition Java uses. Good for you! > [...] > If that language were used for Python, would it suffice for you? Unfortunately, the Java definition, which does a good job elsewhere, fails here. Maybe it's suggestive of the difficulty of the topic. Notice that Scheme refers directory to conventional RAM: Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations. A string, for example, denotes as many locations as there are characters in the string. (These locations need not correspond to a full machine word.) [...] The eqv? procedure returns #t if: [...] * obj1 and obj2 are pairs, vectors, or strings that denote the same locations in the store (section 3.4).
Marko From marko at pacujo.net Thu Jul 6 13:42:11 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 06 Jul 2017 20:42:11 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> Message-ID: <878tk1pimk.fsf@elektro.pacujo.net> Marko Rauhamaa : > Notice that Scheme refers directory to conventional RAM: s/directory/directly/ From steve+python at pearwood.info Thu Jul 6 14:12:58 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 07 Jul 2017 04:12:58 +1000 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> Message-ID: <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> On Fri, 7 Jul 2017 01:21 am, Marko Rauhamaa wrote: > Steve D'Aprano : > >> On Thu, 6 Jul 2017 07:24 pm, Marko Rauhamaa wrote: >> >>> While talking about addresses might or might not be constructive, let >>> me just point out that there is no outwardly visible distinction >>> between "address" or "identity". >> >> Er, yes there is. Address refers to a position in space. Identity >> refers to the state or quality of being identical (i.e. the same). My >> identity remains the same as I travel from one address to another. > > That sounds metaphysical. A concrete example of a person moving from one address to another is about as far from metaphysical as it is possible to get. You've done this repeatedly over the years, insisting that Python needs a formal definition of identity and claiming that the definitions given are circular, based on what seems to me to be metaphysical and vague criticisms. In practice, none of your questions about identity seem to make the slightest bit of difference to Python programming. 
The Python concept of identity, as informal as it might be, works, and leads to a consistent and understandable programming model. I still don't understand what you get out of this. Will it make you a better programmer? No. Will it help you understand Python code better? No. Will it help you predict the result of calling the `is` operator? No, you seem to be perfectly able to correctly predict the result when required. So apart from the chance that you're just trolling us for fun, I don't understand what possible motive you have or why you think this is an important question. But on the possibility that you are genuinely interested in the answers, I'll try to answer your questions below. > What I'm looking for is snippets of Python code that illustrate the > difference. The difference between *what*? Your statement here is too vague for me to understand, but I'll take a guess: you want some code demonstrating the difference between address and identity. Well, for starters, that's a category error: address and identity are not the same kind of thing, and so cannot be compared directly. An address is a concrete location or place, in other words a physical position in some space, while identity is the abstract state or quality of being identical (sameness), in other words a state of being. In full generality, I doubt that there is any definition of "identity" which can be fully satisfactory, but in the context of Python we can be more specific: Nominally two objects are identical (have the same identity) if they are one and the same object, and are not identical if they are different objects. Python doesn't provide any functions for getting the address of objects, in fact the Python execution model doesn't have a concept of an object's physical location. 
But we can model location by considering the index of an object in a list or array as a kind of address: a = [1, 2, None] b = [1, 2, 3, 4, 5, 6, 7, 8, None] # Address of None, within a assert a.index(None) == 2 # Address of None, within b assert b.index(None) == 8 # Are these two references to None references to the same object? assert a[2] is b[8] There is no function or operator or statement in Python that returns the identity of an object. Such a thing cannot exist: identity refers to a state of being, like "hot" or "cold" or "dead" or "alive", it is not in and of itself a value that can be returned any more than I can hold "alive" in my hand. It is an abstractum (an abstract thing): https://en.wikipedia.org/wiki/Abstract_and_concrete Treating identity as a thing itself is an example of the fallacy of reification: https://en.wikipedia.org/wiki/Reification_%28fallacy%29 Hence: Abstract: identity (of a certain thing, e.g. the identity of a person) Concrete: a specific thing (e.g. a specific person) And in the context of Python: Abstract: object identity Concrete: this particular object What Python does provide is a function to compare (nominally) two values and return True if they refer to the same object in the Python virtual machine, otherwise return False. Namely the `is` operator. Abstract: do the two operands have the same identity? Concrete: are the two operands the same object? (Note that the Python VM is an abstraction which is emulated by the interpreter. The Python VM operates in terms of objects, but the interpreter may be written in a language without objects. Fundamentally computers do nothing but flip bits, so it is abstractions all the way down. And even flipping bits is an abstraction.) Python also provides a function "id()" which returns an ID number that uniquely distinguishes the object from any other object, providing the two exist at the same time. The ID number is not the same thing as the object itself, nor is it the "identity" of the object.
It's just a label for the object, one which can cease to be valid once the object ceases to exist. You are not your social security number, or tax number, or your membership number at the local branch of the Loyal Order of Water Buffalo Lodge. They are just labels that represent you in a compact form. > That's how you can illustrate the difference between the "==" and "is" > operators: > > >>> ["a"] is ["a"] > False > >>> ["a"] == ["a"] > True > >> Those are good ID numbers. >> >>> fermionic quantum state >> >> I don't think that is, since two electrons (fermions) in different >> atoms can be in the same state. > > Beside the topic but (unlike the bosons) every fermion in the universe > differs in at least one parameter from all the rest. In your case, they > belong to different atoms. "The atom you are bound to" is not a quantum state. That's more like location, I guess. The prohibition of fermions being in the same state implies "at the same time, in the same place". In any case, I see your point, and quibbling about fermion state isn't really shedding light on anything. >>> Ignoring the word that is used to talk about object identity, it would >>> be nice to have a precise formal definition for it. For example, I know >>> that any sound implementation of Python would guarantee: >>> >>> >>> def f(a): return a >>> ... >>> >>> a = object() >>> >>> a is f(a) >>> True >>> >>> But how do I know it? >> >> Which part is unclear? The fact that f(a) returns a, or the fact that >> `a is a` is true? > > In fact, > > a is a > > would be a *great* start for a formal definition/requirement of the "is" > operator, although you'd have to generalize it to > > b is b > c is c > > etc as well. Why do you have to generalise it? The name "a" here stands in for any reference to any object. If I give you the mathematical equation: 2x + 3x = 5x you should understand that it holds true for any x. I don't have to generalise it to: 2y + 3y = 5y 2z + 3z = 5z 2a + 3a = 5a 2b + 3b = 5b etc. 
> Unfortunately, when I try it, I get: > > >>> a is a > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > NameError: name 'a' is not defined Now you're just trolling. If "a" is not defined, of course it isn't identical to itself, since it doesn't exist. If there's no actual object, there's nothing to be identical. "The fourth side of this triangle is longer than the hypotenuse" is likewise an error. Triangles don't have four sides, the fourth side does not exist, and it is nonsensical to describe it as either longer or shorter than the hypotenuse. Binary valued logic has difficulty with statements such as these, so-called "vacuous truths". Sometimes it is better to just raise an exception, as Python does, or take a third option: Question: Is the fourth side of this triangle longer than the hypotenuse? Answer: Mu. https://en.wikipedia.org/wiki/Mu_%28negative%29 > Actually, getting the wording right in these kinds of definitions is > surprisingly tricky. > >> First part is implied by Python's execution model, > > [Citation needed] Good question! I thought this was covered in the execution model page of the docs: https://docs.python.org/3/reference/executionmodel.html but I was wrong, it doesn't seem to describe the semantics in question. Nor does the section on function call semantics: https://docs.python.org/3/reference/expressions.html#calls although it comes close: "Otherwise, the value of the argument is placed in the slot" Unfortunately that's insufficient. The FAQs add a little more information: https://docs.python.org/2/faq/programming.html#how-do-i-write-a-function-with-output-parameters-call-by-reference I think, unless I've missed something, that the current documentation doesn't define the way arguments are bound to function formal parameters. I know what the answer is: parameters are passed using the same semantics as CLU, namely what Barbara Liskov named "pass by sharing" or "pass by object sharing".
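A minimal sketch of what "pass by sharing" means in practice. The function names `mutate`, `rebind` and `identity_function` are invented for this illustration and do not come from the thread:

```python
def mutate(items):
    # The parameter is bound to the caller's object, so an in-place
    # change is visible to the caller afterwards.
    items.append("changed")


def rebind(items):
    # Rebinding the local name only changes the local binding;
    # the caller's name still refers to the original object.
    items = ["new list"]
    return items


def identity_function(arg):
    # Returns the very object it was given, not a copy.
    return arg


data = ["original"]
mutate(data)            # data is modified in place
rebound = rebind(data)  # data itself is left untouched
marker = object()
returned = identity_function(marker)
```

After this runs, `data` holds both strings, `rebound` is a different list object, and `returned is marker` holds because no copy is made on the way in or out of a call.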
Python luminary Fredrik Lundh (the Effbot) prefers the term "pass by object" or "call by object": http://effbot.org/zone/call-by-object.htm Here's the function call again: >>> >>> def f(a): return a >>> ... >>> >>> a = object() >>> >>> a is f(a) >>> True Let's follow the steps taken by the interpreter: (1) The function "f" is defined and bound to the name "f". (2) A value of type "object" is instantiated, and bound to the name "a". (3) The name "a" is looked up, returning that same object instantiated in step 2 above. Call this "the left hand operand". (4) The name "f" is looked up, returning the function object defined in step 1. Call this "function f". (5) The name "a" is looked up again, returning the same object instantiated in step 2 above. Call this "the function argument". (6) The function f is called with the function argument as its argument. (7) The function argument is bound to the function f's formal parameter "a", creating a local variable in the function f's namespace. [Note: this example would have been less confusing if the parameter "a" was given a name different from the outer variable "a". The two "a"'s are different variables.] (8) The function f looks up the local variable "a" (which is the formal parameter "a", defined in the parameter list), which is currently bound to the object instantiated in step 2 above, and returns that object. Call this "the right hand operand". (9) Finally the `is` operator compares the left hand operand to the right hand operand, sees that they are the same object, and returns True. Unfortunately I don't think this is officially documented as such, or at least I haven't found it, so the "documentation" is the reference implementation, namely the CPython interpreter. Documentation patches are welcome. >> and the second by the definition of the `is` operator. > > [Citation needed] Now you're taking the piss. Do I have to look everything up for you?
https://docs.python.org/3/reference/expressions.html#is "x is y is true if and only if x and y are the same object." >> I'm genuinely unsure what part of this you think needs a precise >> formal definition. >> >> (That's even putting aside that it may not be possible to give a >> precise formal definition of "identity". See, for example, "The Axe of >> my Grandfather" paradox.) > > There are many ways to define identity: > > 1. Map Python's data model to that of another programming language, for > example C. This technique is very common and useful. No wonder the > word "address" keeps popping up. That doesn't define identity. It also has the flaw that Python the language is independent of any specific implementation written in a particular language. For instance, in CPython, objects have a consistent and stable memory location during their lifetime, but that is not the case for Jython, IronPython or PyPy. > 2. List a number of formal requirements: any implementation that > complies with the requirements is a valid implementation of the > language. You forgot the most practical way: 3. Follow the example of the reference implementation. -- Steve "Cheer up," they said, "things could be worse." So I cheered up, and sure enough, things got worse. From tjreedy at udel.edu Thu Jul 6 14:16:57 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 6 Jul 2017 14:16:57 -0400 Subject: About the implementation of del in Python 3 In-Reply-To: <87zichzi6i.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> Message-ID: On 7/6/2017 11:41 AM, Marko Rauhamaa wrote: > Chris Angelico : > >> The formal definition is that objects have identities, and that >> assignment (including function parameters and return values) gives you >> a reference to the same object.
> > My example didn't contain a single assignment, but a variation of your > statement would make a good part in a definition of identity. > >> "A person just walked into the revolving door and came back out >> again." "Is it the same person?" "I don't know. What's the definition >> of identity?" >> >> Of course it's the same person. You don't need to identify that person >> by a social security number in order to say "the SAME PERSON came back >> out". You identify him/her by... identity. > > Here's how identity is dealt with in First-Order Logic: > > > > In other words, identity is mapped to the "sameness" in a domain of > discourse. > > In Second-Order Logic, you can define identity directly: > > ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y)) > > > Programming languages are different beasts, of course, but "objects" and > "identity" are such important foundational topics that you'd expect a > bit more than hand-waving when defining the data model. > > As a good example of the style I'm looking for, take a look at: > > > > > Marko > -- Terry Jan Reedy From tjreedy at udel.edu Thu Jul 6 14:24:17 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 6 Jul 2017 14:24:17 -0400 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> Message-ID: Sorry, finger twitch. Wish there were a minute grace period to recall such mistakes.
On 7/6/2017 2:16 PM, Terry Reedy wrote: -- Terry Jan Reedy From nathan.ernst at gmail.com Thu Jul 6 14:56:12 2017 From: nathan.ernst at gmail.com (Nathan Ernst) Date: Thu, 6 Jul 2017 13:56:12 -0500 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> Message-ID: In Python, "==" is not a reference equality operator (and I hate Java for their misuse of the operator), so I absolutely disagree with using the Java description to describe Python's "==" operator, primarily because, well, it's wrong. Simple example: With Python 3.5.2 (should hold for any version 2.4 or greater): >>> a = 1 >>> b = 1 >>> a == b True >>> a is b True >>> c = 1000 >>> d = 1000 >>> c == d True >>> c is d False The "==" operator is testing for equality or equivalence. The "is" operator is testing for identity: are these 2 things/names/references the same *instance*? CPython caches small integers (the range [-5, 256]; this is an implementation detail, not a language guarantee). Although integers in Python are immutable, they're not always the same *instance*. Hence, identity and equality/equivalence are separate. Identity checking in Python (or at least CPython) is faster than equivalence (can test the pointer to the underlying CPython object), but may lead to unexpected results. About the only place you see the "is" operator used in production code is in tests against None, because None is a read-only singleton. On Thu, Jul 6, 2017 at 11:28 AM, Ian Kelly wrote: > On Thu, Jul 6, 2017 at 9:41 AM, Marko Rauhamaa wrote: > > As a good example of the style I'm looking for, take a look at: > > > > > > Java reference types have basically the same concept of identity as > Python objects, so I dug around to find what definition Java uses. > This is what I came up with: > > """ > There may be many references to the same object.
Most objects have > state, stored in the fields of objects that are instances of classes > or in the variables that are the components of an array object. If two > variables contain references to the same object, the state of the > object can be modified using one variable's reference to the object, > and then the altered state can be observed through the reference in > the other variable. > """ > > Also, under the reference equality operator: > > """ > At run time, the result of == is true if the operand values are both > null or both refer to the same object or array; otherwise, the result > is false. > The result of != is false if the operand values are both null or both > refer to the same object or array; otherwise, the result is true. > """ > > If that language were used for Python, would it suffice for you? > -- > https://mail.python.org/mailman/listinfo/python-list > From marko at pacujo.net Thu Jul 6 17:10:20 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 00:10:20 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> Message-ID: <874lupp8zn.fsf@elektro.pacujo.net> Steve D'Aprano : > An address is a concrete location or place, in other words a physical > position in some space, while identity is the abstract state or > quality of being identical (sameness), in other words a state of > being. Whether id() returns one such thing or not can't be discerned by a Python program. What's more, for any compliant implementation of id(), you can interpret the returned number as an address in some address space (whether it's useful or not to interpret it that way). 
> In full generality, I doubt that there is any definition of "identity" > which can be fully satisfactory, but in the context of Python we can > be more specific: > > Nominally two objects are identical (have the same identity) if they > are one and the same object, and are not identical if they are > different objects. I believe identity can be defined much better, in numerous isomorphic ways in fact. For example, we could equate each object with a sequence number (unrelated with its id()). You can define that the "None" object is in fact the natural number 0. The "False" object is in fact the natural number 1 etc for all the primordial objects. During the execution of the program, new objects are created, which simply associates characteristics to ever higher natural numbers. That kind of mathematical treatment would be precise, but alas, hardly very helpful for beginning programmers. That's why I proposed the Puppy Data Model for Python months back. It involved puppies, leashes and pegs. >> In fact, >> >> a is a >> >> would be a *great* start for a formal definition/requirement of the "is" >> operator, although you'd have to generalize it to >> >> b is b >> c is c >> >> etc as well. > > Why do you have to generalise it? The name "a" here stands in for any > reference to any object. "Any reference to any object" is difficult to define syntactically as the reference itself might perform some dunder magic. For example, you can't say that necessarily x.y is x.y >> Unfortunately, when I try it, I get: >> >> >>> a is a >> Traceback (most recent call last): >> File "", line 1, in >> NameError: name 'a' is not defined > > Now you're just trolling. No, just trying to emphasize the difference between syntax and semantics. "a is a" is a specific, legal statement in Python. We could generalize and say (wrongly) that "X is X" where X stands for any legal Python expression, is always true. However, X might be undefined or produce side effects. Or maybe you want to say: ?X ? 
? ?Y ? ? ?(concat(X, "is", Y)) = 1 ? ?(X) ? ?(Y) where ? is the set of legal Python expressions. That would reduce the problem to defining the interpretation function ?. That's probably necessary anyway. >>> and the second by the definition of the `is` operator. >> >> [Citation needed] > > Now you're taking the piss. Do I have to look everything up for you? > > https://docs.python.org/3/reference/expressions.html#is > > "x is y is true if and only if x and y are the same object." I suppose what is meant is the above: ?X ? ? ?Y ? ? ?(concat(X, "is", Y)) = 1 ? ?(X) ? ?(Y) Marko From rosuav at gmail.com Thu Jul 6 17:46:07 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 07:46:07 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <874lupp8zn.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 7:10 AM, Marko Rauhamaa wrote: > Steve D'Aprano : > >> An address is a concrete location or place, in other words a physical >> position in some space, while identity is the abstract state or >> quality of being identical (sameness), in other words a state of >> being. > > Whether id() returns one such thing or not can't be discerned by a > Python program. What's more, for any compliant implementation of id(), > you can interpret the returned number as an address in some address > space (whether it's useful or not to interpret it that way). And I can interpret "Marko Rauhamaa" as a MIME-encoded IPv6 address. Does that mean it is one? 31:aae4::45ab:a16a:669a This is clearly your identity, and your address. 
>> In full generality, I doubt that there is any definition of "identity" >> which can be fully satisfactory, but in the context of Python we can >> be more specific: >> >> Nominally two objects are identical (have the same identity) if they >> are one and the same object, and are not identical if they are >> different objects. > > I believe identity can be defined much better, in numerous isomorphic > ways in fact. > > For example, we could equate each object with a sequence number > (unrelated with its id()). You can define that the "None" object is in > fact the natural number 0. The "False" object is in fact the natural > number 1 etc for all the primordial objects. During the execution of the > program, new objects are created, which simply associates > characteristics to ever higher natural numbers. That's pretty much how Jython assigns IDs. However, it does not assign them when objects are created, but when they're first passed to id(). >>> lists = [[], [], [], [], []] >>> lists[0] is lists[1] False >>> id(lists[0]) 2 >>> id(lists[4]) 3 >>> id(lists[2]) 4 >>> id(lists[0]) 2 >>> id(lists[1]) 5 In the Jython world, the ID of an object is an attribute of it. Objects can have identity (and be tested for identity with 'is') without having IDs. So identity is not defined in terms of ID, but ID in terms of identity. >> Why do you have to generalise it? The name "a" here stands in for any >> reference to any object. > > "Any reference to any object" is difficult to define syntactically as > the reference itself might perform some dunder magic. A simple name lookup cannot, I believe, be messed with. Nor can a literal. > For example, you can't say that necessarily > > x.y is x.y No, that's right, any more than you would expect [] is [] to be true. These are expressions that (can potentially) construct new objects. But if you can show me a situation in which "a is a" is False, I would be very surprised. 
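A minimal sketch of the x.y point, assuming a hypothetical class X whose attribute y is a property that builds a new list on every access. Neither the class nor the variable names come from the thread; they are only there to contrast a plain name lookup with an attribute lookup that runs code.

```python
class X:
    """Hypothetical class: attribute access runs code via a property."""

    @property
    def y(self):
        # A new list object is constructed on every attribute access.
        return []


x = X()

# A plain name lookup yields the same object both times.
name_is_same = x is x

# Each x.y evaluates the property again, so two distinct lists are compared.
attr_is_same = x.y is x.y

print(name_is_same, attr_is_same)  # True False
```

Both lists are alive while the comparison runs, so they cannot be the same object, which is why the second test is False even though the two expressions are spelled identically.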
> No, just trying to emphasize the difference between syntax and semantics. > > "a is a" > > is a specific, legal statement in Python. And it means "the object bound to the name 'a' is the same as the object bound to the name 'a'". If there's no such object, it's not the same as itself, in the same way that float("nan") is not equal to itself - in each case, there is no "itself" (with floats, that's a matter of value, rather than identity, but the same concept applies). > We could generalize and say (wrongly) that > > "X is X" > > where X stands for any legal Python expression, is always true. However, > X might be undefined or produce side effects. You can always overgeneralize. So? I could generalize and say that "a is a" is a specific, legal statement in APL, since it's valid and meaningful in Python. Doesn't really mean much. >> "x is y is true if and only if x and y are the same object." > > I suppose what is meant is the above: > > ?X ? ? ?Y ? ? ?(concat(X, "is", Y)) = 1 ? ?(X) ? ?(Y) .... yeah I think I prefer the version in the docs. It's pretty clear what it means. I've no idea what you're saying here with all that concatenation and stuff, but the truth is way simpler. If the two sides are the same object, 'is' returns true. Seriously, what's not clear? ChrisA From marko at pacujo.net Thu Jul 6 18:56:03 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 01:56:03 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> Message-ID: <87wp7lnpj0.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 7, 2017 at 7:10 AM, Marko Rauhamaa wrote: >> Whether id() returns one such thing or not can't be discerned by a >> Python program. 
What's more, for any compliant implementation of id(), >> you can interpret the returned number as an address in some address >> space (whether it's useful or not to interpret it that way). > > And I can interpret "Marko Rauhamaa" as a MIME-encoded IPv6 address. > Does that mean it is one? > > 31:aae4::45ab:a16a:669a > > This is clearly your identity, and your address. > > [...] > > I've no idea what you're saying here with all that concatenation and > stuff, but the truth is way simpler. If the two sides are the same > object, 'is' returns true. Seriously, what's not clear? Google finds a Dutch master's thesis from 2009 that gives formal semantics to a subset of Python. I was interested in seeing how it treated identity. Lo and behold: The is operator determines whether its operands are the same object. This is achieved by comparing the addresses of the operands. ? (13.7) [p. 61/91] (just sayin') Marko From greg.ewing at canterbury.ac.nz Thu Jul 6 20:34:27 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Fri, 07 Jul 2017 12:34:27 +1200 Subject: About the implementation of del in Python 3 In-Reply-To: <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> Message-ID: Steve D'Aprano wrote: > Address refers to a position in space. Not always. A PO Box number can remain the same when its owner's location in space changes. And IP addresses notoriously fail to identify physical locations. 
-- Greg From rosuav at gmail.com Thu Jul 6 20:59:09 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 10:59:09 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Fri, Jul 7, 2017 at 10:34 AM, Gregory Ewing wrote: > Steve D'Aprano wrote: >> >> Address refers to a position in space. > > > Not always. A PO Box number can remain the same when its owner's > location in space changes. And IP addresses notoriously fail to > identify physical locations. A position in some form of space. An IP address gives a position within the internet; you can tease it apart to figure out which RIR, which country, which ISP, and which computer. A street address gives a position on the planet; you can tease it apart to find which country, which city, which street, and which mailbox. A PO Box is actually the same thing as a street address, but instead of telling you where its owner can be found, it tells you where *the box* can be found. A memory address tells you where some byte/word can be found in some memory space - maybe the process's virtual memory, maybe the OS's virtual memory, maybe the computer's physical memory. A web address tells you where a document can be found in the cobwebs of the internet. Not all of them are directly connected to physical locations, but all of them can be used to locate something. ChrisA From flebber.crue at gmail.com Thu Jul 6 22:29:00 2017 From: flebber.crue at gmail.com (Sayth Renshaw) Date: Thu, 6 Jul 2017 19:29:00 -0700 (PDT) Subject: Test 0 and false since false is 0 Message-ID: I was trying to solve a problem and cannot determine how to filter 0's but not false. 
Given a list like this ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9] I want to be able to return this list ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0] However if I filter like this def move_zeros(array): l1 = [v for v in array if v != 0] l2 = [v for v in array if v == 0] return l1 + l2 I get this ['a', 'b', None, 'c', 'd', 1, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, False, 0, 0, 0, 0, 0, 0, 0] I have tried or conditions of v == False etc but then the 0's being false also aren't moved. How can you check this at once? Cheers Sayth From rantingrickjohnson at gmail.com Thu Jul 6 22:33:40 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Thu, 6 Jul 2017 19:33:40 -0700 (PDT) Subject: get value from list using widget In-Reply-To: References: <352a90939e253dd9900b1cb6f85ea797@cptec.inpe.br> Message-ID: <1c4689ae-7a17-40bf-84be-fb504956c41d@googlegroups.com> On Wednesday, July 5, 2017 at 4:15:34 PM UTC-5, Terry Reedy wrote: > On 7/5/2017 12:34 PM, jorge.conrado at cptec.inpe.br wrote: > > > I would like know dow can I select and get the value from > > a list of values uisng widgets. > > One way is to learn tkinter and then learn to use the > Listbox widget. The doc references a couple of decent > tutorial web sites. Stackoverflow has many good tkinter > examples (in the answers, not the questions ;-). Unfortunately for the occasional lurker of StackOverflow, the "top voted" answer is not always the most informative, and, in some rare cases, may even be outright bad advice. My solution is to read the entire thread, and then, utilizing my advanced analytical skills, decide which is the best. 
From dan at tombstonezero.net Thu Jul 6 22:46:24 2017 From: dan at tombstonezero.net (Dan Sommers) Date: Fri, 7 Jul 2017 02:46:24 -0000 (UTC) Subject: Test 0 and false since false is 0 References: Message-ID: On Thu, 06 Jul 2017 19:29:00 -0700, Sayth Renshaw wrote: > I have tried or conditions of v == False etc but then the 0's being > false also aren't moved. How can you check this at once? Maybe this will help: Python 3.5.3+ (default, Jun 7 2017, 23:23:48) [GCC 6.3.0 20170516] on linux Type "help", "copyright", "credits" or "license" for more information. >>> False == 0 True >>> False is 0 False From rantingrickjohnson at gmail.com Thu Jul 6 22:46:26 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Thu, 6 Jul 2017 19:46:26 -0700 (PDT) Subject: Test 0 and false since false is 0 In-Reply-To: References: Message-ID: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> On Thursday, July 6, 2017 at 9:29:29 PM UTC-5, Sayth Renshaw wrote: > I was trying to solve a problem and cannot determine how to filter 0's but not false. > > Given a list like this > ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9] > > I want to be able to return this list > ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0] > > However if I filter like this > > def move_zeros(array): > l1 = [v for v in array if v != 0] > l2 = [v for v in array if v == 0] > return l1 + l2 > > I get this > ['a', 'b', None, 'c', 'd', 1, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, False, 0, 0, 0, 0, 0, 0, 0] > > I have tried or conditions of v == False etc but then the 0's being false also aren't moved. How can you check this at once? Yep. This is a common pitfall for noobs, as no logic can explain to them why integer 0 should bool False, and integer 1 should bool True. But what's really going to cook your noodle is when you find out that any integer greater than 1 bools True. Go figure! They'll say it's for consistency sake. But i say it's just a foolish consistency. 
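One way to sketch the filtering asked about in the quoted question, assuming the goal is to treat only genuine numeric zeros as zeros and to leave False where it is. The helper name is_zero is made up for this illustration. It relies on bool being a subclass of int, which is why False == 0 is true even though type(False) is bool.

```python
def is_zero(value):
    """Hypothetical helper: True for a numeric 0 but not for False."""
    # bool is a subclass of int, so False == 0; exclude bools by type first.
    return type(value) is not bool and value == 0


def move_zeros(array):
    """Return a new list with the zeros moved to the end, order preserved."""
    non_zeros = [v for v in array if not is_zero(v)]
    zeros = [v for v in array if is_zero(v)]
    return non_zeros + zeros


data = ["a", 0, 0, "b", None, "c", "d", 0, 1, False, 0, 1, 0, 3, [], 0, 1, 9,
        0, 0, {}, 0, 0, 9]
result = move_zeros(data)
print(result)
```

Under this definition 0.0 would also count as a zero. An identity test against the literal 0 can appear to work on CPython because small integers are cached there, but that is an implementation detail rather than a language guarantee, so the type check above is the more portable choice.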
You need to learn the subtle difference between `==` and `is`. ## PYTHON 2.x >>> 1 == True True >>> 1 is True False >>> 0 == False True >>> 0 is False False From zheng.228 at hotmail.com Thu Jul 6 22:48:02 2017 From: zheng.228 at hotmail.com (zhenghao li) Date: Fri, 7 Jul 2017 02:48:02 +0000 Subject: Test 0 and false since false is 0 In-Reply-To: References: Message-ID: you can use the "is" for identity test. l1 = [v for v in array if not v is 0] l2 = [v for v in array if v is 0] On Jul 6, 2017, at 10:31 PM, Sayth Renshaw > wrote: I was trying to solve a problem and cannot determine how to filter 0's but not false. Given a list like this ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9] I want to be able to return this list ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0] However if I filter like this def move_zeros(array): l1 = [v for v in array if v != 0] l2 = [v for v in array if v == 0] return l1 + l2 I get this ['a', 'b', None, 'c', 'd', 1, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, False, 0, 0, 0, 0, 0, 0, 0] I have tried or conditions of v == False etc but then the 0's being false also aren't moved. How can you check this at once? Cheers Sayth -- https://mail.python.org/mailman/listinfo/python-list From dan at tombstonezero.net Thu Jul 6 22:52:02 2017 From: dan at tombstonezero.net (Dan Sommers) Date: Fri, 7 Jul 2017 02:52:02 -0000 (UTC) Subject: Test 0 and false since false is 0 References: Message-ID: On Fri, 07 Jul 2017 02:48:45 +0000, Stefan Ram wrote: >>>> def isfalse( x ): > ... return x == 0 and str( type( x )) == "" > ... > Don't depend on string representations of objects, unless you know what you're doing. Do this instead: def isfalse(x): return x == 0 and type(x) is bool And why test against 0 in a function called isfalse? 
def isfalse(x): return x == False and type(x) is type(False) Dan From skip.montanaro at gmail.com Thu Jul 6 22:57:20 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Thu, 6 Jul 2017 21:57:20 -0500 Subject: Test 0 and false since false is 0 In-Reply-To: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> References: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> Message-ID: I was trying to solve a problem and cannot determine how to filter 0's but not false. I'm typing on my phone so can't paste a session, so I will attempt to apply the Socratic method, and ask: Do you understand why your attempts have failed so far? In what way are False and 0 the same? In what respects do they differ? Hint: think class relationships. Not trying to be difficult. Just thought I'd change things up a bit. If you come away with a bit deeper appreciation of Python's type system, you'll be a bit better off down the road. Skip From flebber.crue at gmail.com Thu Jul 6 23:00:17 2017 From: flebber.crue at gmail.com (Sayth Renshaw) Date: Thu, 6 Jul 2017 20:00:17 -0700 (PDT) Subject: Test 0 and false since false is 0 In-Reply-To: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> References: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> Message-ID: <651673b8-b6e9-4a30-a224-78f7d4ca9d43@googlegroups.com> On Friday, 7 July 2017 12:46:51 UTC+10, Rick Johnson wrote: > On Thursday, July 6, 2017 at 9:29:29 PM UTC-5, Sayth Renshaw wrote: > > I was trying to solve a problem and cannot determine how to filter 0's but not false. 
> > > > Given a list like this > > ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9] > > > > I want to be able to return this list > > ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0] > > > > However if I filter like this > > > > def move_zeros(array): > > l1 = [v for v in array if v != 0] > > l2 = [v for v in array if v == 0] > > return l1 + l2 > > > > I get this > > ['a', 'b', None, 'c', 'd', 1, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, False, 0, 0, 0, 0, 0, 0, 0] > > > > I have tried or conditions of v == False etc but then the 0's being false also aren't moved. How can you check this at once? > > Yep. This is a common pitfall for noobs, as no logic can > explain to them why integer 0 should bool False, and integer > 1 should bool True. But what's really going to cook your > noodle is when you find out that any integer greater than 1 > bools True. Go figure! They'll say it's for consistency > sake. But i say it's just a foolish consistency. > > You need to learn the subtle difference between `==` and > `is`. > > ## PYTHON 2.x > >>> 1 == True > True > >>> 1 is True > False > >>> 0 == False > True > >>> 0 is False > False Is there an "is not" method that's not != so I can check is not false. def move_zeros(array): l1 = [v for v in array if v is False or v != 0] l2 = [v for v in array if v is not False or v == 0] return l1 + l2 Cheers Sayth From rantingrickjohnson at gmail.com Fri Jul 7 00:25:11 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Thu, 6 Jul 2017 21:25:11 -0700 (PDT) Subject: Test 0 and false since false is 0 In-Reply-To: <651673b8-b6e9-4a30-a224-78f7d4ca9d43@googlegroups.com> References: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> <651673b8-b6e9-4a30-a224-78f7d4ca9d43@googlegroups.com> Message-ID: <8b4dd46e-a602-4bac-8280-29da8b3d59da@googlegroups.com> On Thursday, July 6, 2017 at 10:00:36 PM UTC-5, Sayth Renshaw wrote: > Is there an "is not" method that's not != so I can check is not false. Maybe. 
Or maybe /not/. :-P" One way to find out would be to fire up your python interpretor, and do some interactive testing. Here, allow me to cinge my eyebrows: ## Python 2.x >>> 1 is 1 True >>> 1 is not 1 False >>> 1 is not 2 True I love the smell of roasted Python in the morning. From pderocco at ix.netcom.com Fri Jul 7 00:37:09 2017 From: pderocco at ix.netcom.com (Paul D. DeRocco) Date: Thu, 6 Jul 2017 21:37:09 -0700 Subject: Test 0 and false since false is 0 In-Reply-To: References: Message-ID: <33ECBE85BD974A3FAD2E53425B131509@PAULD> > From: Dan Sommers > > > On Thu, 06 Jul 2017 19:29:00 -0700, Sayth Renshaw wrote: > > > > I have tried or conditions of v == False etc but then the 0's being > > false also aren't moved. How can you check this at once? > > Maybe this will help: > > Python 3.5.3+ (default, Jun 7 2017, 23:23:48) > [GCC 6.3.0 20170516] on linux > Type "help", "copyright", "credits" or "license" for more > information. > >>> False == 0 > True > >>> False is 0 > False Funny how the subject line inadvertently prefigures the answer: False *isn't* 0. False *equals* 0. So just change "==" to "is" and "!=" to "is not" and it should work. Also, it can be done in a single expression, with no local variables. -- Ciao, Paul D. DeRocco Paul mailto:pderocco at ix.netcom.com From rantingrickjohnson at gmail.com Fri Jul 7 01:08:04 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Thu, 6 Jul 2017 22:08:04 -0700 (PDT) Subject: Test 0 and false since false is 0 In-Reply-To: References: <1c915f4e-6166-408c-b775-e0e44bca6da8@googlegroups.com> Message-ID: <0966ba1a-0a60-4642-b947-590e665d1276@googlegroups.com> On Thursday, July 6, 2017 at 9:57:43 PM UTC-5, Skip Montanaro wrote: > I was trying to solve a problem and cannot determine how to filter 0's but > not false. > > > I'm typing on my phone so can't paste a session [...] I have not tried any for myself, but there are a few Python installations available for the mobile platforms these days. 
So you may want to check that out. As for me, i've totally dumped all mobile platforms. I can remember when microsoft introduced their phones with a full installation of windows 10, and all the critics were like, "who needs a full version of windows 10 on a phone? [insert uninformed laugh here]". But then that lightbulb goes off in your head, and you realize that for all their "supposed freedom", the mobile platforms are basically prisons. Sure, they may offer a quite extensive selection of apps, but who needs another version of angry birds lifting credit card numbers, or a kitchen timer with a slighty different interface? Sheesh! And have you ever wondered why something as simple as a damned calculator needs access to your credentials, your address book, your internet controls, your email accounts, etc, etc -- why don't you just bend over a barrel and give them root access for cryin' out loud! For that reason, and many more, i have completely dumped all mobile platforms. No Apple. No stupid Andriod. Nothing. And you know what... my life is so much better now. From steve+python at pearwood.info Fri Jul 7 01:43:19 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 07 Jul 2017 15:43:19 +1000 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> Message-ID: <595f1f79$0$1609$c3e8da3$5496439d@news.astraweb.com> On Fri, 7 Jul 2017 07:46 am, Chris Angelico wrote: > A simple name lookup cannot, I believe, be messed with. Nor can a literal. In principle, you could replace builtins or globals with a custom namespace that performed some computation on name lookup. You might even be able to insert some additional namespaces between locals and globals. 
Using eval or exec, you can pass an arbitrary mapping as globals and locals.

But "ordinary" name lookup in locals of a function, or the standard module
globals, cannot do anything funny.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From steve+python at pearwood.info Fri Jul 7 01:47:18 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 15:47:18 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr>
 <87r2xt2a06.fsf@elektro.pacujo.net>
 <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com>
 <874lup1th8.fsf@elektro.pacujo.net>
 <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com>
 <874lupp8zn.fsf@elektro.pacujo.net>
 <87wp7lnpj0.fsf@elektro.pacujo.net>
Message-ID: <595f2068$0$1602$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 08:56 am, Marko Rauhamaa wrote:

> Google finds a Dutch master's thesis from 2009 that gives formal
> semantics to a subset of Python. I was interested in seeing how it
> treated identity. Lo and behold:
>
>     The is operator determines whether its operands are the same object.
>     This is achieved by comparing the addresses of the operands.

Well, that's a fail, because the author fails to distinguish between an
implementation detail of one specific Python interpreter and a language
feature.

But you know that, because you've been around to hear us say so probably a
dozen or fifty times. So I wonder what on earth you think this proves except
that the thesis author *got it wrong*.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
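The remark earlier in the thread about handing a custom mapping to exec can be sketched as follows. This is an illustration under assumptions, not something posted in the thread: the FreshNamespace class and the name result are invented here, and the behaviour shown is what CPython does when the locals argument is a dict subclass that overrides __getitem__. The documentation requires globals to be a dictionary, while locals may be any mapping.

```python
class FreshNamespace(dict):
    """Hypothetical namespace: looking up the name 'a' builds a new object."""

    def __getitem__(self, name):
        if name == "a":
            # Computation on name lookup: every read of 'a' gets a new object.
            return object()
        return super().__getitem__(name)


namespace = FreshNamespace()

# Module-level code run by exec looks names up in the locals mapping first.
exec("result = a is a", {}, namespace)

print(namespace["result"])  # False
```

With such a namespace "a is a" evaluates to False, because each mention of the name triggers a separate lookup. In an ordinary function or module namespace no such hook exists, which matches the point that ordinary name lookup cannot do anything funny.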
From steve+python at pearwood.info Fri Jul 7 02:01:41 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 16:01:41 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr>
 <87r2xt2a06.fsf@elektro.pacujo.net>
 <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <595f23c8$0$1617$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 10:34 am, Gregory Ewing wrote:

> Steve D'Aprano wrote:
>> Address refers to a position in space.
>
> Not always. A PO Box number can remain the same when its owner's
> location in space changes.

But the PO box itself does not change.

Pedantically, I guess it could: the post office might renumber the boxes, or
the post office itself could move. Likewise, the city council might decide to
change the name of my street from Foo Street to Bar Drive.

But putting aside unusual circumstances like those, which add complexity but
no insight, the whole point of having a PO Box is that it doesn't change
address even as its owner wanders from place to place.

> And IP addresses notoriously fail to
> identify physical locations.

That's true. But IP addresses are addresses in a virtual space, not physical
space.

Bringing this back to Python, it is notable that Python's execution model does
not require objects to have a single physical location in memory. Which is
good, since that allows people to write Python interpreters in languages like
Java and the .Net CLR where objects don't have a single physical location in
memory.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
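A small sketch of how identity and id() relate at the language level, without any reference to memory addresses. The variable names are arbitrary; the only guarantee relied on is the documented one that id() is unique and constant for an object during its lifetime.

```python
first = []
alias = first      # a second name bound to the same object
other = []         # an equal but distinct object

# Identity is about being one and the same object, not about equal values.
print(first == other, first is other)   # True False
print(first is alias)                   # True

# For objects that are alive at the same time, comparing id() values
# gives the same answer as the is operator.
ids_agree = (id(first) == id(alias)) and (id(first) != id(other))
print(ids_agree)                        # True
```

Whether the number returned by id() happens to be a memory address, as in CPython, or a counter handed out on first use, as described for Jython earlier in the thread, does not change these results.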
From palashkhaire92 at gmail.com Fri Jul 7 02:18:50 2017
From: palashkhaire92 at gmail.com (palashkhaire92 at gmail.com)
Date: Thu, 6 Jul 2017 23:18:50 -0700 (PDT)
Subject: how to get partition information of a hard disk with python
In-Reply-To:
References:
Message-ID: <1f26708f-f921-4377-9cfc-907f42c13f0a@googlegroups.com>

On Wednesday, September 22, 2010 at 4:01:04 AM UTC+5:30, Hellmut Weber wrote:
> Hi list,
> I'm looking for a possibility to access the partition information of a
> hard disk of my computer from within a python program.
>
> Googling I found the module 'parted' but didn't see any possibility to
> get the desired information.
> Is there any reasonable documentation for the parted module?
>
> Any idea is appreciated ;-)
>
> TIA
>
> Hellmut
>
> --
> Dr. Hellmut Weber         mail at hellmutweber.de
> Degenfeldstraße 2         tel   +49-89-3081172
> D-80803 München-Schwabing mobil +49-172-8450321
> please: No DOCs, no PPTs. why: tinyurl.com/cbgq

import os
os.system("fdisk -l")  # you will get information about your hdd, partition

From bxstover at yahoo.co.uk Fri Jul 7 02:30:31 2017
From: bxstover at yahoo.co.uk (Ben S.)
Date: Thu, 6 Jul 2017 23:30:31 -0700 (PDT)
Subject: Check Python version from inside script? Run Pythons script in v2
 compatibility mode?
Message-ID: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com>

Can I somehow check from inside a Python script if the executing Python engine
is major version v2 or v3?

I am thinking about a code similar to

if (os.python-majorversion<3)
   print hello
else
   print (hello)

Additional question:
Is there a way to execute a python script with v3 python engine in v2
compatibility mode?
I am thinking about a command parameter like (python.exe is v3.*): python.exe -execute_as_v2 myscript.py From marko at pacujo.net Fri Jul 7 02:35:20 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 09:35:20 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <87wp7lnpj0.fsf@elektro.pacujo.net> <595f2068$0$1602$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87shi8oitz.fsf@elektro.pacujo.net> Steve D'Aprano : > On Fri, 7 Jul 2017 08:56 am, Marko Rauhamaa wrote: > >> Google finds a Dutch master's thesis from 2009 that gives formal >> semantics to a subset of Python. I was interested in seeing how it >> treated identity. Lo and behold: >> >> The is operator determines whether its operands are the same object. >> This is achieved by comparing the addresses of the operands. > > Well, that's a fail, because the author fails to distinguish between > an implementation detail of one specific Python interpreter with a > language feature. It is not a fail. The author defines Python's semantics by mapping ("compiling") its syntax to an idealized mathematical computer. IOW, you need to "implement" Python somehow to define its semantics. The author chose to use the word address. He defined "address" as a natural number: The heap is a mapping of addresses to values. Addresses are represented by natural numbers. Values include integers, strings, functions and objects. [p. 15/91] Feel free to define alternative semantics for Python. You don't even have to use the word "address". However, your semantics will be equivalent in that they generate the same observed behavior as this master's thesis. BTW, this appendix to a Scheme standard gives semiformal semantics to Scheme: . 
It uses concepts such as "store", "location" and "pointer".

> But you know that, because you've been around to hear us say so
> probably a dozen or fifty times. So I wonder what on earth you think
> this proves except that the thesis author *got it wrong*.

The author would have gotten it wrong only if his semantics misbehaved.


Marko

From steve+python at pearwood.info Fri Jul 7 02:42:03 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 16:42:03 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr>
 <87r2xt2a06.fsf@elektro.pacujo.net>
 <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com>
 <874lup1th8.fsf@elektro.pacujo.net>
 <87k23lpkc1.fsf@elektro.pacujo.net>
Message-ID: <595f2d3e$0$1614$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 03:05 am, Marko Rauhamaa wrote:

> I believe the concept of an object is among the more difficult things
> for novice programmers to get.

True, but that has nothing to do with object identity.

Inheritance, "is-a" versus "has-a" relationships, when to write len(x) versus
x.len(), those are more troublesome than identity.

In fact, I would expect that object identity ("sameness") is probably the
least difficult thing for novices to understand. People have an intuitive[1]
understanding of identity based on the properties and behaviour of physical
objects.


[1] It may actually be instinctive -- there are studies that show that even
young babies express surprise when they see something that violates the
intuitive properties of identity. For example, if you pass a toy in front of
the baby, then behind a screen, and swap it for a different toy before showing
it again, babies tend to express surprise.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
From steve+python at pearwood.info Fri Jul 7 02:43:07 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 16:43:07 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr>
 <87r2xt2a06.fsf@elektro.pacujo.net>
 <87zichzi6i.fsf@elektro.pacujo.net>
Message-ID: <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 01:41 am, Marko Rauhamaa wrote:

> Here's how identity is dealt with in First-Order Logic:
>
>
>
> In other words, identity is mapped to the "sameness" in a domain of
> discourse.

Define "sameness".


> In Second-Order Logic, you can define identity directly:
>
>     ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y))

Translating to English:

    For all x, for all y, x equals y if and only if for all P
    (P(x) if and only if P(y))

That might be sufficient for second-order logic, but it won't do for
programming. Defining if-and-only-if for functions that can return more than
two values (true and false) requires having a definition of equality, which
would make the definition circular.

And what if there exists even a single non-deterministic function, or one that
has hidden variables that may change state, or a constant function that always
returns the same value? Such things are forbidden in second order logic but
they exist in many programming languages.


> Programming languages are different beasts, of course, but "objects" and
> "identity" are such important foundational topics that you'd expect a
> bit more than hand-waving when defining the data model.

I wouldn't.

I don't think identity is even capable of vigorous definition outside of pure
mathematics, and even if it is, I don't see how it would make anyone a better
programmer to have such a definition.

In practice, identity is hardly important in Python, except for a few limited
cases, and the prohibition against using `is` when you mean `==`.
> As a good example of the style I'm looking for, take a look at:
>
>

Neither the word "identity" nor "identical" exists on that page, so I don't
see how that solves your problem.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From steve+python at pearwood.info Fri Jul 7 02:49:40 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 16:49:40 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr>
 <87r2xt2a06.fsf@elektro.pacujo.net>
 <87zichzi6i.fsf@elektro.pacujo.net>
 <87fue9pis6.fsf@elektro.pacujo.net>
Message-ID: <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 03:38 am, Marko Rauhamaa wrote:

> Notice that Scheme refers directly to conventional RAM:
>
>     Variables and objects such as pairs, vectors, and strings implicitly
>     denote locations

That implies that it is impossible to implement Scheme:

- using a programming language where variables and objects may move during
  their lifetime;

- or using a computing device without conventional memory, e.g. implementing
  Scheme using hydraulics, DNA computing, clockwork, or emulated in the human
  brain.

I think that's pretty crap. It might be justifiable to define a language like
C to a specific hardware implementation, but higher-level languages like
Scheme should be more abstract.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From tjreedy at udel.edu Fri Jul 7 02:59:39 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 7 Jul 2017 02:59:39 -0400
Subject: Check Python version from inside script? Run Pythons script in v2
 compatibility mode?
In-Reply-To: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com>
References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com>
Message-ID:

On 7/7/2017 2:30 AM, Ben S.
via Python-list wrote:
> Can I somehow check from inside a Python script if the executing Python engine is major version v2 or v3?
>
> I am thinking about a code similar to
>
> if (os.python-majorversion<3)
>    print hello
> else
>    print (hello)

For this, just use print('hello'). Check out 'from __future__ import print_function' (check the __future__ module for the spelling).

Also check out sys.version, sys.hexversion, and the platform module. Some imports will tell you, too:

try:
    import tkinter as tk
    pyversion = 3
except ImportError:
    import Tkinter as tk
    pyversion = 2

>
> Additional question:
> Is there a way to execute a python script with v3 python engine in v2 compatibility mode?
> I am thinking about a command parameter like (python.exe is v3.*):
>
> python.exe -execute_as_v2 myscript.py

No.

--
Terry Jan Reedy

From marko at pacujo.net Fri Jul 7 03:01:51 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Fri, 07 Jul 2017 10:01:51 +0300
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <87d19cohls.fsf@elektro.pacujo.net>

Steve D'Aprano :

> On Fri, 7 Jul 2017 03:38 am, Marko Rauhamaa wrote:
>
>> Notice that Scheme refers directly to conventional RAM:
>>
>>    Variables and objects such as pairs, vectors, and strings implicitly
>>    denote locations
>
>
> That implies that it is impossible to implement Scheme:
>
> - using a programming language where variables and objects may move
>   during their lifetime;
>
> - or using a computing device without conventional memory, e.g.
>   implementing Scheme using hydraulics, DNA computing, clockwork, or
>   emulated in the human brain.
>
> I think that's pretty crap. It might be justifiable to define a
> language like C to a specific hardware implementation, but
> higher-level languages like Scheme should be more abstract.
You are misunderstanding. Your implementation doesn't have to match the abstract machine as long as it produces the same behavior. They could be defining it using a Turing machine, but that doesn't mean a Scheme runtime would necessarily need a mile-long paper tape.


Marko

From __peter__ at web.de Fri Jul 7 03:04:59 2017
From: __peter__ at web.de (Peter Otten)
Date: Fri, 07 Jul 2017 09:04:59 +0200
Subject: Test 0 and false since false is 0
References:
Message-ID:

Sayth Renshaw wrote:

> I was trying to solve a problem and cannot determine how to filter 0's but
> not false.
>
> Given a list like this
> ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9]
>
> I want to be able to return this list
> ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0]
>
> However if I filter like this
>
> def move_zeros(array):
>     l1 = [v for v in array if v != 0]
>     l2 = [v for v in array if v == 0]
>     return l1 + l2
>
> I get this
> ['a', 'b', None, 'c', 'd', 1, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, False, 0, 0,
> 0, 0, 0, 0, 0]
>
> I have tried or conditions of v == False etc but then the 0's being false
> also aren't moved. How can you check this at once?
>
>
> Cheers
>
> Sayth

Another option is to test for type(value) == int:

>>> before = ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9]
>>> wanted = ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0]
>>> after = sorted(before, key=lambda x: x == 0 and type(x) == int)
>>> assert str(after) == str(wanted)
>>> after
['a', 'b', None, 'c', 'd', 1, False, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

That way float values will be left alone, too:

>>> sorted([0.0, 0, False, [], "x"], key=lambda x: x == 0 and type(x) == int)
[0.0, False, [], 'x', 0]

From steve+python at pearwood.info Fri Jul 7 03:12:00 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 17:12:00 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net>
Message-ID: <595f3441$0$1597$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 07:10 am, Marko Rauhamaa wrote:

> Steve D'Aprano :
>
>> An address is a concrete location or place, in other words a physical
>> position in some space, while identity is the abstract state or
>> quality of being identical (sameness), in other words a state of
>> being.
>
> Whether id() returns one such thing or not can't be discerned by a
> Python program.

Since the id() function isn't documented as returning an address, I'm not sure why you think that it is significant that Python programs can't discern such a thing.

> What's more, for any compliant implementation of id(),
> you can interpret the returned number as an address in some address
> space (whether it's useful or not to interpret it that way).
You could equally interpret it as:

- the IQ of the person reading the output;
- the number of legs on the average insect;
- the number of times per day the average teenager says "I don't even";
- the number of days in a month;
- the amount of memory in terabytes required to store that object;
- the number of atoms of hydrogen in Moscow; or
- cost in US dollars in lost productivity caused by this discussion

whether it's useful or not to interpret it that way.

You are free to interpret it as anything you like; whether or not it is useful is up to you.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

From steve+python at pearwood.info Fri Jul 7 03:17:04 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 17:17:04 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net>
Message-ID: <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 07:10 am, Marko Rauhamaa wrote:

> I believe identity can be defined much better, in numerous isomorphic
> ways in fact.
>
> For example, we could equate each object with a sequence number
> (unrelated with its id()). You can define that the "None" object is in
> fact the natural number 0. The "False" object is in fact the natural
> number 1 etc for all the primordial objects. During the execution of the
> program, new objects are created, which simply associates
> characteristics to ever higher natural numbers.

Hmmm... interesting. You might just be on the right track here. That might even work for "identity" as required by Python.
Of course you can't say "equate each object with its sequence number" since that implies that:

    assert None == 0
    assert False == 1
    assert True == 2

should all succeed. Rather we would say that we associate each object with its sequence number. Then we say two objects are the same object if they have the same sequence number.

Furthermore, we could relax the condition that the sequence number be assigned on object creation. Eventually the sequence number will be pretty big, and constructing that number would be time consuming. Since most objects never get compared for identity, why bother pre-allocating the sequence number? It can be allocated if and when needed.

Replace the term "sequence number" with "ID number" and you have IronPython and Jython.

Since objects can be destroyed as well as created, when an object is destroyed, we can push its sequence number into a pool of free sequence numbers. This reuse could help prevent sequence numbers from growing arbitrarily large. Then the next object which is created can reuse one of the sequence numbers from the free pool, instead of a new one.

We can go further: rather than explicitly store the sequence number, it would be nice if we could find something in the environment that already exists, that can be guaranteed to be unique, and in at least some programming languages, is guaranteed to be stable for the life time of the object (until it is destroyed).

If we allocated objects in a giant array, its position in the array could be treated as its sequence number. That way we don't have to explicitly record free sequence numbers: when we allocate an object in the array, provided it goes into a free slot, it automatically reuses the appropriate sequence number.

In fact, for programming environments that use a memory heap model, we don't even need the giant array...
we can use the heap itself as conceptually the giant array, and the memory address as the sequence number, so long as we give up on the requirement that we start sequence numbers at 0.

Of course this only works for implementations where objects can't move around in the heap.

Substitute the term "sequence number" with "ID number" and you have CPython.

So...

We say two objects are the same object if they have the same sequence number.

Replacing the term "sequence number" with "ID number", and we say:

Two objects are the same object if they have the same ID number.

Are you satisfied now? Can we please put this debate to bed?

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

From jeremiah.dodds at gmail.com Fri Jul 7 03:22:19 2017
From: jeremiah.dodds at gmail.com (Jeremiah Dodds)
Date: Fri, 07 Jul 2017 03:22:19 -0400
Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode?
In-Reply-To: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> (Ben S. via Python-list's message of "Thu, 6 Jul 2017 23:30:31 -0700 (PDT)")
References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com>
Message-ID: <878tk0zp78.fsf@gmail.com>

"Ben S. via Python-list" writes:

> Can I somehow check from inside a Python script if the executing Python engine is major version v2 or v3?
    import sys
    sys.version_info[0]

(If you just need to print() consistently, you should follow Terry's advice)

From mail at timgolden.me.uk Fri Jul 7 03:23:43 2017
From: mail at timgolden.me.uk (Tim Golden)
Date: Fri, 7 Jul 2017 08:23:43 +0100
Subject: how to get partition information of a hard disk with python
In-Reply-To: <1f26708f-f921-4377-9cfc-907f42c13f0a@googlegroups.com>
References: <1f26708f-f921-4377-9cfc-907f42c13f0a@googlegroups.com>
Message-ID:

On 07/07/2017 07:18, palashkhaire92 at gmail.com wrote:
> On Wednesday, September 22, 2010 at 4:01:04 AM UTC+5:30, Hellmut Weber wrote:
>> Hi list,
>> I'm looking for a possibility to access the partition information of a
>> hard disk of my computer from within a python program.
>>
>> Googling I found the module 'parted' but didn't see any possibility to
>> get the desired information.
>> Is there any reasonable documentation for the parted module?
>>
>> Any idea is appreciated ;-)
>>
>
>
>
> import os
> os.system("fdisk -l")
> #you will get information about your hdd,partition
>

psutil is usually good for these sort of things:

http://pythonhosted.org/psutil/#disks

TJG

From wissme at free.fr Fri Jul 7 03:29:37 2017
From: wissme at free.fr (Dan Wissme)
Date: Fri, 7 Jul 2017 09:29:37 +0200
Subject: About the implementation of del in Python 3
In-Reply-To:
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net>
Message-ID: <595f385f$0$3627$426a34cc@news.free.fr>

On 06/07/2017 at 20:56, Nathan Ernst wrote:
> In Python, "==" is not a reference equality operator (and I hate Java for
> their misuse of the operator), so I absolutely disagree with using the Java
> description to describe Python's "==" operator, primarily because, well,
> it's wrong.
> Simple example:
>
> With Python 3.5.2 (should hold for any version 2.4 or greater):
> >>> a = 1
> >>> b = 1
> >>> a == b
> True
> >>> a is b
> True
> >>> c = 1000
> >>> d = 1000
> >>> c == d
> True
> >>> c is d
> False

Strange behavior in Python 3.6.0
>>> i = 3000
>>> j = 3000
>>> i is j
False
>>> n = 4000 ; m = 4000 ; n is m
True

dan

From robin at reportlab.com Fri Jul 7 03:41:29 2017
From: robin at reportlab.com (Robin Becker)
Date: Fri, 7 Jul 2017 08:41:29 +0100
Subject: About the implementation of del in Python 3
In-Reply-To: <595f2d3e$0$1614$c3e8da3$5496439d@news.astraweb.com>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <87k23lpkc1.fsf@elektro.pacujo.net> <595f2d3e$0$1614$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <6053d43c-0a3b-a8b9-c350-2f5cd58a9503@chamonix.reportlab.co.uk>

On 07/07/2017 07:42, Steve D'Aprano wrote:
> On Fri, 7 Jul 2017 03:05 am, Marko Rauhamaa wrote:
> ...........
>
> [1] It may actually be instinctive -- there are studies that show that even
> young babies express surprise when they see something that violates the
> intuitive properties of identity. For example, if you pass a toy in front of
> the baby, then behind a screen, and swap it for a different toy before showing
> it again, babies tend to express surprise.

presumably this is after they learn (or make the assumption) of object permanence.
-stuffed-with-assumptions-ly yrs-
Robin Becker

From rosuav at gmail.com Fri Jul 7 03:45:07 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 7 Jul 2017 17:45:07 +1000
Subject: About the implementation of del in Python 3
In-Reply-To: <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Fri, Jul 7, 2017 at 4:43 PM, Steve D'Aprano wrote:
> On Fri, 7 Jul 2017 01:41 am, Marko Rauhamaa wrote:
>> In Second-Order Logic, you can define identity directly:
>>
>>    ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y))
>
> Translating to English:
>
>     For all x, for all y, x equals y if and only if for all P
>     (P(x) if and only if P(y))
>
>
> That might be sufficient for second-order logic, but it won't do for
> programming. Defining if-and-only-if for functions that can return more than
> two values (true and false) requires having a definition of equality, which
> would make the definition circular.

It sounds to me like this has defined equality, not identity, right? In Python, you could have x and y be two different integers with the same value, and the return values of all functions would be indistinguishable. Unless you're including the 'id' function, in which case... welcome to circular reasoning again.

ChrisA

From steve+python at pearwood.info Fri Jul 7 03:53:38 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 17:53:38 +1000
Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode?
References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com>
Message-ID: <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 04:30 pm, Ben S. wrote:

> Can I somehow check from inside a Python script if the executing Python engine
> is major version v2 or v3?

Yes you can, but generally speaking you shouldn't.
import sys
if sys.version_info >= (3,):  # the comma is important
    print("version 3")
else:
    print("version 2")

But keep in mind that your code must be syntactically valid for the running version regardless of the result of the test. This will **NOT** work:

import sys
if sys.version_info >= (3,):  # the comma is important
    print("version 3")
else:
    print "version 2"  # Python 2 syntax

Earlier I said that in general you shouldn't test for the version. Normally you should test for a specific feature, not for the version number. For example, suppose I want to use the "reduce()" function. In Python 2 it is a built-in function, but in Python 3 it is moved into the functools module. Don't do this:

if sys.version_info >= (3,):
    from functools import reduce

This is better:

try:
    reduce
except NameError:
    # reduce no longer defined as a built-in
    from functools import reduce

That's now not only backwards compatible, but it is forward compatible: if Python changes in the future to bring reduce back into the built-in functions, your code will automatically keep working.

Dealing with syntax changes in hybrid version 2 + 3 code is quite tricky. It can be done, but it is painful, even for experts.

> Additional question:
> Is there a way to execute a python script with v3 python engine in v2
> compatibility mode? I am thinking about a command parameter like (python.exe
> is v3.*):
>
> python.exe -execute_as_v2 myscript.py

No. Python 3 is always Python 3, and Python 2 is always Python 2. But what you can do is install both, and then call

python2.exe myscript.py

python3.exe anotherscript.py

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
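As a small self-contained sketch of the feature-detection advice in the message above: the helper name `product` is hypothetical and only there to give `reduce` something to do. The block is meant to run unchanged on both Python 2 and Python 3.

```python
import sys

# Feature detection: look for the name itself, not the version number.
try:
    reduce  # a built-in on Python 2
except NameError:
    from functools import reduce  # lives in functools on Python 3


def product(numbers):
    """Multiply all numbers together (hypothetical helper for illustration)."""
    return reduce(lambda x, y: x * y, numbers, 1)


# The version is still available when it really is needed.
major = sys.version_info[0]
print("running on Python %d" % major)
print(product([1, 2, 3, 4]))  # 24
```

Note that the block only uses syntax that is valid in both major versions, which is the constraint the message warns about.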
From rosuav at gmail.com Fri Jul 7 03:59:48 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 7 Jul 2017 17:59:48 +1000
Subject: About the implementation of del in Python 3
In-Reply-To: <595f385f$0$3627$426a34cc@news.free.fr>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f385f$0$3627$426a34cc@news.free.fr>
Message-ID:

On Fri, Jul 7, 2017 at 5:29 PM, Dan Wissme wrote:
> Strange behavior in Python 3.6.0
> >>> i = 3000
> >>> j = 3000
> >>> i is j
> False
> >>> n = 4000 ; m = 4000 ; n is m
> True

Firstly, remember that immutables are allowed, but not required, to be shared. So this kind of "strange behaviour" is completely fine - and furthermore, can come and go at any time.

What you're seeing here is an artifact of the interactive interpreter. Each statement or block that you enter gets compiled and executed on its own. When you do the last block, the compiler looks at the whole thing, and produces this code:

>>> c = compile("n = 4000 ; m = 4000 ; n is m", "", "single")
>>> c.co_consts
(4000, None)
>>> dis.dis(c)
  1           0 LOAD_CONST               0 (4000)
              2 STORE_NAME               0 (n)
              4 LOAD_CONST               0 (4000)
              6 STORE_NAME               1 (m)
              8 LOAD_NAME                0 (n)
             10 LOAD_NAME                1 (m)
             12 COMPARE_OP               8 (is)
             14 PRINT_EXPR
             16 LOAD_CONST               1 (None)
             18 RETURN_VALUE

The print and return at the end are how the REPL works. The rest is your code. The compiler noticed that it needed to load the constant integer 4000 twice, so it put it into the co_consts collection once and used the same integer object each time.

Armed with that information, it should be easy to see why your 3000 example returned False. Each of the assignments was compiled separately, and the compiler didn't look at previous compilations to reuse an integer. Thus the two are separate objects. The compiler COULD, if it felt like it, reuse that; conversely, a naive and inefficient compiler is welcome to generate brand new integers for n and m.
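The constant sharing described above can also be checked from a script rather than the REPL. This is a sketch that assumes CPython, where equal constants within one compiled unit are stored once; the file name "<demo>" and the dictionary names are made up for illustration, and other implementations are free to behave differently.

```python
# CPython-specific demonstration; this is an implementation detail.
one_unit = compile("n = 4000\nm = 4000", "<demo>", "exec")

# The literal 4000 appears twice in the source but once in the constants.
print(one_unit.co_consts.count(4000))

namespace = {}
exec(one_unit, namespace)
# Both names were loaded from the same constant slot.
print(namespace["n"] is namespace["m"])

# Compiled separately, nothing ties the two 3000 objects together.
separate = {}
exec(compile("i = 3000", "<demo>", "exec"), separate)
exec(compile("j = 3000", "<demo>", "exec"), separate)
print(separate["i"] == separate["j"])  # equality always holds
# Whether separate["i"] is separate["j"] holds is up to the implementation.
```

Only the `==` result in the last step is something a program should rely on.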
Actually, I believe a compliant Python interpreter is welcome to not store integer objects in memory at all, as long as it can guarantee the correct identity semantics (eg the 'is' and 'is not' operators on integers would be defined on value, and id(x) would return x*2+1 for ints and even numbers for other objects), although I don't know of any implementations that do this.

ChrisA

From greg.ewing at canterbury.ac.nz Fri Jul 7 04:05:11 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Fri, 07 Jul 2017 20:05:11 +1200
Subject: About the implementation of del in Python 3
In-Reply-To: <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

Steve D'Aprano wrote:
> In practice, identity is hardly important in Python, except for a few limited
> cases, and the prohibition against using `is` when you mean `==`.

On the contrary, it's a very important concept needed to make sense of the way things behave when mutation is involved.

Witness e.g. the classic newbie mistake of using [[0]*5]*5 to create a matrix of zeroes. I don't know how to explain why that goes wrong without using the phrase "same object" in some way.

--
Greg

From greg.ewing at canterbury.ac.nz Fri Jul 7 04:05:23 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Fri, 07 Jul 2017 20:05:23 +1200
Subject: About the implementation of del in Python 3
In-Reply-To: <87zichzi6i.fsf@elektro.pacujo.net>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net>
Message-ID:

Marko Rauhamaa wrote:
> In Second-Order Logic, you can define identity directly:
>
>    ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y))

That looks more like a definition of *equality* to me.

In mathematics, everything is immutable, so there isn't really any distinction between equality and identity.
--
Greg

From greg.ewing at canterbury.ac.nz Fri Jul 7 04:12:22 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Fri, 07 Jul 2017 20:12:22 +1200
Subject: About the implementation of del in Python 3
In-Reply-To: <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

Steve D'Aprano wrote:
> That implies that it is impossible to implement Scheme:
>
> - using a programming language where variables and objects may move during their
> lifetime;
>
> - or using a computing device without conventional memory, e.g. implementing
> Scheme using hydraulics, DNA computing, clockwork, or emulated in the human
> brain.

Only if you interpret the word "address" very narrowly.

By the way, even the low-level notion of "address" that C programs deal with is, on most modern hardware, a *virtual* address that goes through a level of translation before it identifies a physical set of transistors, and that mapping can change as stuff gets paged in and out. So it's already an abstract concept to some extent.

--
Greg

From steve+python at pearwood.info Fri Jul 7 04:24:45 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 18:24:45 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f385f$0$3627$426a34cc@news.free.fr>
Message-ID: <595f454e$0$1611$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 05:29 pm, Dan Wissme wrote:

> Strange behavior in Python 3.6.0
> >>> i = 3000
> >>> j = 3000
> >>> i is j
> False
> >>> n = 4000 ; m = 4000 ; n is m
> True

The Python interpreter is allowed to cache integers and reuse them.
The interactive interpreter sometimes does so: if you write the same int literal on the same line in the interactive interpreter, it may re-use the same object instead of creating two equal objects.

You should **NEVER** use `is` when you want to check for equality. You cannot rely on Python to cache or not cache int values. Whatever it does is purely an implementation detail that is subject to change without notice.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

From steve+python at pearwood.info Fri Jul 7 04:29:27 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 18:29:27 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <595f4668$0$1608$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 05:45 pm, Chris Angelico wrote:

> On Fri, Jul 7, 2017 at 4:43 PM, Steve D'Aprano
> wrote:
>> On Fri, 7 Jul 2017 01:41 am, Marko Rauhamaa wrote:
>>> In Second-Order Logic, you can define identity directly:
>>>
>>>    ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y))
>>
>> Translating to English:
>>
>>     For all x, for all y, x equals y if and only if for all P
>>     (P(x) if and only if P(y))
>>
>>
>> That might be sufficient for second-order logic, but it won't do for
>> programming. Defining if-and-only-if for functions that can return more than
>> two values (true and false) requires having a definition of equality, which
>> would make the definition circular.
>
> It sounds to me like this has defined equality, not identity, right?

In mathematics, I believe that equality and identity in this sense are the same, and we could spell the mathematical operator "=" as "is" instead.
Mathematicians normally use the word "identity" to refer to things like the identity function f(x) -> x, or the identity matrix [1 0][0 1] say, or the multiplicative identity (usually 1), rather than talking about "the identity of a value" since that would be redundant: the identity of the value is the value of the value which is the value.

But programming languages can have values which are equal but not identical, such as 1 and 1.0.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

From steve+python at pearwood.info Fri Jul 7 04:31:41 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 18:31:41 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <595f46ee$0$1608$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 06:05 pm, Gregory Ewing wrote:

> Steve D'Aprano wrote:
>> In practice, identity is hardly important in Python, except for a few limited
>> cases, and the prohibition against using `is` when you mean `==`.
>
> On the contrary, it's a very important concept needed to make
> sense of the way things behave when mutation is involved.
>
> Witness e.g. the classic newbie mistake of using [[0]*5]*5
> to create a matrix of zeroes. I don't know how to explain
> why that goes wrong without using the phrase "same object"
> in some way.

That would be one of the few limited cases I mentioned :-)

I'll grant you that having the concept of "the same object" can be important.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
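For anyone who hasn't run into the matrix-of-zeroes mistake discussed above, here is a minimal sketch (using 3x3 instead of 5x5 to keep the output short; the names `rows` and `grid` are illustrative). Multiplying the outer list repeats references to one inner list, so every "row" is the same object.

```python
# The pitfall: three references to ONE inner list.
rows = [[0] * 3] * 3
rows[0][0] = 1
print(rows)                # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(rows[0] is rows[1])  # True: the same object

# A common fix: build a new inner list for each row.
grid = [[0] * 3 for _ in range(3)]
grid[0][0] = 1
print(grid)                # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(grid[0] is grid[1])  # False: distinct objects
```

The inner `[0] * 3` is harmless because the repeated element is an immutable int; the trouble only starts when the repeated element is mutable.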
From marko at pacujo.net Fri Jul 7 04:43:04 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Fri, 07 Jul 2017 11:43:04 +0300
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <87shi8zlgn.fsf@elektro.pacujo.net>

Steve D'Aprano :

> On Fri, 7 Jul 2017 07:10 am, Marko Rauhamaa wrote:
>
>> I believe identity can be defined much better, in numerous isomorphic
>> ways in fact.
>>
>> For example, we could equate each object with a sequence number
>> (unrelated with its id()). You can define that the "None" object is
>> in fact the natural number 0. The "False" object is in fact the
>> natural number 1 etc for all the primordial objects. During the
>> execution of the program, new objects are created, which simply
>> associates characteristics to ever higher natural numbers.
>
> Hmmm... interesting. You might just be on the right track here. That
> might even work for "identity" as required by Python.
>
> Of course you can't say "equate each object with its sequence number"
> since that implies that:
>
> assert None == 0

Python's integer object 0 might be equated with the (mathematical) natural number 18974387634. Python code would have no way of introspecting that natural number.

The execution model would determine what properties object 18974387634 would have.

> Are you satisfied now? Can we please put this debate to bed?

Feel free to stop replying. These kinds of debate keep on going forever because you still don't understand what I'm getting at (and probably vice versa).


Marko

From steve+python at pearwood.info Fri Jul 7 04:44:20 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 07 Jul 2017 18:44:20 +1000
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <595f49e6$0$1606$c3e8da3$5496439d@news.astraweb.com>

On Fri, 7 Jul 2017 06:12 pm, Gregory Ewing wrote:

> Steve D'Aprano wrote:
>> That implies that it is impossible to implement Scheme:
>>
>> - using a programming language where variables and objects may move during
>> their lifetime;
>>
>> - or using a computing device without conventional memory, e.g. implementing
>> Scheme using hydraulics, DNA computing, clockwork, or emulated in the human
>> brain.
>
> Only if you interpret the word "address" very narrowly.

I don't know about that. Consider emulating a Python interpreter in your own brain by executing in your own head some code that you read. Where in your brain do the objects (the values) live? From everything we know about the brain, they will be distributed across great big swaths of the brain, over billions of neurons, and there is nowhere we could point to and say "this is where our mental model of this object is".

The best we could say is that "if we damage such and such an area of the brain, then the victim will be unable to perform the task of executing Python code in his head, therefore we think that the objects are distributed in that area rather than another area".

> By the way, even the low-level notion of "address" that C
> programs deal with is, on most modern hardware, a *virtual*
> address that goes through a level of translation before it
> identifies a physical set of transistors, and that mapping
> can change as stuff gets paged in and out. So it's already
> an abstract concept to some extent.

Indeed.
But the Python virtual machine doesn't require objects to have an address at all. Spatial thinking is so important for us primates that it is hard for us puny humans to think of values without thinking of them existing in some location, but there's no logical requirement for that to be the case.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

From marko at pacujo.net Fri Jul 7 04:51:19 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Fri, 07 Jul 2017 11:51:19 +0300
Subject: About the implementation of del in Python 3
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f2d7d$0$1614$c3e8da3$5496439d@news.astraweb.com> <595f4668$0$1608$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <87o9swzl2w.fsf@elektro.pacujo.net>

Steve D'Aprano :

> On Fri, 7 Jul 2017 05:45 pm, Chris Angelico wrote:
>
>> On Fri, Jul 7, 2017 at 4:43 PM, Steve D'Aprano
>> wrote:
>>> On Fri, 7 Jul 2017 01:41 am, Marko Rauhamaa wrote:
>>>> In Second-Order Logic, you can define identity directly:
>>>>
>>>>    ∀x ∀y x = y ↔ ∀P (P(x) ↔ P(y))
>>>
>>> Translating to English:
>>>
>>>     For all x, for all y, x equals y if and only if for all P
>>>     (P(x) if and only if P(y))
>>>
>>> [...]
>>
>> It sounds to me like this has defined equality, not identity, right?
>
> In mathematics, I believe that equality and identity in this sense are
> the same, and we could spell the mathematical operator "=" as "is"
> instead.

Mathematicians call the principle "extensionality" (). Python programmers call it duck-typing.

That's why in set theory, you talk about "the empty set". Any two sets that satisfy the conditions for an empty set are indistinguishable and therefore identical:

   ∀x ∀y (∀z z ∈ x ↔ z ∈ y) → x = y


Marko

From rosuav at gmail.com Fri Jul 7 04:53:40 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 7 Jul 2017 18:53:40 +1000
Subject: About the implementation of del in Python 3
In-Reply-To: <87shi8zlgn.fsf@elektro.pacujo.net>
References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com> <87shi8zlgn.fsf@elektro.pacujo.net>
Message-ID:

On Fri, Jul 7, 2017 at 6:43 PM, Marko Rauhamaa wrote:
> Steve D'Aprano :
>
>> On Fri, 7 Jul 2017 07:10 am, Marko Rauhamaa wrote:
>>
>>> I believe identity can be defined much better, in numerous isomorphic
>>> ways in fact.
>>>
>>> For example, we could equate each object with a sequence number
>>> (unrelated with its id()). You can define that the "None" object is
>>> in fact the natural number 0. The "False" object is in fact the
>>> natural number 1 etc for all the primordial objects. During the
>>> execution of the program, new objects are created, which simply
>>> associates characteristics to ever higher natural numbers.
>>
>> Hmmm... interesting. You might just be on the right track here. That
>> might even work for "identity" as required by Python.
>>
>> Of course you can't say "equate each object with its sequence number"
>> since that implies that:
>>
>> assert None == 0
>
> Python's integer object 0 might be equated with the (mathematical)
> natural number 18974387634. Python code would have no way of
> introspecting that natural number.
>
> The execution model would determine what properties object 18974387634
> would have.

Then what's the point of that number? If you can't see it from Python code, it's not part of the language semantics.
ChrisA From marko at pacujo.net Fri Jul 7 04:58:48 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 11:58:48 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com> <595f49e6$0$1606$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87k23kzkqf.fsf@elektro.pacujo.net> Steve D'Aprano : > But the Python virtual machine doesn't require objects to have an > address at all. That's an interesting question. Is it possible to define formal semantics for Python without the notion of an address (under some name)? Ultimately it seems necessary to have an enumerable set (address space) that maps to objects. Marko From rosuav at gmail.com Fri Jul 7 05:06:45 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 19:06:45 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <87k23kzkqf.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com> <595f49e6$0$1606$c3e8da3$5496439d@news.astraweb.com> <87k23kzkqf.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 6:58 PM, Marko Rauhamaa wrote: > Steve D'Aprano : > >> But the Python virtual machine doesn't require objects to have an >> address at all. > > That's an interesting question. Is it possible to define formal > semantics for Python without the notion of an address (under some name)? > Ultimately it seems necessary to have an enumerable set (address space) > that maps to objects. Yes, it most definitely is. I have explained Python's object model on the kitchen table, using sheets of paper, and pencil lines/arrows representing references. 
Aside from being tedious, and being vulnerable to an accidental brush of the sleeve, this is a fully compliant Python interpreter. (And when we were done, the entire heap was garbage collected simultaneously.) ChrisA From marko at pacujo.net Fri Jul 7 05:15:43 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 12:15:43 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com> <87shi8zlgn.fsf@elektro.pacujo.net> Message-ID: <87fue8zjy8.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 7, 2017 at 6:43 PM, Marko Rauhamaa wrote: >> Python's integer object 0 might be equated with the (mathematical) >> natural number 18974387634. Python code would have no way of >> introspecting that natural number. >> >> The execution model would determine what properties object 18974387634 >> would have. > > Then what's the point of that number? If you can't see it from Python > code, it's not part of the language semantics. Excellent question!!! In fact, it is a very frustrating question. You can only define the semantics of Python (in this case) by providing an *arbitrary* mapping to an imaginary abstract machine. There's no way to define the objective abstraction. Metamathematicians grappled with the same problem a century ago when they tried to define natural numbers. Their promising start collapsed because of Russell's paradox. To their great disappointment, they had to choose an arbitrary set-theoretical model to be the standard:
    0 = {}
    1 = {0}
    2 = {0, 1}
    3 = {0, 1, 2}
    etc
In fact, today's mathematicians couldn't care less what natural numbers are.
They have captured all relevant characteristics in a number of axioms, and those suffice to generate all interesting mathematics. Marko From pavol.lisy at gmail.com Fri Jul 7 06:30:19 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Fri, 7 Jul 2017 12:30:19 +0200 Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode? In-Reply-To: <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 7/7/17, Steve D'Aprano wrote: > import sys > if sys.version_info >= (3,): # the comma is important > print("version 3") But be careful inside script! It could live long enough to see python4 :) From marko at pacujo.net Fri Jul 7 07:16:09 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 14:16:09 +0300 Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode? References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87bmowzedi.fsf@elektro.pacujo.net> Pavol Lisy : > On 7/7/17, Steve D'Aprano wrote: > >> import sys >> if sys.version_info >= (3,): # the comma is important >> print("version 3") > > But be careful inside script! It could live long enough to see python4 > :) That's a serious concern. An application doesn't know about Python's future.
What would be needed is: import sys try: sys.require_version((3, 4, 2)) except NotSupportedException: sys.stderr.write("Sorry :(\n") sys.exit(1) Marko From rosuav at gmail.com Fri Jul 7 07:24:14 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 21:24:14 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <87fue8zjy8.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com> <87shi8zlgn.fsf@elektro.pacujo.net> <87fue8zjy8.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 7:15 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 7, 2017 at 6:43 PM, Marko Rauhamaa wrote: >>> Python's integer object 0 might be equated with the (mathematical) >>> natural number 18974387634. Python code would have no way of >>> introspecting that natural number. >>> >>> The execution model would determine what properties object 18974387634 >>> would have. >> >> Then what's the point of that number? If you can't see it from Python >> code, it's not part of the language semantics. > > Excellent question!!! > > In fact, it is a very frustrating question. You can only define the > semantics of Python (in this case) by providing an *arbitrary* mapping > to an imaginary abstract machine. There's no way to define the objective > abstraction. So aside from an artificial sense of purity, what's the point in defining object identity *at all*? Why invent an arbitrary number that you can't even see? 
ChrisA From marko at pacujo.net Fri Jul 7 07:48:14 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 07 Jul 2017 14:48:14 +0300 Subject: About the implementation of del in Python 3 References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com> <87shi8zlgn.fsf@elektro.pacujo.net> <87fue8zjy8.fsf@elektro.pacujo.net> Message-ID: <877ezkzcw1.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 7, 2017 at 7:15 PM, Marko Rauhamaa wrote: >> You can only define the semantics of Python (in this case) by >> providing an *arbitrary* mapping to an imaginary abstract machine. >> There's no way to define the objective abstraction. > > So aside from an artificial sense of purity, what's the point in > defining object identity *at all*? Why invent an arbitrary number that > you can't even see? Without such an invisible identity, you can't specify the desired behavior of a Python program. (Well, id() returns a visible identity, which you could equate with the invisible one.) I understand that not everything should be strictly formal, but all attempts at clarifying Python's object system necessarily involve evoking some silly abstract model. The Lisp community is so old they never thought of shunning hardware concepts (storage, pointers, Common Address Register, Common Data Register etc). There doesn't seem to be any better way for Python, either. It might be easiest to say that every Python object has an address and id() returns it. Even if you were lying, nobody would be able to call your bluff. Then, explaining objects to newcomers would be a bit more straightforward. 
Marko From rosuav at gmail.com Fri Jul 7 07:55:03 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 7 Jul 2017 21:55:03 +1000 Subject: About the implementation of del in Python 3 In-Reply-To: <877ezkzcw1.fsf@elektro.pacujo.net> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <595e44e7$0$1620$c3e8da3$5496439d@news.astraweb.com> <874lup1th8.fsf@elektro.pacujo.net> <595e7dad$0$1600$c3e8da3$5496439d@news.astraweb.com> <874lupp8zn.fsf@elektro.pacujo.net> <595f3571$0$1597$c3e8da3$5496439d@news.astraweb.com> <87shi8zlgn.fsf@elektro.pacujo.net> <87fue8zjy8.fsf@elektro.pacujo.net> <877ezkzcw1.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 7, 2017 at 9:48 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 7, 2017 at 7:15 PM, Marko Rauhamaa wrote: >>> You can only define the semantics of Python (in this case) by >>> providing an *arbitrary* mapping to an imaginary abstract machine. >>> There's no way to define the objective abstraction. >> >> So aside from an artificial sense of purity, what's the point in >> defining object identity *at all*? Why invent an arbitrary number that >> you can't even see? > > Without such an invisible identity, you can't specify the desired > behavior of a Python program. (Well, id() returns a visible identity, > which you could equate with the invisible one.) > > I understand that not everything should be strictly formal, but all > attempts at clarifying Python's object system necessarily involve > evoking some silly abstract model. "x is y" returns True if and only if x and y refer to the same object. You have yet to demonstrate that the above statement is underspecified. 
ChrisA From grant.b.edwards at gmail.com Fri Jul 7 09:11:32 2017 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Fri, 7 Jul 2017 13:11:32 +0000 (UTC) Subject: Test 0 and false since false is 0 References: Message-ID: On 2017-07-07, Stefan Ram wrote: > Sayth Renshaw writes: >>I have tried or conditions of v == False etc but then the 0's >>being false also aren't moved. How can you check this at >>once? > > “The Boolean type is a subtype of the integer type, and > Boolean values behave like the values 0 and 1, > respectively, in almost all contexts, the exception > being that when converted to a string, the strings > "False" or "True" are returned, respectively.” > > The Python Language Reference, Release 3.6.0; > 3.2 The standard type hierarchy > > So maybe you can add a test for the type? > >>>> def isfalse( x ): > ... return x == 0 and str( type( x )) == "<class 'bool'>" > ... What's wrong with the following? x is False Isn't False a singleton value? -- Grant Edwards grant.b.edwards at gmail.com Yow! I want a VEGETARIAN BURRITO to go ... with EXTRA MSG!! From nathan.ernst at gmail.com Fri Jul 7 10:23:58 2017 From: nathan.ernst at gmail.com (Nathan Ernst) Date: Fri, 7 Jul 2017 09:23:58 -0500 Subject: Test 0 and false since false is 0 In-Reply-To: References: Message-ID: You'd be better off using the builtin "isinstance" function, e.g.: isinstance(x, int). This also has the added benefit of working nicely with inheritance (isinstance returns true if the actual type is derived from the classinfo passed as the second argument). See https://docs.python.org/3/library/functions.html#isinstance for details. Regards, Nathan On Fri, Jul 7, 2017 at 2:04 AM, Peter Otten <__peter__ at web.de> wrote: > Sayth Renshaw wrote: > > > I was trying to solve a problem and cannot determine how to filter 0's > but > > not false.
> > > > Given a list like this > > ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,{},0,0,9] > > > > I want to be able to return this list > > ["a","b",None,"c","d",1,False,1,3,[],1,9,{},9,0,0,0,0,0,0,0,0,0,0] > > > > However if I filter like this > > > > def move_zeros(array): > > l1 = [v for v in array if v != 0] > > l2 = [v for v in array if v == 0] > > return l1 + l2 > > > > I get this > > ['a', 'b', None, 'c', 'd', 1, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, False, 0, > 0, > > [0, 0, 0, 0, 0] > > > > I have tried or conditions of v == False etc but then the 0's being false > > also aren't moved. How can you check this at once? > > > > > > Cheers > > > > Sayth > > Another option is to test for type(value) == int: > > >>> before = ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0, > {},0,0,9] > >>> wanted = ["a","b",None,"c","d",1,False,1,3,[],1,9, > {},9,0,0,0,0,0,0,0,0,0,0] > >>> after = sorted(before, key=lambda x: x == 0 and type(x) == int) > >>> assert str(after) == str(wanted) > >>> after > ['a', 'b', None, 'c', 'd', 1, False, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0] > > > That way float values will be left alone, too: > > >>> sorted([0.0, 0, False, [], "x"], key=lambda x: x == 0 and type(x) == > int) > [0.0, False, [], 'x', 0] > > > -- > https://mail.python.org/mailman/listinfo/python-list > From nathan.ernst at gmail.com Fri Jul 7 10:41:00 2017 From: nathan.ernst at gmail.com (Nathan Ernst) Date: Fri, 7 Jul 2017 09:41:00 -0500 Subject: About the implementation of del in Python 3 In-Reply-To: <595f385f$0$3627$426a34cc@news.free.fr> References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f385f$0$3627$426a34cc@news.free.fr> Message-ID: Looks like single expression statements are handled a bit differently than multiple expression statements: Python 3.5.2 (default, Nov 17 2016, 17:05:23) [GCC 5.4.0 20160609] on linux Type "help", "copyright", "credits" or "license" 
for more information.
>>> n = 4000; m = 4000; n is m
True
>>> n = 4000
>>> m = 4000
>>> n is m
False
>>>
On Fri, Jul 7, 2017 at 2:29 AM, Dan Wissme wrote: > On 06/07/2017 at 20:56, Nathan Ernst wrote: > >> In Python, "==" is not a reference equality operator (and I hate Java for >> their misuse of the operator), so I absolutely disagree with using the >> Java >> description to describe Python's "==" operator, primarily because, well, >> it's wrong. Simple example: >> >> With Python 3.5.2 (should hold for any version 2.4 or greater): >> >>> a = 1 >>>>> b = 1 >>>>> a == b >>>>> >>>> True >> >>> a is b >>>>> >>>> True >> >>> c = 1000 >>>>> d = 1000 >>>>> c == d >>>>> >>>> True >> >>> c is d >>>>> >>>> False >> > > Strange behavior in Python 3.6.0 > >>> i = 3000 > >>> j = 3000 > >>> i is j > False > >>> n = 4000 ; m = 4000 ; n is m > True > > dan > > > > -- > https://mail.python.org/mailman/listinfo/python-list > From __peter__ at web.de Fri Jul 7 11:17:48 2017 From: __peter__ at web.de (Peter Otten) Date: Fri, 07 Jul 2017 17:17:48 +0200 Subject: Test 0 and false since false is 0 References: Message-ID: Nathan Ernst wrote: > On Fri, Jul 7, 2017 at 2:04 AM, Peter Otten <__peter__ at web.de> wrote: >> >>> sorted([0.0, 0, False, [], "x"], key=lambda x: x == 0 and type(x) == >> int) >> [0.0, False, [], 'x', 0] > You'd be better off using the builtin "isinstance" function, e.g.: > isinstance(x, int). This also has the added benefit of working nicely with > inheritance (isinstance returns true if the actual type is derived from > the classinfo passed as the second argument). See > https://docs.python.org/3/library/functions.html#isinstance for details.
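For anyone comparing the two checks quoted above, here is a rough sketch of the difference. It assumes the goal from the original question (move integer zeros to the end, leave False, 0.0 and empty containers in place). The name move_zeros is taken from the question; is_int_zero is a made-up helper, not something posted in the thread.

```python
def is_int_zero(x):
    # Exact type check: bool is a subclass of int, so isinstance(False, int)
    # is True, but type(False) is bool, not int.
    return type(x) is int and x == 0


def move_zeros(values):
    """Return a new list with integer zeros moved to the end, order kept."""
    kept = [v for v in values if not is_int_zero(v)]
    zeros = [v for v in values if is_int_zero(v)]
    return kept + zeros


before = ["a", 0, 0, "b", None, "c", "d", 0, 1, False, 0, 1, 0, 3, [], 0,
          1, 9, 0, 0, {}, 0, 0, 9]
after = move_zeros(before)
print(after)
print(isinstance(False, int), type(False) is int)
```

With isinstance(x, int) in place of the exact type check, False would be treated as a zero and moved along with the rest.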
Hm, I suggest that you run my code sample above with your suggested improvement ;) From random832 at fastmail.com Fri Jul 7 11:44:11 2017 From: random832 at fastmail.com (Random832) Date: Fri, 07 Jul 2017 11:44:11 -0400 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com> Message-ID: <1499442251.2884859.1033641176.407E4418@webmail.messagingengine.com> On Fri, Jul 7, 2017, at 04:12, Gregory Ewing wrote: > Only if you interpret the word "address" very narrowly. > > By the way, even the low-level notion of "address" that C > programs deal with is, on most modern hardware, a *virtual* > address that goes through a level of translation before it > identifies a physical set of transistors, and that mapping > can change as stuff gets paged in and out. So it's already > an abstract concept to some extent. What's not abstract is that if an object has address X and is N bytes long, those bytes (and any larger subobjects) occupy a contiguous range of addresses between X and X+(N-1). This is not required to be true of Python IDs. From python.list at tim.thechases.com Fri Jul 7 11:57:00 2017 From: python.list at tim.thechases.com (Tim Chase) Date: Fri, 7 Jul 2017 10:57:00 -0500 Subject: how to get partition information of a hard disk with python In-Reply-To: <1f26708f-f921-4377-9cfc-907f42c13f0a@googlegroups.com> References: <1f26708f-f921-4377-9cfc-907f42c13f0a@googlegroups.com> Message-ID: <20170707105700.14731cbc@bigbox.christie.dr> Strange. The OP's message didn't make it here, but I'm seeing multiple replies > On Wednesday, September 22, 2010 at 4:01:04 AM UTC+5:30, Hellmut > Weber wrote: > > Hi list, > > I'm looking for a possibility to access the partiton inforamtion > > of a hard disk of my computer from within a python program. 
You don't specify whether your disk has MBR, GPT, or some other partitioning scheme. However, at least for MBR, I threw together this code a while back: https://mail.python.org/pipermail/python-list/2009-November/559546.html I imagine something similar could be done in the case of a GPT. -tkc From random832 at fastmail.com Fri Jul 7 12:14:26 2017 From: random832 at fastmail.com (Random832) Date: Fri, 07 Jul 2017 12:14:26 -0400 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f385f$0$3627$426a34cc@news.free.fr> Message-ID: <1499444066.2892761.1033652952.2AFBE3B6@webmail.messagingengine.com> On Fri, Jul 7, 2017, at 10:41, Nathan Ernst wrote: > Looks like single expression statements are handled a bit differently > than > multiple expression statements: > > Python 3.5.2 (default, Nov 17 2016, 17:05:23) > [GCC 5.4.0 20160609] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> n = 4000; m = 4000; n is m > True > >>> n = 4000 > >>> m = 4000 > >>> n is m > False Equal constants are combined (into a single object in the code object's constant list - the compiler actually uses a dictionary internally) at compile time, when you execute two separate lines in the interactive interpreter they are compiled separately. I would call your first case multiple statements on a line, not multiple expressions in one statement. But what's important is the separate lines of the interactive interpreter and therefore separate compile contexts - if you were to do exec('n=4000\nm=4000') you would get a single object. With a sufficiently small number (between 0 and 256 inclusive), they are globally cached and you will get the same object even across separate compilations, because the compiler gets the same object when it parses "10" in the first place. 
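The compile-unit point can be checked directly. This is only a sketch of CPython behaviour, which is an implementation detail and not a language guarantee; the "<demo>" filename and the variable names are arbitrary.

```python
# One compilation unit: both assignments share a single constant object.
ns = {}
exec("n = 4000\nm = 4000", ns)
same_unit = ns["n"] is ns["m"]

# Two separate compilations, as with two separate REPL lines. On the
# CPython builds discussed in this thread these are distinct objects,
# but that is not promised anywhere.
ns2 = {}
exec("n = 4000", ns2)
exec("m = 4000", ns2)
separate_units = ns2["n"] is ns2["m"]

# The code object's constant table holds 4000 only once.
code = compile("n = 4000\nm = 4000", "<demo>", "exec")

print(same_unit, separate_units, code.co_consts)
```

None of this matters for correct programs: compare numbers with ==, and keep "is" for identity checks such as "x is None".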
Imagine more or less, this process:
    d = {}
    a = int("4000")
    if a in d:
        i = d[a]
    else:
        i = d[a] = len(d)
    b = int("4000")
    if b in d:
        j = d[b]
    else:
        j = d[b] = len(d)
    c = list(d.keys())
    n = c[i]
    m = c[j]
When you do it on two separate prompts you get:
    d = {}
    a = int("4000")
    if a in d:
        i = d[a]
    else:
        i = d[a] = len(d)
    c = list(d.keys())
    n = c[i]

    d = {}
    a = int("4000")
    if a in d:
        i = d[a]
    else:
        i = d[a] = len(d)
    c = list(d.keys())
    m = c[i]
From ian.g.kelly at gmail.com Fri Jul 7 12:23:24 2017 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Fri, 7 Jul 2017 10:23:24 -0600 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> Message-ID: On Thu, Jul 6, 2017 at 12:56 PM, Nathan Ernst wrote: > In Python, "==" is not a reference equality operator (and I hate Java for > their misuse of the operator), so I absolutely disagree with using the Java > description to describe Python's "==" operator, primarily because, well, > it's wrong. Simple example: I thought it went without saying that the direct quote that I posted would need some revision before it would apply to Python. The parts about "objects or arrays" would also need to be changed, since Python doesn't have Java-like arrays. From ian.g.kelly at gmail.com Fri Jul 7 12:28:03 2017 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Fri, 7 Jul 2017 10:28:03 -0600 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <595f385f$0$3627$426a34cc@news.free.fr> Message-ID: On Fri, Jul 7, 2017 at 8:41 AM, Nathan Ernst wrote: > Looks like single expression statements are handled a bit differently than > multiple expression statements: > > Python 3.5.2 (default, Nov 17 2016, 17:05:23) > [GCC 5.4.0 20160609] on linux > Type "help", "copyright", "credits" or "license" for more information.
>>>> n = 4000; m = 4000; n is m > True >>>> n = 4000 >>>> m = 4000 >>>> n is m > False It actually has to do with units of compilation. In the first example, all three statements are compiled together and the constants are optimized to the same object. In the second example, because you're testing this in the REPL they're compiled at different times and since 4000 is larger than what CPython will intern, you end up with different objects. If however, you take the second example and gather it into a function, you'll get the compile-time optimization again: >>> def f(): ... n = 4000 ... m = 4000 ... return n is m ... >>> f() True From eryksun at gmail.com Fri Jul 7 14:57:26 2017 From: eryksun at gmail.com (eryk sun) Date: Fri, 7 Jul 2017 18:57:26 +0000 Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode? In-Reply-To: <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Fri, Jul 7, 2017 at 7:53 AM, Steve D'Aprano wrote: > On Fri, 7 Jul 2017 04:30 pm, Ben S. wrote: > >> Is there a way to execute a python script with v3 python engine in v2 >> compatibility mode? I am thinking about a command parameter like (python.exe >> is v3.*): >> >> python.exe -execute_as_v2 myscript.py > > No. Python 3 is always Python 3, and Python 2 is always Python 2. But what you > can do is install both, and then call > > python2.exe myscript.py > > python3.exe anotherscript.py Windows Python installs two loaders for each version of Python: python.exe and pythonw.exe. No links or copies are created for pythonX[w].exe, pythonX.Y[w].exe, or pythonX.Y-32[w].exe. Instead, there are separate py.exe and pyw.exe launchers that use the registry to find and execute a loader for a given version, e.g. 
    py -2 myscript.py
    py -3.6-32 anotherscript.py
As of 3.6, the py launcher defaults to the highest version of Python 3 that's installed. 64-bit Python is preferred on 64-bit Windows. The default version can be overridden by setting the PY_PYTHON environment variable. That said, you don't have to manually run a script as an argument of py.exe or python.exe. For a default Python 3 installation, if the PATHEXT environment variable contains ".PY", then you can run "script.py" as
    script arg1 ... argN
in CMD or PowerShell. If a script has a Unix shebang, the launcher will read it to run the required version of Python, if that version is installed. From formisc at gmail.com Fri Jul 7 16:00:17 2017 From: formisc at gmail.com (Andrew Z) Date: Fri, 7 Jul 2017 13:00:17 -0700 (PDT) Subject: Python3 : import Message-ID: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> This has been driving me nuts for the past few hours. 2 modules are in the same directory. I want to be able to use them both:
[code]
[az at hp tst1]$ pwd
/home/az/Dropbox/work/Prjs/tst1
[az at hp tst1]$ ls -l
total 16
-rw-rw-r--. 1 az az 66 Jul 7 12:58 db.py
-rw-rw-r--. 1 az az 182 Jul 7 15:54 uno.py
[az at hp tst1]$ cat ./db.py
class DB():
    def __init__(self):
        print("I'm DB")
[az at hp tst1]$ cat ./uno.py
from . import db

class Uno():
    def __init__(self):
        print("I'm uno")
        self.db = db.DB()

    def printing(self):
        print("Uno.printing DB")


if __name__ == '__main__':
    uno = Uno()
[az at hp tst1]$ python3 ./uno.py
Traceback (most recent call last):
  File "./uno.py", line 1, in <module>
    from . import db
SystemError: Parent module '' not loaded, cannot perform relative import
[/code]
Much obliged.
From formisc at gmail.com Fri Jul 7 16:03:39 2017 From: formisc at gmail.com (Andrew Z) Date: Fri, 7 Jul 2017 13:03:39 -0700 (PDT) Subject: Python3 : import In-Reply-To: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> References: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> Message-ID: <05d39b6b-d88c-4e09-98d3-413113645fec@googlegroups.com> On Friday, July 7, 2017 at 4:00:51 PM UTC-4, Andrew Z wrote: > this has bee driving me nutz for the past few hours. > 2 modules are in the same directory. I want to be able to use them both: > > [code] > > [az at hp tst1]$ pwd > /home/az/Dropbox/work/Prjs/tst1 > > [az at hp tst1]$ ls -l > total 16 > -rw-rw-r--. 1 az az 66 Jul 7 12:58 db.py > -rw-rw-r--. 1 az az 182 Jul 7 15:54 uno.py > [az at hp tst1]$ > [az at hp tst1]$ > [az at hp tst1]$ cat ./db.py > > class DB(): > def __init__(self): > print("I'm DB") > > [az at hp tst1]$ cat ./uno.py > from . import db > > class Uno(): > def __init__(self): > print("I'm uno") > self.db = db.DB() > > def printing(self): > print("Uno.printing DB") > > > if __name__ == '__main__': > uno = Uno() > > [az at hp tst1]$ > [az at hp tst1]$ > [az at hp tst1]$ python3 ./uno.py > Traceback (most recent call last): > File "./uno.py", line 1, in > from . import db > SystemError: Parent module '' not loaded, cannot perform relative import > > [/code] > > > Much obliged. Variations on the subject with the same sad results: [code] from .db import DB [/code] From formisc at gmail.com Fri Jul 7 16:12:46 2017 From: formisc at gmail.com (Andrew Z) Date: Fri, 7 Jul 2017 13:12:46 -0700 (PDT) Subject: Python3 : import In-Reply-To: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> References: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> Message-ID: On Friday, July 7, 2017 at 4:00:51 PM UTC-4, Andrew Z wrote: > this has bee driving me nutz for the past few hours. > 2 modules are in the same directory. 
I want to be able to use them both: > > [code] > > [az at hp tst1]$ pwd > /home/az/Dropbox/work/Prjs/tst1 > > [az at hp tst1]$ ls -l > total 16 > -rw-rw-r--. 1 az az 66 Jul 7 12:58 db.py > -rw-rw-r--. 1 az az 182 Jul 7 15:54 uno.py > [az at hp tst1]$ > [az at hp tst1]$ > [az at hp tst1]$ cat ./db.py > > class DB(): > def __init__(self): > print("I'm DB") > > [az at hp tst1]$ cat ./uno.py > from . import db > > class Uno(): > def __init__(self): > print("I'm uno") > self.db = db.DB() > > def printing(self): > print("Uno.printing DB") > > > if __name__ == '__main__': > uno = Uno() > > [az at hp tst1]$ > [az at hp tst1]$ > [az at hp tst1]$ python3 ./uno.py > Traceback (most recent call last): > File "./uno.py", line 1, in > from . import db > SystemError: Parent module '' not loaded, cannot perform relative import > > [/code] > > > Much obliged. [az at hp tst1]$ python3 --version Python 3.5.3 From ian.g.kelly at gmail.com Fri Jul 7 16:15:36 2017 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Fri, 7 Jul 2017 14:15:36 -0600 Subject: Python3 : import In-Reply-To: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> References: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> Message-ID: On Fri, Jul 7, 2017 at 2:00 PM, Andrew Z wrote: > [az at hp tst1]$ python3 ./uno.py > Traceback (most recent call last): > File "./uno.py", line 1, in > from . import db > SystemError: Parent module '' not loaded, cannot perform relative import That error message is a bit confusing, but relative imports are relative to packages, not directories. If the module is not part of a package then it can't do a relative import. 
You can use an absolute import, though: import db From formisc at gmail.com Fri Jul 7 16:37:51 2017 From: formisc at gmail.com (Andrew Z) Date: Fri, 7 Jul 2017 13:37:51 -0700 (PDT) Subject: Python3 : import In-Reply-To: References: <80bf01dc-c8ac-4f0a-88a8-2a839e04823b@googlegroups.com> Message-ID: On Friday, July 7, 2017 at 4:16:38 PM UTC-4, Ian wrote: > On Fri, Jul 7, 2017 at 2:00 PM, Andrew Z wrote: > > [az at hp tst1]$ python3 ./uno.py > > Traceback (most recent call last): > > File "./uno.py", line 1, in > > from . import db > > SystemError: Parent module '' not loaded, cannot perform relative import > > That error message is a bit confusing, but relative imports are > relative to packages, not directories. If the module is not part of a > package then it can't do a relative import. You can use an absolute > import, though: > > import db Thank you Ian. that's right on the money. All works now. I was missing the " ..relative imports are relative to packages, not directories. If the module is not part of a package then it can't do a relative import." thank you! From andrew.pennebaker at gmail.com Fri Jul 7 17:45:50 2017 From: andrew.pennebaker at gmail.com (Andrew Pennebaker) Date: Fri, 7 Jul 2017 14:45:50 -0700 (PDT) Subject: Windows: python3.exe missing Message-ID: Could the Windows installer for Python 3 provide a "python3" command, such as a python3.bat or python3.exe file, to help with scripts that rely on the interpreter being called "python3"? The py launcher is somewhat helpful, but a proper python3 runnable is preferable. From sohcahtoa82 at gmail.com Fri Jul 7 17:49:11 2017 From: sohcahtoa82 at gmail.com (sohcahtoa82 at gmail.com) Date: Fri, 7 Jul 2017 14:49:11 -0700 (PDT) Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode? 
In-Reply-To:
References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <7ec891e5-1848-4d9a-8e29-f6290462923f@googlegroups.com>

On Friday, July 7, 2017 at 11:58:33 AM UTC-7, eryk sun wrote:
> On Fri, Jul 7, 2017 at 7:53 AM, Steve D'Aprano
> wrote:
> > On Fri, 7 Jul 2017 04:30 pm, Ben S. wrote:
> >
> >> Is there a way to execute a python script with v3 python engine in v2
> >> compatibility mode? I am thinking about a command parameter like (python.exe
> >> is v3.*):
> >>
> >> python.exe -execute_as_v2 myscript.py
> >
> > No. Python 3 is always Python 3, and Python 2 is always Python 2. But what you
> > can do is install both, and then call
> >
> > python2.exe myscript.py
> >
> > python3.exe anotherscript.py
>
> Windows Python installs two loaders for each version of Python:
> python.exe and pythonw.exe. No links or copies are created for
> pythonX[w].exe, pythonX.Y[w].exe, or pythonX.Y-32[w].exe. Instead,
> there are separate py.exe and pyw.exe launchers that use the registry
> to find and execute a loader for a given version, e.g.
>
>     py -2 myscript.py
>     py -3.6-32 anotherscript.py
>
> As of 3.6, the py launcher defaults to the highest version of Python 3
> that's installed. 64-bit Python is preferred on 64-bit Windows. The
> default version can be overridden by setting the PY_PYTHON environment
> variable.
>
> That said, you don't have to manually run a script as an argument of
> py.exe or python.exe. For a default Python 3 installation, if the
> PATHEXT environment variable contains ".PY", then you can run
> "script.py" as
>
>     script arg1 ... argN
>
> in CMD or PowerShell. If a script has a Unix shebang, the launcher
> will read it to run the required version of Python, if that version is
> installed.

Is there any particular reason the Windows python does it that way?
Certainly it wouldn't be too difficult to include a "python2.exe" and
"python3.exe", even as symbolic links.
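
[Editorial aside on the first half of this thread's subject, checking the
version from inside a script. What follows is a minimal sketch, not
something posted in the thread. It relies only on the standard
sys.version_info tuple; the helper names is_supported and
describe_interpreter are made up for illustration. The first line is the
kind of Unix shebang that the py launcher is described above as reading.]

```python
#!/usr/bin/env python3
import sys


def is_supported(minimum, version_info=sys.version_info):
    """Return True if the running interpreter is at least `minimum`.

    `minimum` is a (major, minor) tuple. `version_info` defaults to the
    live interpreter and is only a parameter so the check can be tried
    out with other values.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


def describe_interpreter(version_info=sys.version_info):
    """Return a short label such as 'Python 3.6' for the given version."""
    return "Python {}.{}".format(version_info[0], version_info[1])


if __name__ == "__main__":
    if not is_supported((3, 0)):
        # Exit with a hint instead of failing later on a syntax difference.
        sys.exit("This script needs Python 3; on Windows try: py -3 " + sys.argv[0])
    print(describe_interpreter())
```

[Comparing tuples keeps the check readable and avoids parsing the
sys.version string. This only detects the version; as Steve says in the
quoted text, there is no switch that makes Python 3 behave like Python 2.]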
From rantingrickjohnson at gmail.com Fri Jul 7 19:23:13 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Fri, 7 Jul 2017 16:23:13 -0700 (PDT) Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode? In-Reply-To: <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> Message-ID: <2011a463-4e4d-441b-b0c7-27a8d8ae166f@googlegroups.com> On Friday, July 7, 2017 at 2:54:04 AM UTC-5, Steve D'Aprano wrote: > [...] That's now not only backwards compatible, but it is > forward compatible: if Python changes in the future to > bring reduce back into the built-in functions, your code > will automatically keep working. If python starts going all paranoid schizoid on us, the few remaining holdovers from the good old days will be choosing another language. But don't worry Steven, even if i find a new favorite language, i will always drop in preiodically to say hi. ;-) > > Additional question: Is there a way to execute a python > > script with v3 python engine in v2 compatibility mode? Nope. Even the though the most hated software developer in history offers a compatibility mode for their users, Python does not. Python's philosophy is that you will accept Python3, or you will suffer... As for myself, i'm waiting for Python4. I will skip right over Python3. I say, let the other saps to suffer the migration headaches, because i don't need them. I'm too busy getting stuff done. From ian.g.kelly at gmail.com Fri Jul 7 20:57:20 2017 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Fri, 7 Jul 2017 18:57:20 -0600 Subject: Check Python version from inside script? Run Pythons script in v2 compatibility mode? 
In-Reply-To: <7ec891e5-1848-4d9a-8e29-f6290462923f@googlegroups.com> References: <1c6fdc1a-2363-4edf-9020-87d05beb4964@googlegroups.com> <595f3e03$0$1589$c3e8da3$5496439d@news.astraweb.com> <7ec891e5-1848-4d9a-8e29-f6290462923f@googlegroups.com> Message-ID: On Fri, Jul 7, 2017 at 3:49 PM, wrote: > Is there any particular reason the Windows python does it that way? Certainly it wouldn't be too difficult to include a "python2.exe" and "python3.exe", even as symbolic links. Windows associates file types with applications by extension. When you double click on "holygrail.py" should it open with python2.exe or python3.exe? In Unix this is solved with a shebang, so the main purpose of py.exe is to do the same: it's the standard open handler for .py files and it delegates out the actual execution to the appropriate Python version by examining the file for version hints. From greg.ewing at canterbury.ac.nz Fri Jul 7 21:08:42 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Sat, 08 Jul 2017 13:08:42 +1200 Subject: About the implementation of del in Python 3 In-Reply-To: References: <595de1eb$0$4818$426a74cc@news.free.fr> <87r2xt2a06.fsf@elektro.pacujo.net> <87zichzi6i.fsf@elektro.pacujo.net> <87fue9pis6.fsf@elektro.pacujo.net> <595f2f06$0$1619$c3e8da3$5496439d@news.astraweb.com> <1499442251.2884859.1033641176.407E4418@webmail.messagingengine.com> Message-ID: Random832 wrote: > What's not abstract is that if an object has address X and is N bytes > long, those bytes (and any larger subobjects) occupy a contiguous range > of addresses between X and X+(N-1). If you're talking about Python objects, that's not necessarily true -- there's no requirement that a Python object occupy a contiguous range of machine addresses. In CPython, many don't, e.g. strings. Even for those that do, it's a range of *virtual* addresses, which doeesn't necessarily correspond to contiguous physical addresses, e.g. if it crosses a page boundary. 
-- Greg From tjreedy at udel.edu Fri Jul 7 23:25:39 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 7 Jul 2017 23:25:39 -0400 Subject: Windows: python3.exe missing In-Reply-To: References: Message-ID: On 7/7/2017 5:45 PM, Andrew Pennebaker wrote: > Could the Windows installer for Python 3 provide a "python3" command, such as a python3.bat or python3.exe file, to help with scripts that rely on the interpreter being called "python3"? > > The py launcher is somewhat helpful, but a proper python3 runnable is preferable. Not when one has multiple python 3 installations. -- Terry Jan Reedy From bill at baddogconsulting.com Fri Jul 7 23:37:50 2017 From: bill at baddogconsulting.com (Bill Deegan) Date: Fri, 7 Jul 2017 23:37:50 -0400 Subject: Windows: python3.exe missing In-Reply-To: References: Message-ID: py -3.5 py -3.6 works. Don't know about py -3.6.0 py -3.6.1 On Fri, Jul 7, 2017 at 11:25 PM, Terry Reedy wrote: > On 7/7/2017 5:45 PM, Andrew Pennebaker wrote: > >> Could the Windows installer for Python 3 provide a "python3" command, >> such as a python3.bat or python3.exe file, to help with scripts that rely >> on the interpreter being called "python3"? >> >> The py launcher is somewhat helpful, but a proper python3 runnable is >> preferable. >> > > Not when one has multiple python 3 installations. > > > -- > Terry Jan Reedy > > -- > https://mail.python.org/mailman/listinfo/python-list > From nad at python.org Sat Jul 8 01:22:32 2017 From: nad at python.org (Ned Deily) Date: Sat, 8 Jul 2017 01:22:32 -0400 Subject: [RELEASE] Python 3.6.2rc2 is now available for testing Message-ID: <4B22E08B-9C6C-47F6-B908-CC7676D43B77@python.org> On behalf of the Python development community and the Python 3.6 release team, I would like to announce the availability of Python 3.6.2rc2. 3.6.2rc2 is the second release candidate for Python 3.6.2, the next maintenance release of Python 3.6. 
3.6.2rc2 includes fixes for three security-related issues resolved
since the previous release candidate; see the change log (link below).
While 3.6.2rc2 is a preview release and, thus, not intended for
production environments, we encourage you to explore it and provide
feedback via the Python bug tracker (https://bugs.python.org).

Please see "What's New In Python 3.6" for more information:

https://docs.python.org/3.6/whatsnew/3.6.html

You can find Python 3.6.2rc2 here:

https://www.python.org/downloads/release/python-362rc2/

and its change log here:

https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-2-release-candidate-2

3.6.2 is now planned for final release on 2017-07-17 with the next
maintenance release expected to follow in about 3 months. More
information about the 3.6 release schedule can be found here:

https://www.python.org/dev/peps/pep-0494/

--
Ned Deily
nad at python.org -- []

From blahBlah at blah.org Sat Jul 8 04:48:52 2017
From: blahBlah at blah.org (Sm Chats)
Date: Sat, 8 Jul 2017 08:48:52 +0000 (UTC)
Subject: poplib's POP3_SSL not downloading latest email
Message-ID:

I have a small script which checks whether a mail sent by me to myself
(i.e. delivered to myself) has the same body as the sent message. The
problem is that the sent message (sent by SMTP) isn't received by the
POP3_SSL object I'm using. It shows mails from more than a month ago. I'm
unable to figure out what's wrong. Please help me in correcting the code.
Thanks in advance.

Here's the script: https://github.com/schedutron/CPAP/blob/master/Chap3/myMail.py

Cheers,
Sam

From blahBlah at blah.org Sat Jul 8 04:57:33 2017
From: blahBlah at blah.org (Sam Chats)
Date: Sat, 8 Jul 2017 08:57:33 +0000 (UTC)
Subject: poplib's POP3_SSL not downloading latest email
References:
Message-ID:

On Sat, 8 Jul 2017 08:48:52 +0000 (UTC), Sm Chats wrote:
> I have a small script which checks whether a mail sent by me to myself (i.e. delivered to myself) has the same body as the sent message.
The problem is that the sent message (sent by SMTP) isn't received by the POP3_SSL object I'm using. It shows mails from more than a month ago. I'm unable to figure out what's wrong. Please help me in correcting the code.
> Thanks in advance.
>
> Here's the script: https://github.com/schedutron/CPAP/blob/master/Chap3/myMail.py
>
> Cheers,
> Sam

And yes, POP is enabled for all mail in my gmail settings.

Cheers,
Sam

From steve+python at pearwood.info Sat Jul 8 06:08:12 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Sat, 08 Jul 2017 20:08:12 +1000
Subject: poplib's POP3_SSL not downloading latest email
References:
Message-ID: <5960af0e$0$1597$c3e8da3$5496439d@news.astraweb.com>

On Sat, 8 Jul 2017 06:48 pm, Sm Chats wrote:

> I have a small script which checks whether a mail sent by me to myself (i.e.
> delivered to myself) has the same body as the sent message. The problem is
> that the sent message (sent by SMTP) isn't received by the POP3_SSL object I'm
> using. It shows mails from more than a month ago. I'm unable to figure out
> what's wrong. Please help me in correcting the code. Thanks in advance.
>
> Here's the script:
> https://github.com/schedutron/CPAP/blob/master/Chap3/myMail.py

Are you sure the mail is delivered?

You wait five seconds for the mail to be delivered, but email can take
up to 48 hours (by default) before delivery times out. Maybe it hasn't
arrived yet.

Or maybe it is being blocked as spam. If you are running this script
dozens of times, Gmail might think you are spamming and block delivery.
You should separate the script which sends a new message from the one
that retrieves the message, so you aren't spamming your account and
making Gmail mad.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
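
[Editorial aside: one way to act on the advice above, sketched under
stated assumptions and not taken from the script under discussion.
Instead of sleeping a fixed five seconds, the retrieving side can poll
the mailbox a few times, opening a fresh POP3 session for each attempt.
Only documented poplib calls are used (POP3_SSL, user, pass_, stat, retr,
quit). The function names, the marker argument and the retry defaults are
hypothetical. The sketch also assumes the server numbers messages
oldest-first, so that the highest number is the newest message; that is
common but not guaranteed.]

```python
import poplib
import time


def fetch_newest(host, user, password, connect=poplib.POP3_SSL):
    """Open a fresh POP3 session and return the newest message's raw lines.

    Returns None when the mailbox is empty. `connect` is injectable so the
    function can be exercised without a real server.
    """
    conn = connect(host)
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()  # (message count, mailbox size in bytes)
        if count == 0:
            return None
        # Assumption: the highest message number is the most recent one.
        _resp, lines, _octets = conn.retr(count)  # lines is a list of bytes
        return lines
    finally:
        conn.quit()


def wait_for_marker(host, user, password, marker, attempts=5, delay=30,
                    connect=poplib.POP3_SSL, sleep=time.sleep):
    """Poll until the newest message contains `marker` (bytes), or give up.

    A new session is opened on every attempt. Returns the message lines
    on success and None if the marker never shows up.
    """
    for attempt in range(attempts):
        lines = fetch_newest(host, user, password, connect=connect)
        if lines and any(marker in line for line in lines):
            return lines
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next fresh session
    return None
```

[A sending script would put a unique token in the body and a separate
retrieving script would call something like
wait_for_marker("pop.example.org", user, password, b"that-token"), where
the host name is a placeholder. Keeping the two scripts apart follows the
suggestion above about not re-sending on every run.]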
From blahBlah at blah.org Sat Jul 8 07:54:23 2017
From: blahBlah at blah.org (Sam Chats)
Date: Sat, 8 Jul 2017 11:54:23 +0000 (UTC)
Subject: poplib's POP3_SSL not downloading latest email
References: <5960af0e$0$1597$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Sat, 08 Jul 2017 20:08:12 +1000, Steve D'Aprano wrote:
> On Sat, 8 Jul 2017 06:48 pm, Sm Chats wrote:
>
> > I have a small script which checks whether a mail sent by me to myself (i.e.
> > delivered to myself) has the same body as the sent message. The problem is
> > that the sent message (sent by SMTP) isn't received by the POP3_SSL object I'm
> > using. It shows mails from more than a month ago. I'm unable to figure out
> > what's wrong. Please help me in correcting the code. Thanks in advance.
> >
> > Here's the script:
> > https://github.com/schedutron/CPAP/blob/master/Chap3/myMail.py
>
> Are you sure the mail is delivered?
>
> You wait five seconds for the mail to be delivered, but email can take up to 48
> hours (by default) before delivery times out. Maybe it hasn't arrived yet.
>
> Or maybe it is being blocked as spam. If you are running this script dozens of
> times, Gmail might think you are spamming and block delivery. You should
> separate the script which sends a new message from the one that retrieves the
> message, so you aren't spamming your account and making Gmail mad.
>
>
>
>
> --
> Steve
> “Cheer up,” they said, “things could be worse.” So I cheered up, and sure
> enough, things got worse.
>

Thanks, but I can see my sent mails on the Gmail app on my phone. So the
emails are indeed getting delivered. The problem is that POP shows me
emails from about a month ago, and I've received potentially hundreds of
emails after that. So it's a receiving problem.
Cheers, Sam From phyllisaaa0514 at gmail.com Sat Jul 8 18:37:55 2017 From: phyllisaaa0514 at gmail.com (phyllisaaa0514 at gmail.com) Date: Sat, 8 Jul 2017 15:37:55 -0700 (PDT) Subject: Case Solution: Federal Bank Dividend Discount Valuation by Debasish Maitra, Varun Dawar In-Reply-To: <674a561f-2757-4150-ae7c-b63c49920c2c@googlegroups.com> References: <674a561f-2757-4150-ae7c-b63c49920c2c@googlegroups.com> Message-ID: <5efdab58-d4d5-471f-9c64-b469090b6f9d@googlegroups.com> On Friday, 7 July 2017 20:16:42 UTC-7, Case Solution & Analysis wrote: > Case Solution and Analysis of Federal Bank: Dividend Discount Valuation by Debasish Maitra, Varun Dawar is available at a lowest price, send email to casesolutionscentre(at)gmail(dot)com > > Case Study ID: 9B17N005 / W17123 > > Get Case Study Solution and Analysis of Federal Bank: Dividend Discount Valuation in a FAIR PRICE!! > > Our e-mail address is CASESOLUTIONSCENTRE (AT) GMAIL (DOT) COM. Please replace (at) by @ and (dot) by . > > YOU MUST WRITE FOLLOWING WHILE PLACING YOUR ORDER: > Complete Case Study Name > Authors > Case Study ID > Publisher of Case Study > Your Requirements / Case Questions > > Note: Do not REPLY to this post because we do not reply to posts here. If you need any Case Solution please send us an email. We can help you to get it. 
From flebber.crue at gmail.com Sat Jul 8 19:25:16 2017
From: flebber.crue at gmail.com (Sayth Renshaw)
Date: Sat, 8 Jul 2017 16:25:16 -0700 (PDT)
Subject: Test 0 and false since false is 0
In-Reply-To:
References:
Message-ID: <14ad47d9-630c-4722-b68c-4dabb17350cd@googlegroups.com>

> Another option is to test for type(value) == int:
>
> >>> before = ["a",0,0,"b",None,"c","d",0,1,False,0,1,0,3,[],0,1,9,0,0,
> {},0,0,9]
> >>> wanted = ["a","b",None,"c","d",1,False,1,3,[],1,9,
> {},9,0,0,0,0,0,0,0,0,0,0]
> >>> after = sorted(before, key=lambda x: x == 0 and type(x) == int)
> >>> assert str(after) == str(wanted)
> >>> after
> ['a', 'b', None, 'c', 'd', 1, False, 1, 3, [], 1, 9, {}, 9, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0]
>
>
> That way float values will be left alone, too:
>
> >>> sorted([0.0, 0, False, [], "x"], key=lambda x: x == 0 and type(x) ==
> int)
> [0.0, False, [], 'x', 0]

I have been reading this solution

> >>> after = sorted(before, key=lambda x: x == 0 and type(x) == int)

It is really good, however I don't understand it enough to reimplement
something like that myself yet.

Though I can see that the lambda tests for a 0 that is an int, why does
sorted put them at the end?

Cheers

Sayth

From pderocco at ix.netcom.com Sat Jul 8 22:36:18 2017
From: pderocco at ix.netcom.com (Paul D. DeRocco)
Date: Sat, 8 Jul 2017 19:36:18 -0700
Subject: Test 0 and false since false is 0
In-Reply-To: <14ad47d9-630c-4722-b68c-4dabb17350cd@googlegroups.com>
References: <14ad47d9-630c-4722-b68c-4dabb17350cd@googlegroups.com>
Message-ID:

> From: Sayth Renshaw
>
> I have been reading this solution
>
> >>> after = sorted(before, key=lambda x: x == 0 and type(x) == int)
>
> it is really good, however I don't understand it enough to
> reimplement something like that myself yet.
>
> Though I can see that the lambda tests for a 0 that is an int,
> why does sorted put them at the end?
Because the expression "x == 0 and type(x) == int" has a value of either
False or True, and it sorts all the False values before the True values,
leaving the order within those sets unchanged.

That said, "x is 0" is even simpler.

--

Ciao,               Paul D. DeRocco
Paul                mailto:pderocco at ix.netcom.com

From timetowalk11 at gmail.com Sun Jul 9 19:39:44 2017
From: timetowalk11 at gmail.com (timetowalk11 at gmail.com)
Date: Sun, 9 Jul 2017 16:39:44 -0700 (PDT)
Subject: What's with all of the Case Solution and Test Bank nonsense posts?
Message-ID: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com>

I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts.

What's with all of the Case Solution and Test Bank nonsense posts?
Is it possible to have these posts filtered out?

From skip.montanaro at gmail.com Sun Jul 9 19:52:56 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Sun, 9 Jul 2017 18:52:56 -0500
Subject: What's with all of the Case Solution and Test Bank nonsense posts?
In-Reply-To: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com>
References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com>
Message-ID:

> I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts.
>
> What's with all of the Case Solution and Test Bank nonsense posts?
> Is it possible to have these posts filtered out?

You might be able to set up filters in Google Groups, but nobody on
the mailing list side of things can do anything on the Usenet side of
things.

Skip

From torriem at gmail.com Sun Jul 9 19:55:49 2017
From: torriem at gmail.com (Michael Torrie)
Date: Sun, 9 Jul 2017 17:55:49 -0600
Subject: What's with all of the Case Solution and Test Bank nonsense posts?
In-Reply-To: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> On 07/09/2017 05:39 PM, timetowalk11 at gmail.com wrote: > I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts. > > What's with all of the Case Solution and Test Bank nonsense posts? > Is is possible to have these posts filtered out? I'm sure Google could filter them if it chose. Behind the group, though, is the Usenet newsgroup, which is unmoderated and decentralized, and you can't filter there. That's why most people read Usenet with a good news reader app that can apply filtering and kill rules for you. And of course the mailing list side of things does filter out messages and kills certain points. Probably your cleanest experience will come through the mailing list. Use an email folder and a filter to place all list serv emails in a folder and you'll get a good experience. You can even do this with Gmail. My gmail account has all my list serv messages sorted out on the Gmail server into labels, and I have a special rule to killfile messages from some sources. From timetowalk11 at gmail.com Sun Jul 9 20:59:20 2017 From: timetowalk11 at gmail.com (timetowalk) Date: Sun, 9 Jul 2017 17:59:20 -0700 (PDT) Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> Message-ID: On Sunday, July 9, 2017 at 8:05:59 PM UTC-4, Michael Torrie wrote: > On 07/09/2017 05:39 PM, timetowalk11 at gmail.com wrote: > > I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts. > > > > What's with all of the Case Solution and Test Bank nonsense posts? > > Is is possible to have these posts filtered out? > > I'm sure Google could filter them if it chose. 
Behind the group, > though, is the Usenet newsgroup, which is unmoderated and decentralized, > and you can't filter there. That's why most people read Usenet with a > good news reader app that can apply filtering and kill rules for you. > > And of course the mailing list side of things does filter out messages > and kills certain points. Probably your cleanest experience will come > through the mailing list. Use an email folder and a filter to place all > list serv emails in a folder and you'll get a good experience. You can > even do this with Gmail. My gmail account has all my list serv messages > sorted out on the Gmail server into labels, and I have a special rule to > killfile messages from some sources. I will need to read about filtering messages locally. Can the admin simply ban the user? From voteswithfeet at gmail.com Sun Jul 9 21:41:27 2017 From: voteswithfeet at gmail.com (voteswithfeet at gmail.com) Date: Sun, 9 Jul 2017 18:41:27 -0700 (PDT) Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> Message-ID: <2602b0bd-f357-4973-b3dc-12e63cf66e7d@googlegroups.com> On Sunday, July 9, 2017 at 7:59:45 PM UTC-5, timetowalk wrote: > On Sunday, July 9, 2017 at 8:05:59 PM UTC-4, Michael Torrie wrote: > > On 07/09/2017 05:39 PM, timetowalk11 at gmail.com wrote: > > > I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts. > > > > > > What's with all of the Case Solution and Test Bank nonsense posts? > > > Is is possible to have these posts filtered out? > > > > I'm sure Google could filter them if it chose. Behind the group, > > though, is the Usenet newsgroup, which is unmoderated and decentralized, > > and you can't filter there. That's why most people read Usenet with a > > good news reader app that can apply filtering and kill rules for you. 
> > > > And of course the mailing list side of things does filter out messages > > and kills certain points. Probably your cleanest experience will come > > through the mailing list. Use an email folder and a filter to place all > > list serv emails in a folder and you'll get a good experience. You can > > even do this with Gmail. My gmail account has all my list serv messages > > sorted out on the Gmail server into labels, and I have a special rule to > > killfile messages from some sources. > > I will need to read about filtering messages locally. > Can the admin simply ban the user? There is no admin, you idiot. This is Usenet! From torriem at gmail.com Sun Jul 9 21:42:10 2017 From: torriem at gmail.com (Michael Torrie) Date: Sun, 9 Jul 2017 19:42:10 -0600 Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> Message-ID: On 07/09/2017 06:59 PM, timetowalk wrote: > I will need to read about filtering messages locally. > Can the admin simply ban the user? Usenet[1] doesn't have an admin. It's generally unmoderated and decentralized (no central server). Anyone can run a Usenet server and contribute messages to the global group. In the old days every Uni had it's own Usenet server and we had a lot of great global communication on the groups. However now I think the Python community would be well served to just sever the Usenet connection entirely and just use the mailing list. The mailing list does ban people from posting from time to time. [1] https://en.wikipedia.org/wiki/Usenet From steve+python at pearwood.info Sun Jul 9 21:52:54 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Mon, 10 Jul 2017 11:52:54 +1000 Subject: What's with all of the Case Solution and Test Bank nonsense posts? 
References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> <2602b0bd-f357-4973-b3dc-12e63cf66e7d@googlegroups.com>
Message-ID: <5962ddf7$0$1601$c3e8da3$5496439d@news.astraweb.com>

On Mon, 10 Jul 2017 11:41 am, voteswithfeet at gmail.com wrote:

> On Sunday, July 9, 2017 at 7:59:45 PM UTC-5, timetowalk wrote:
>> I will need to read about filtering messages locally.
>> Can the admin simply ban the user?
>
> There is no admin, you idiot. This is Usenet!

That's unnecessarily rude, and factually incorrect.

This is also a mailing list where there is an administrator.

The mailing list came first, and was mirrored to the Usenet group.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From voteswithfeet at gmail.com Sun Jul 9 22:37:06 2017
From: voteswithfeet at gmail.com (voteswithfeet at gmail.com)
Date: Sun, 9 Jul 2017 19:37:06 -0700 (PDT)
Subject: What's with all of the Case Solution and Test Bank nonsense posts?
In-Reply-To: <5962ddf7$0$1601$c3e8da3$5496439d@news.astraweb.com>
References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> <2602b0bd-f357-4973-b3dc-12e63cf66e7d@googlegroups.com> <5962ddf7$0$1601$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Sunday, July 9, 2017 at 8:53:18 PM UTC-5, Steve D'Aprano wrote:
> On Mon, 10 Jul 2017 11:41 am, voteswithfeet at gmail.com wrote:
>
> > On Sunday, July 9, 2017 at 7:59:45 PM UTC-5, timetowalk wrote:
>
> >> I will need to read about filtering messages locally.
> >> Can the admin simply ban the user?
> >
> > There is no admin, you idiot. This is Usenet!
>
> That's unnecessarily rude,

I only meant it as friendly banter. But you're right. Next time I should
use a less offensive word like twit.
From jblack at nopam.com Sun Jul 9 22:47:58 2017 From: jblack at nopam.com (John Black) Date: Sun, 9 Jul 2017 21:47:58 -0500 Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: In article <477bde19-0653-4e41-a717-0efe90ac5756 at googlegroups.com>, timetowalk11 at gmail.com says... > > I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts. > > What's with all of the Case Solution and Test Bank nonsense posts? > Is is possible to have these posts filtered out? Yes it very easy to filter these out with most usenet readers. Use one that lets you setup rules for what gets automatically discarded. I've added these rules to my discard list: >From contains "Case Solution" Or >From contains "Test Banks" Poof. All gone. While you're at it, throw these rules in and the group will appear very clean and on topic. Subject contains "PEDOFILO" Or Subject contains "MAI" Or Subject contains "SEGRETO" Or Subject contains "SETTA" Or Subject contains "BAMBINI" Or Subject contains "FIGLIO" Or Subject contains "PAOLO" Or Subject contains "NATALE" Or Subject contains "SONO" Or Subject contains "GRAZIA" Or Subject contains "PORNOSTAR" Or Subject contains "PEZZO" Or Subject contains "MERDA" Or Subject contains "CAZZO" Or Subject contains "GALERA" Or Subject contains "SICARIO" Or Subject contains "ESSERE" Or Subject contains "CRIMINALE" Or Subject contains "LECCA" Or Subject contains "COCAINA" Or Subject contains "LESBICA" Or Subject contains "NESSUNO" Or Subject contains "MAFIOSO" Or Subject contains "BERLUSCONI" Or Subject contains "????" 
Or Subject contains "HARDCORE" Or Subject contains "PEDERASTA" Or Subject contains "CULO" Or Subject contains "NOSTRA" Or Subject contains "FOGLIO" Or Subject contains "USARE" Or Subject contains "FAMIGLIA" Or Subject contains "FECE" Or Subject contains "CAPO" Or Subject contains "SUICIDARE" Or Subject contains "OGNI" Or Subject contains "CANE" Or Subject contains "MERCATO" Or Subject contains "VOLTA" Or Subject contains "MAFIOSA" Or Subject contains "ALMENO" Or Subject contains "BASTARDO" Or Subject contains "FIGLIA" Or Subject contains "BASTARD" Or Subject contains "CRIMINAL" Or Subject contains "ANNI" Or Subject contains "PEDINA" John Black From chris_roysmith at internode.on.net Sun Jul 9 23:29:28 2017 From: chris_roysmith at internode.on.net (Chris Roy-Smith) Date: 10 Jul 2017 03:29:28 GMT Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> On Sun, 09 Jul 2017 21:47:58 -0500, John Black wrote: > In article <477bde19-0653-4e41-a717-0efe90ac5756 at googlegroups.com>, > timetowalk11 at gmail.com says... >> >> I use https://groups.google.com/forum/#!forum/comp.lang.python to look >> over message posts. >> >> What's with all of the Case Solution and Test Bank nonsense posts? >> Is is possible to have these posts filtered out? > > Yes it very easy to filter these out with most usenet readers. Use one > that lets you setup rules for what gets automatically discarded. I've > added these rules to my discard list: > > From contains "Case Solution" > Or From contains "Test Banks" > > Poof. All gone. > > > John Black can you get a newsreader to work with a https news service? From torriem at gmail.com Sun Jul 9 23:32:33 2017 From: torriem at gmail.com (Michael Torrie) Date: Sun, 9 Jul 2017 21:32:33 -0600 Subject: What's with all of the Case Solution and Test Bank nonsense posts? 
In-Reply-To: <5962ddf7$0$1601$c3e8da3$5496439d@news.astraweb.com> References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> <2602b0bd-f357-4973-b3dc-12e63cf66e7d@googlegroups.com> <5962ddf7$0$1601$c3e8da3$5496439d@news.astraweb.com> Message-ID: <22688672-26d7-2e74-4454-ed0eae824f1d@gmail.com> On 07/09/2017 07:52 PM, Steve D'Aprano wrote: > This is also a mailing list where there is an administrator. > > The mailing list came first, and was mirrored to the Usenet group. Good point. So messages that get emailed to the mailing list are filtered before they get sent out to the Usenet group. And messages from the Usenet group can be filtered by the mailing list admins before they are copied back to the mailing list. But messages originating on Usenet by definition are visible to all Usenet folk. So to answer the OP's question, messages originating on Usenet cannot be filtered by a mailing list admin before they are visible to other folks on Usenet. From no.email at nospam.invalid Sun Jul 9 23:39:08 2017 From: no.email at nospam.invalid (Paul Rubin) Date: Sun, 09 Jul 2017 20:39:08 -0700 Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: <871sppezab.fsf@nightsong.com> timetowalk11 at gmail.com writes: > What's with all of the Case Solution and Test Bank nonsense posts? > Is is possible to have these posts filtered out? As people have said, you can block them with a good news reader. But like you, I wonder why the heck they have camped out on this particular newsgroup. 
From dieter at handshake.de Mon Jul 10 01:54:09 2017 From: dieter at handshake.de (dieter) Date: Mon, 10 Jul 2017 07:54:09 +0200 Subject: poplib's POP3_SSL not downloading latest email References: Message-ID: <87y3rwdegu.fsf@handshake.de> Sm Chats writes: > I have a small script which checks whether a mail sent by me to myself(i.e delivered to myself) has the same body as the sent message. The problem is that the sent message (sent by SMTP) isn't received by the POP3_SSL object I'm using. This is likely not a problem with Python or its "poplib": they are only itermediaries between your application and the pop server -- without special logic that may hide some messages. One thing you can check: "poplib" uses a connection to the pop server; it is quite possible that this connection does not see new messages arrived after the connection was opened. Thus, create a new POP object to check for new messages. From wanderer at dialup4less.com Mon Jul 10 07:18:35 2017 From: wanderer at dialup4less.com (Wanderer) Date: Mon, 10 Jul 2017 04:18:35 -0700 (PDT) Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> Message-ID: On Sunday, July 9, 2017 at 8:59:45 PM UTC-4, timetowalk wrote: > On Sunday, July 9, 2017 at 8:05:59 PM UTC-4, Michael Torrie wrote: > > On 07/09/2017 05:39 PM, timetowalk11 at gmail.com wrote: > > > I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts. > > > > > > What's with all of the Case Solution and Test Bank nonsense posts? > > > Is is possible to have these posts filtered out? > > > > I'm sure Google could filter them if it chose. Behind the group, > > though, is the Usenet newsgroup, which is unmoderated and decentralized, > > and you can't filter there. 
That's why most people read Usenet with a > > good news reader app that can apply filtering and kill rules for you. > > > > And of course the mailing list side of things does filter out messages > > and kills certain points. Probably your cleanest experience will come > > through the mailing list. Use an email folder and a filter to place all > > list serv emails in a folder and you'll get a good experience. You can > > even do this with Gmail. My gmail account has all my list serv messages > > sorted out on the Gmail server into labels, and I have a special rule to > > killfile messages from some sources. > > I will need to read about filtering messages locally. > Can the admin simply ban the user? I wrote this Python script to locally ban (block, kill file, plonk, whatever you want to call it) authors and posted it here a little while back. This way I don't have to turn on javascript to look at the google groups list. https://groups.google.com/forum/?_escaped_fragment_=msg/comp.lang.python/EKRYfj06OeA/RnpM6stNAwAJ#!msg/comp.lang.python/EKRYfj06OeA/RnpM6stNAwAJ

From mscs15059 at itu.edu.pk Mon Jul 10 07:47:34 2017 From: mscs15059 at itu.edu.pk (mscs15059 at itu.edu.pk) Date: Mon, 10 Jul 2017 04:47:34 -0700 (PDT) Subject: AUCPR of individual features using Random Forest (Error: unhashable Type) Message-ID: <74e5ff91-82b4-406c-8269-8b4adba970d3@googlegroups.com>

I have a data set of 19 features (V1 to V19) and one class label (C1). I can easily get the precision-recall value of all variables with the class label, but I want the AUCPR of each individual feature with the class label. The data is in this form:

V1      V2       V3      V4   V5  V6  V7  V8  V9  V10  V11  V12  V13  V14  V15  V16  V17  V18     V19          C1
4182    4182     4182    1    2   0   0   0   4   1    1    0    5    0    1    1    24   4.4654  28.18955043  1
11396   3798.6   3825    3    1   0   1   0   0   3    3    1    0    1    1    3    5    4.452   11.90765492  0
60416   5034.66  5393.5  12   1   0   0   0   0   12   12   3    6    1    4    12   2    4.4711  35.11543135  0
34580   4940     5254    7    1   4   0   2   0   10   12   8    0    1    1    10   45   4.4689  32.44228433  1
8667    4333.5   4333.5  2    1   0   1   0   0   2    2    1    0    1    0    2    1    4.4659  28.79708384  0
4011    4011     4011    1    1   30  0   0   0   2    2    1    8    1    0    2    1    4.4634  25.75941677  0
691347  5083.43  5300    136  2   0   0   0   9   44   44   12   0    1    12   44   32   4.4693  32.92831106  1

So far I have done this:

    from collections import defaultdict
    from sklearn.cross_validation import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    import pandas as pd
    import numpy as np
    from sklearn.metrics import average_precision_score

    mydata = pd.read_csv("TEST_2.csv")
    y = mydata["C1"]  #provided your csv has header row, and the label column is named "Label"
    ##select all but the last column as data
    X = mydata.ix[:,:-1]
    X=X.iloc[:,:]
    names = X.iloc[:,:].columns.tolist()

    # -- Gridsearched parameters
    model_rf = RandomForestClassifier(n_estimators=500, class_weight="auto", criterion='gini', bootstrap=True, max_features=10, min_samples_split=1, min_samples_leaf=6, max_depth=3, n_jobs=-1)

    scores = defaultdict(list)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)

    # -- Fit the model (could be cross-validated)
    for i in range(X_train.shape[1]):
        X_t = X_test.copy()
        rf = model_rf.fit(X_train[:,i], y_train)
        scores[names[i]] = average_precision_score(y_test, rf.predict(X_t[:,i]))

    print("Features sorted by their score:")
    print(sorted([(round(np.mean(score), 4), feat) for feat, score in scores.items()], reverse=True))

It gives the error "unhashable type".

The output should be something like this:

V1: 0.82
V2: 0.74
:
:
V19: 0.55

From steve+python at pearwood.info Mon Jul 10 08:10:37 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Mon, 10 Jul 2017 22:10:37 +1000 Subject: AUCPR of individual features using Random Forest (Error: unhashable Type) References: <74e5ff91-82b4-406c-8269-8b4adba970d3@googlegroups.com> Message-ID: <59636ec0$0$1621$c3e8da3$5496439d@news.astraweb.com> Hi mscs15059 and welcome!
(If you'd rather be known by a friendlier name, you can either sign your messages at the end, or configure your email or news software to show your name.) On Mon, 10 Jul 2017 09:47 pm, mscs15059 at itu.edu.pk wrote: > I have a data set of 19 features (v1---v19) and one class label (c1) , I can > eaily get the precision recall value of all variables with the class label, > but I want the AUCPR of individual features with the class label The data is > in this form Unfortunately, while we're Python experts here, we're not necessarily pandas experts. I have no idea what AUCPR means. [...] > It gives the error unhashable type What gives the error? Please COPY and PASTE (don't retype from memory) the full error, starting with the line that begins with "Traceback", to the end. That way we can see the exact error you are getting, and more importantly, which line of code is producing it. Then we can start making suggestions for how to fix the problem. -- Steve "Cheer up," they said, "things could be worse." So I cheered up, and sure enough, things got worse. From larry.martell at gmail.com Mon Jul 10 08:18:07 2017 From: larry.martell at gmail.com (Larry Martell) Date: Mon, 10 Jul 2017 08:18:07 -0400 Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> <2602b0bd-f357-4973-b3dc-12e63cf66e7d@googlegroups.com> <5962ddf7$0$1601$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Sun, Jul 9, 2017 at 10:37 PM, wrote: > On Sunday, July 9, 2017 at 8:53:18 PM UTC-5, Steve D'Aprano wrote: >> On Mon, 10 Jul 2017 11:41 am, voteswithfeet at gmail.com wrote: >> >> > On Sunday, July 9, 2017 at 7:59:45 PM UTC-5, timetowalk wrote: >> >> >> I will need to read about filtering messages locally. >> >> Can the admin simply ban the user? >> > >> > There is no admin, you idiot. This is Usenet!
>> >> That's unnecessarily rude, > > I only meant it as friendly banter. But you're right. Next time I should use a less offensive word like twit. If you really wanted to be less rude you would have said 'upper class twit' From jon+usenet at unequivocal.eu Mon Jul 10 09:10:45 2017 From: jon+usenet at unequivocal.eu (Jon Ribbens) Date: Mon, 10 Jul 2017 13:10:45 -0000 (UTC) Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: On 2017-07-10, John Black wrote: > While you're at it, throw these rules in and the group will appear very > clean and on topic. > > Subject contains "PEDOFILO" > Or > Subject contains "MAI" > Or > Subject contains "SEGRETO" [snip >100 lines of rules] Or just "subject does not contain any lower-case letters". From jon+usenet at unequivocal.eu Mon Jul 10 09:11:31 2017 From: jon+usenet at unequivocal.eu (Jon Ribbens) Date: Mon, 10 Jul 2017 13:11:31 -0000 (UTC) Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <480c78bc-8f43-d31b-3381-977cf5fe5b8e@gmail.com> Message-ID: On 2017-07-09, Michael Torrie wrote: > I'm sure Google could filter them if it chose. Behind the group, > though, is the Usenet newsgroup, which is unmoderated and decentralized, > and you can't filter there. ... unless the group were changed to be moderated, which it really ought to be, being a mirror of a list. From nigel at cresset-group.com Mon Jul 10 09:24:58 2017 From: nigel at cresset-group.com (Nigel Palmer) Date: Mon, 10 Jul 2017 13:24:58 +0000 Subject: Compiling Python 3.6.1 on macOS 10.12.5 Message-ID: Hi I am trying to compile Python 3.6.1 on macOS 10.12.5 with xcode 8.8.3 using the instructions at https://docs.python.org/devguide/setup.html#build-dependencies but I am getting the error ./python.exe -E -S -m sysconfig --generate-posix-vars ;\ if test $? 
-ne 0 ; then \ echo "generate-posix-vars failed" ; \ rm -f ./pybuilddir.txt ; \ exit 1 ; \ fi /bin/sh: line 1: 96973 Killed: 9 ./python.exe -E -S -m sysconfig --generate-posix-vars generate-posix-vars failed make: *** [pybuilddir.txt] Error 1 When I manually run that command using dbll I get: lldb ./python.exe -- -E -S -m sysconfig --generate-posix-vars (lldb) target create "./python.exe" Current executable set to './python.exe' (x86_64). (lldb) settings set -- target.run-args "-E" "-S" "-m" "sysconfig" "--generate-posix-vars" (lldb) r Process 96978 launched: './python.exe' (x86_64) Could not find platform dependent libraries Consider setting $PYTHONHOME to [:] Process 96978 exited with status = 0 (0x00000000) (lldb) The commands I ran to configure and build python are: brew install openssl xz CPPFLAGS="-I$(brew --prefix openssl)/include" LDFLAGS="-L$(brew --prefix openssl)/lib" ./configure --prefix=`pwd`/../build make Any ideas on what I am doing wrong? Many Thanks Nigel From torriem at gmail.com Mon Jul 10 10:41:12 2017 From: torriem at gmail.com (Michael Torrie) Date: Mon, 10 Jul 2017 08:41:12 -0600 Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> Message-ID: <695d0f21-988a-5312-7551-df57ff754e09@gmail.com> On 07/09/2017 09:29 PM, Chris Roy-Smith wrote: > can you get a newsreader to work with a https news service? No. A newsreader works with NNTP protocol. Some Usenet servers offer SSL support over port 443, but that's certainly not https. I've never heard of an https news service before. Unless you're talking about a web-based newsgroups reader, such as google groups. In the latter case you're completely at the mercy of whatever features Google sees fit to throw your way. 
From jblack at nopam.com Mon Jul 10 11:10:24 2017 From: jblack at nopam.com (John Black) Date: Mon, 10 Jul 2017 10:10:24 -0500 Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: In article , jon+usenet at unequivocal.eu says... > > On 2017-07-10, John Black wrote: > > While you're at it, throw these rules in and the group will appear very > > clean and on topic. > > > > Subject contains "PEDOFILO" > > Or > > Subject contains "MAI" > > Or > > Subject contains "SEGRETO" > [snip >100 lines of rules] > > Or just "subject does not contain any lower-case letters". That is probably ok, but I was worried some legit posts would get caught in that. John Black From amka1791 at gmail.com Mon Jul 10 11:31:30 2017 From: amka1791 at gmail.com (amka1791 at gmail.com) Date: Mon, 10 Jul 2017 08:31:30 -0700 (PDT) Subject: ezdxf type of spline Message-ID: <75808a80-10bc-486c-b7bd-a3b4e7ca68fc@googlegroups.com> Hi, Can someone says please to me which kind are the splines of the ezdxf python module ? Is it bezier curves ? Thanks, dylan From jon+usenet at unequivocal.eu Mon Jul 10 12:09:50 2017 From: jon+usenet at unequivocal.eu (Jon Ribbens) Date: Mon, 10 Jul 2017 16:09:50 -0000 (UTC) Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: On 2017-07-10, John Black wrote: > In article , > jon+usenet at unequivocal.eu says... >> On 2017-07-10, John Black wrote: >> > While you're at it, throw these rules in and the group will appear very >> > clean and on topic. >> > >> > Subject contains "PEDOFILO" >> > Or >> > Subject contains "MAI" >> > Or >> > Subject contains "SEGRETO" >> [snip >100 lines of rules] >> >> Or just "subject does not contain any lower-case letters". > > That is probably ok, but I was worried some legit posts would get caught > in that. 
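The one-line rule quoted above ("subject does not contain any lower-case letters") can be sketched in Python roughly as follows. This is only an illustration, assuming the subject is already available as a plain string; `is_all_caps_subject` is a hypothetical name and not part of any newsreader's filter syntax.

```python
def is_all_caps_subject(subject):
    """Return True when the subject contains no lower-case letters."""
    return not any(ch.islower() for ch in subject)


# Spam-style subject: nothing in lower case, so the rule matches.
print(is_all_caps_subject("MAI SEGRETO"))                # True
# Ordinary thread title: contains lower-case letters, so it is kept.
print(is_all_caps_subject("Re: ezdxf type of spline"))   # False
# A short legitimate subject with no lower-case letters is matched too,
# which is the kind of false positive the quoted worry is about.
print(is_all_caps_subject("PEP 8"))                      # True
```

The last call shows the trade-off: the rule is far shorter than a list of keywords, but any legitimate subject written entirely in capitals and digits is caught as well.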
I don't think that rule would block too many reasonable posts. On the other hand, I also block everything from Google Groups, and, as of recently, everything from netfront.net, so I don't actually worry too much about false positives in this group, given it contains so much trash. From grant.b.edwards at gmail.com Mon Jul 10 12:10:59 2017 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Mon, 10 Jul 2017 16:10:59 +0000 (UTC) Subject: Test 0 and false since false is 0 References: <14ad47d9-630c-4722-b68c-4dabb17350cd@googlegroups.com> Message-ID: On 2017-07-09, Paul D. DeRocco wrote: >> From: Sayth Renshaw >> >> I have been reading this solution >> > >>> after = sorted(before, key=lambda x: x == 0 and type(x) == int) >> >> it is really good, however I don't understand it enough to >> reimplement something like that myself yet. >> >> Though I can that lambda tests for 0 that is equal to an int >> why does sorted put them to the end? > > Because the expression "x == 0 and type(x) == int" has a value of either > False or True, and it sorts all the False values before the True values, > leaving the order within those sets unchanged. > > That said, "x is 0" is even simpler. And wrong. Two equivalent integer objects _might_ be the same object, but that's not guaranteed. It's an _implementation_detail_ of CPython that small integers are cached: >>> x = 0 >>> x is 0 True But larger integers aren't: >>> a = 123412341234 >>> a is 123412341234 False The first example could have returned False and been correct. -- Grant Edwards grant.b.edwards Yow! TONY RANDALL! Is YOUR at life a PATIO of FUN?? gmail.com From phamp at mindspring.com Mon Jul 10 13:02:46 2017 From: phamp at mindspring.com (pyotr filipivich) Date: Mon, 10 Jul 2017 10:02:46 -0700 Subject: What's with all of the Case Solution and Test Bank nonsense posts? 
References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> Message-ID: <7fc7mcpboqlvdao7iunq3tejicrllbn7m4@4ax.com> timetowalk11 at gmail.com on Sun, 9 Jul 2017 16:39:44 -0700 (PDT) typed in comp.lang.python the following: >I use https://groups.google.com/forum/#!forum/comp.lang.python to look over message posts. > >What's with all of the Case Solution and Test Bank nonsense posts? >Is is possible to have these posts filtered out? Kill files are your friend. marking all from author: allcasesolutions at gmail.com "Read" works for me. you could delete them. Unfortunately, how to get Google Groups to behave properly is another story entire. -- pyotr filipivich Next month's Panel: Graft - Boon or blessing? From no.email at nospam.invalid Mon Jul 10 13:05:03 2017 From: no.email at nospam.invalid (Paul Rubin) Date: Mon, 10 Jul 2017 10:05:03 -0700 Subject: What's with all of the Case Solution and Test Bank nonsense posts? References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> <695d0f21-988a-5312-7551-df57ff754e09@gmail.com> Message-ID: <87lgnwdxz4.fsf@nightsong.com> Michael Torrie writes: >> can you get a newsreader to work with a https news service? > No. A newsreader works with NNTP protocol. Traditionally NNTP over SSL was done on port 563. Some feeds now also provide it on 443 to get around client-side firewall hassles. From torriem at gmail.com Mon Jul 10 14:12:20 2017 From: torriem at gmail.com (Michael Torrie) Date: Mon, 10 Jul 2017 12:12:20 -0600 Subject: What's with all of the Case Solution and Test Bank nonsense posts? 
In-Reply-To: <87lgnwdxz4.fsf@nightsong.com> References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> <695d0f21-988a-5312-7551-df57ff754e09@gmail.com> <87lgnwdxz4.fsf@nightsong.com> Message-ID: On 07/10/2017 11:05 AM, Paul Rubin wrote: > Michael Torrie writes: >>> can you get a newsreader to work with a https news service? >> No. A newsreader works with NNTP protocol. > > Traditionally NNTP over SSL was done on port 563. Some feeds now also > provide it on 443 to get around client-side firewall hassles. Yes but that's not via https. From rosuav at gmail.com Mon Jul 10 15:00:53 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 11 Jul 2017 05:00:53 +1000 Subject: What's with all of the Case Solution and Test Bank nonsense posts? In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> <695d0f21-988a-5312-7551-df57ff754e09@gmail.com> <87lgnwdxz4.fsf@nightsong.com> Message-ID: On Tue, Jul 11, 2017 at 4:12 AM, Michael Torrie wrote: > On 07/10/2017 11:05 AM, Paul Rubin wrote: >> Michael Torrie writes: >>>> can you get a newsreader to work with a https news service? >>> No. A newsreader works with NNTP protocol. >> >> Traditionally NNTP over SSL was done on port 563. Some feeds now also >> provide it on 443 to get around client-side firewall hassles. > > Yes but that's not via https. What's the meaning of "https news service" then? If it's netnews, it's NNTP, not HTTP. If it just happens to be a web app that carries information from a newsgroup, then that's not a news service, it's a web forum, and there's no such thing as a generic web forum API. ChrisA From petef4+usenet at gmail.com Mon Jul 10 16:06:51 2017 From: petef4+usenet at gmail.com (Pete Forman) Date: Mon, 10 Jul 2017 21:06:51 +0100 Subject: What's with all of the Case Solution and Test Bank nonsense posts? 
References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> <695d0f21-988a-5312-7551-df57ff754e09@gmail.com> <87lgnwdxz4.fsf@nightsong.com> Message-ID: Chris Angelico writes: > On Tue, Jul 11, 2017 at 4:12 AM, Michael Torrie wrote: >> On 07/10/2017 11:05 AM, Paul Rubin wrote: >>> Michael Torrie writes: >>>>> can you get a newsreader to work with a https news service? >>>> No. A newsreader works with NNTP protocol. >>> >>> Traditionally NNTP over SSL was done on port 563. Some feeds now >>> also provide it on 443 to get around client-side firewall hassles. >> >> Yes but that's not via https. > > What's the meaning of "https news service" then? If it's netnews, it's > NNTP, not HTTP. If it just happens to be a web app that carries > information from a newsgroup, then that's not a news service, it's a > web forum, and there's no such thing as a generic web forum API. RFC 4642 (updated by RFC 8143) describes the use of TLS with NNTP. It enhances the connection between NNTP client and server, primarily with encryption but optionally with other benefits. Of course it does nothing to improve the content of Usenet. -- Pete Forman

From vinay_sajip at yahoo.co.uk Mon Jul 10 17:40:56 2017 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Mon, 10 Jul 2017 21:40:56 +0000 (UTC) Subject: ANN: A new version (0.4.1) of python-gnupg has been released. References: <1241259385.3956741.1499722856981.ref@mail.yahoo.com> Message-ID: <1241259385.3956741.1499722856981@mail.yahoo.com>

A new version of the Python module which wraps GnuPG has been released.

What Changed?
=============

This is an enhancement and bug-fix release, and all users are encouraged to upgrade. See the project website [1] for more information.

Brief summary:

* Updated message handling logic to no longer raise exceptions when a message isn't recognised. Thanks to Daniel Kahn Gillmor for the patch.
* Always use --fixed-list-mode, --batch and --with-colons. Thanks to Daniel Kahn Gillmor for the patch.
* Improved scan_keys() handling on GnuPG >= 2.1. Thanks to Daniel Kahn Gillmor for the patch.
* Improved test behaviour with GnuPG >= 2.1. Failures when deleting test directory trees are now ignored. Thanks to Daniel Kahn Gillmor for the patch.
* Added close_file keyword argument to verify_file to allow the file closing to be made optional. Current behaviour is maintained - close_file=False can be passed to skip closing the file being verified.
* Added the extra_args keyword parameter to allow custom arguments to be passed to the gpg executable.
* Instances of the GPG class now have an additional on_data attribute, which defaults to None. It can be set to a callable which will be called with a single argument - a binary chunk of data received from the gpg executable. The callable can do whatever it likes with the chunks passed to it - e.g. write them to a separate stream. The callable should not raise any exceptions (unless it wants the current operation to fail).

This release [2] has been signed with my code signing key:

Vinay Sajip (CODE SIGNING KEY)
Fingerprint: CA74 9061 914E AC13 8E66 EADB 9147 B477 339A 9B86

What Does It Do?
================

The gnupg module allows Python programs to make use of the functionality provided by the Gnu Privacy Guard (abbreviated GPG or GnuPG). Using this module, Python programs can encrypt and decrypt data, digitally sign documents and verify digital signatures, manage (generate, list and delete) encryption keys, using proven Public Key Infrastructure (PKI) encryption technology based on OpenPGP.

This module is expected to be used with Python versions >= 2.4, as it makes use of the subprocess module which appeared in that version of Python. This module is a newer version derived from earlier work by Andrew Kuchling, Richard Jones and Steve Traugott.
A test suite using unittest is included with the source distribution. Simple usage: >>> import gnupg >>> gpg = gnupg.GPG(gnupghome='/path/to/keyring/directory') >>> gpg.list_keys() [{ ... 'fingerprint': 'F819EE7705497D73E3CCEE65197D5DAC68F1AAB2', 'keyid': '197D5DAC68F1AAB2', 'length': '1024', 'type': 'pub', 'uids': ['', 'Gary Gross (A test user) ']}, { ... 'fingerprint': '37F24DD4B918CC264D4F31D60C5FEFA7A921FC4A', 'keyid': '0C5FEFA7A921FC4A', 'length': '1024', ... 'uids': ['', 'Danny Davis (A test user) ']}] >>> encrypted = gpg.encrypt("Hello, world!", ['0C5FEFA7A921FC4A']) >>> str(encrypted) '-----BEGIN PGP MESSAGE-----\nVersion: GnuPG v1.4.9 (GNU/Linux)\n \nhQIOA/6NHMDTXUwcEAf . -----END PGP MESSAGE-----\n' >>> decrypted = gpg.decrypt(str(encrypted), passphrase='secret') >>> str(decrypted) 'Hello, world!' >>> signed = gpg.sign("Goodbye, world!", passphrase='secret') >>> verified = gpg.verify(str(signed)) >>> print "Verified" if verified else "Not verified" 'Verified' As always, your feedback is most welcome (especially bug reports [3], patches and suggestions for improvement, or any other points via the mailing list/discussion group [4]). Please refer to the documentation [5] for more information. Enjoy! Cheers Vinay Sajip Red Dove Consultants Ltd. [1] https://bitbucket.org/vinay.sajip/python-gnupg [2] https://pypi.python.org/pypi/python-gnupg/0.4.1 [3] https://bitbucket.org/vinay.sajip/python-gnupg/issues [4] https://groups.google.com/forum/#!forum/python-gnupg [5] https://gnupg.readthedocs.io/en/latest/? From songofacandy at gmail.com Mon Jul 10 21:57:27 2017 From: songofacandy at gmail.com (INADA Naoki) Date: Tue, 11 Jul 2017 10:57:27 +0900 Subject: Compiling Python 3.6.1 on macOS 10.12.5 In-Reply-To: References: Message-ID: > Killed: 9 It looks like not segmentation fault. Maybe, RAM shortage? 
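For reference, "Killed: 9" in the shell output means the process was terminated by signal 9 (SIGKILL), whereas a segmentation fault would show up as SIGSEGV (signal 11 on Linux and macOS). The following is a minimal sketch of how that distinction looks from Python on a POSIX system; `describe_returncode`, `run_and_describe` and `self_kill` are hypothetical names used only for illustration, and the sketch says nothing about why the build's `./python.exe` was killed.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as the negated
        # signal number, so -9 means "killed by signal 9".
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_and_describe(argv):
    """Run a command and describe how it ended."""
    return describe_returncode(subprocess.run(argv).returncode)


# A child process that sends SIGKILL to itself, standing in for the
# failing ./python.exe from the build log.
self_kill = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]

print(run_and_describe(self_kill))   # killed by SIGKILL
print(describe_returncode(-11))      # killed by SIGSEGV
```

SIGKILL is always sent from outside the process (by the kernel or by another process), which is why the error points away from a crash inside the interpreter itself.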
INADA Naoki On Mon, Jul 10, 2017 at 10:24 PM, Nigel Palmer wrote: > Hi > > I am trying to compile Python 3.6.1 on macOS 10.12.5 with xcode 8.8.3 using the instructions at https://docs.python.org/devguide/setup.html#build-dependencies but I am getting the error > > ./python.exe -E -S -m sysconfig --generate-posix-vars ;\ > if test $? -ne 0 ; then \ > echo "generate-posix-vars failed" ; \ > rm -f ./pybuilddir.txt ; \ > exit 1 ; \ > fi > /bin/sh: line 1: 96973 Killed: 9 ./python.exe -E -S -m sysconfig --generate-posix-vars > generate-posix-vars failed > make: *** [pybuilddir.txt] Error 1 > > When I manually run that command using dbll I get: > > lldb ./python.exe -- -E -S -m sysconfig --generate-posix-vars > (lldb) target create "./python.exe" > Current executable set to './python.exe' (x86_64). > (lldb) settings set -- target.run-args "-E" "-S" "-m" "sysconfig" "--generate-posix-vars" > (lldb) r > Process 96978 launched: './python.exe' (x86_64) > Could not find platform dependent libraries > Consider setting $PYTHONHOME to [:] > Process 96978 exited with status = 0 (0x00000000) > (lldb) > > > The commands I ran to configure and build python are: > brew install openssl xz > CPPFLAGS="-I$(brew --prefix openssl)/include" LDFLAGS="-L$(brew --prefix openssl)/lib" ./configure --prefix=`pwd`/../build > make > > Any ideas on what I am doing wrong? > > Many Thanks > Nigel > -- > https://mail.python.org/mailman/listinfo/python-list From torriem at gmail.com Mon Jul 10 22:55:14 2017 From: torriem at gmail.com (Michael Torrie) Date: Mon, 10 Jul 2017 20:55:14 -0600 Subject: What's with all of the Case Solution and Test Bank nonsense posts? 
In-Reply-To: References: <477bde19-0653-4e41-a717-0efe90ac5756@googlegroups.com> <5962f497$0$2779$c3e8da3$76491128@news.astraweb.com> <695d0f21-988a-5312-7551-df57ff754e09@gmail.com> <87lgnwdxz4.fsf@nightsong.com> Message-ID: <2bf36866-fd2e-9c1f-1aad-2e0a60efea32@gmail.com> On 07/10/2017 02:06 PM, Pete Forman wrote: >> What's the meaning of "https news service" then? If it's netnews, it's >> NNTP, not HTTP. If it just happens to be a web app that carries >> information from a newsgroup, then that's not a news service, it's a >> web forum, and there's no such thing as a generic web forum API. > > RFC 4642 (updated by RFC 8143) describes the use of TLS with NNTP. It > enhances the connection between NNTP client and server, primarily with > encryption but optionally with other benefits. Again, that is not HTTPS. Don't confuse port number with protocol! As Chris says, Google Groups is essentially a web forum that happens to mirror Usenet content. It's not a news service itself (no NNTP), not an NNTP server. So if GG is causing users problems, the solution is to ditch it and use something better, like a standalone NNTP client and a real Usenet server. Or python-list via email. I wish there was a decent (free) API that everyone used for web forums that would let me use a decent, threaded news reader to browse forums. I once wrote a forum scraper using BeautifulSoup that downloaded posts, along with graphics and made them into MIME messages served over NNTP (using Twisted) to my NNTP client. So much information is lost in web forums (there's no threading in web forums) that it didn't work out that well. From steve at pearwood.info Tue Jul 11 02:11:13 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 11 Jul 2017 06:11:13 GMT Subject: Write this accumuator in a functional style Message-ID: <59646c01$0$11093$c3e8da3@news.astraweb.com> I have a colleague who is allergic to mutating data structures. Yeah, I know, he needs to just HTFU but I thought I'd humour him. 
Suppose I have an iterator that yields named tuples: Parrot(colour='blue', species='Norwegian', status='tired and shagged out') and I want to collect them by colour: accumulator = {'blue': [], 'green': [], 'red': []} for parrot in parrots: accumulator[parrot.colour].append(parrot) That's pretty compact and understandable, but it require mutating a bunch of pre-allocated lists inside an accumulator. Can we re-write this in a functional style? The obvious answer is "put it inside a function, then pretend it works by magic" but my colleague's reply to that is "Yes, but I'll know that its actually doing mutation inside the function". Help me humour my colleague. -- Steve From greg.ewing at canterbury.ac.nz Tue Jul 11 02:36:48 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Tue, 11 Jul 2017 18:36:48 +1200 Subject: Write this accumuator in a functional style In-Reply-To: <59646c01$0$11093$c3e8da3@news.astraweb.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: Steven D'Aprano wrote: > Help me humour my colleague. 
class Parrot: def __init__(self, color, species, status): self.color = color self.species = species self.status = status def __repr__(self): return "%s/%s/%s" % (self.species, self.color, self.status) def parrot_generator(): yield Parrot(color='blue', species='Norwegian', status='tired and shagged out') yield Parrot(color='green', species='Portugese', status='perky') yield Parrot(color='red', species='Hawaiian', status='laid back') yield Parrot(color='blue', species='Norwegian', status='dead') yield Parrot(color='green', species='Italian', status='excited') yield Parrot(color='blue', species='French', status='tres bon') def parrots_of_color(parrots, color): return [p for p in parrots if p.color == color] def accumulated_parrots(parrot_source): parrots = list(parrot_source) return {color: parrots_of_color(parrots, color) for color in set(p.color for p in parrots)} print(accumulated_parrots(parrot_generator())) From wolfgang.maier at biologie.uni-freiburg.de Tue Jul 11 02:47:12 2017 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Tue, 11 Jul 2017 08:47:12 +0200 Subject: Write this accumuator in a functional style In-Reply-To: <59646c01$0$11093$c3e8da3@news.astraweb.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: On 07/11/2017 08:11 AM, Steven D'Aprano wrote: > I have a colleague who is allergic to mutating data structures. Yeah, I > know, he needs to just HTFU but I thought I'd humour him. > > Suppose I have an iterator that yields named tuples: > > Parrot(colour='blue', species='Norwegian', status='tired and shagged out') > > and I want to collect them by colour: > > accumulator = {'blue': [], 'green': [], 'red': []} > for parrot in parrots: > accumulator[parrot.colour].append(parrot) > > > That's pretty compact and understandable, but it require mutating a bunch > of pre-allocated lists inside an accumulator. Can we re-write this in a > functional style? 
> > The obvious answer is "put it inside a function, then pretend it works by > magic" but my colleague's reply to that is "Yes, but I'll know that its > actually doing mutation inside the function". > > > Help me humour my colleague. > > > Hmm, isn't this just asking for itertools.groupby on the parrots sorted by colour? From kwpolska at gmail.com Tue Jul 11 02:48:26 2017 From: kwpolska at gmail.com (Chris Warrick) Date: Tue, 11 Jul 2017 08:48:26 +0200 Subject: Compiling Python 3.6.1 on macOS 10.12.5 In-Reply-To: References: Message-ID: Why are you trying to compile Python manually? You should use Homebrew to install Python in 99% of cases. (The package is python3) -- Chris Warrick From rosuav at gmail.com Tue Jul 11 02:58:51 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 11 Jul 2017 16:58:51 +1000 Subject: Write this accumuator in a functional style In-Reply-To: <59646c01$0$11093$c3e8da3@news.astraweb.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: On Tue, Jul 11, 2017 at 4:11 PM, Steven D'Aprano wrote: > I have a colleague who is allergic to mutating data structures. Yeah, I > know, he needs to just HTFU but I thought I'd humour him. > > Suppose I have an iterator that yields named tuples: > > Parrot(colour='blue', species='Norwegian', status='tired and shagged out') > > and I want to collect them by colour: > > accumulator = {'blue': [], 'green': [], 'red': []} > for parrot in parrots: > accumulator[parrot.colour].append(parrot) > > > That's pretty compact and understandable, but it require mutating a bunch > of pre-allocated lists inside an accumulator. Can we re-write this in a > functional style? > > The obvious answer is "put it inside a function, then pretend it works by > magic" but my colleague's reply to that is "Yes, but I'll know that its > actually doing mutation inside the function". It's a partitioning filter. (Three way, not the usual two, but same same.) 
I've actually often wanted a quick way to write that - where you divide a list into two according to "passes predicate" vs "fails predicate". So if you find a really nice solution, I'm interested. ChrisA From __peter__ at web.de Tue Jul 11 03:13:59 2017 From: __peter__ at web.de (Peter Otten) Date: Tue, 11 Jul 2017 09:13:59 +0200 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: Steven D'Aprano wrote: > I have a colleague who is allergic to mutating data structures. Yeah, I > know, he needs to just HTFU but I thought I'd humour him. > > Suppose I have an iterator that yields named tuples: > > Parrot(colour='blue', species='Norwegian', status='tired and shagged out') > > and I want to collect them by colour: > > accumulator = {'blue': [], 'green': [], 'red': []} > for parrot in parrots: > accumulator[parrot.colour].append(parrot) > > > That's pretty compact and understandable, but it require mutating a bunch > of pre-allocated lists inside an accumulator. Can we re-write this in a > functional style? > > The obvious answer is "put it inside a function, then pretend it works by > magic" but my colleague's reply to that is "Yes, but I'll know that its > actually doing mutation inside the function". > > > Help me humour my colleague. Wouldn't it be on your colleague to provide a competetive "functional" version? However, reusing Gregory's sample data: >>> def color(p): return p.color ... >>> {c: list(ps) for c, ps in groupby( ... 
sorted(parrot_generator(), key=color), key=color)}
{'red': [Hawaiian/red/laid back], 'blue': [Norwegian/blue/tired and shagged out, Norwegian/blue/dead, French/blue/tres bon], 'green': [Portugese/green/perky, Italian/green/excited]}

From tjreedy at udel.edu Tue Jul 11 04:48:35 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 11 Jul 2017 04:48:35 -0400
Subject: Write this accumuator in a functional style
In-Reply-To: <59646c01$0$11093$c3e8da3@news.astraweb.com>
References: <59646c01$0$11093$c3e8da3@news.astraweb.com>
Message-ID:

On 7/11/2017 2:11 AM, Steven D'Aprano wrote:
> I have a colleague who is allergic to mutating data structures. Yeah, I
> know, he needs to just HTFU but I thought I'd humour him.
>
> Suppose I have an iterator that yields named tuples:
>
> Parrot(colour='blue', species='Norwegian', status='tired and shagged out')
>
> and I want to collect them by colour:
>
> accumulator = {'blue': [], 'green': [], 'red': []}
> for parrot in parrots:
>     accumulator[parrot.colour].append(parrot)
>
>
> That's pretty compact and understandable, but it require mutating a bunch
> of pre-allocated lists inside an accumulator. Can we re-write this in a
> functional style?
>
> The obvious answer is "put it inside a function, then pretend it works by
> magic" but my colleague's reply to that is "Yes, but I'll know that its
> actually doing mutation inside the function".
>
>
> Help me humour my colleague.

To truly not mutate anything, not even hidden (as in a python list comp, which buries .append), replace list with linked-list and .append with head-linkage (Lisp's cons). Something like

    blue, green, red = None, None, None
    for parrot in parrots:
        color = parrot.colour
        if color == 'blue':
            blue = (parrot, blue)
        elif color == 'red':
            red = (parrot, red)
        elif color == 'green':
            green = (parrot, green)
        else:
            raise ValueError(f'parrot {parrot} has unknown color {color}')

At this point, blue, green, and red are linked lists of parrots of the corresponding color.
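The head-linkage idea above can also be written as a fold, so that the loop and its rebinding of names disappear into functools.reduce. This is only a sketch under stated assumptions: the Parrot namedtuple and its sample values are made up for illustration, and cons, add_parrot, collect and to_tuple are hypothetical helper names, not anything from the thread.

```python
from collections import namedtuple
from functools import reduce

# Hypothetical sample type; field names follow the original post.
Parrot = namedtuple("Parrot", "colour species status")


def cons(head, tail):
    """Return a new cell. The existing list `tail` is shared, never modified."""
    return (head, tail)


def add_parrot(groups, parrot):
    """Return a NEW (blue, green, red) triple with `parrot` linked onto its colour."""
    blue, green, red = groups
    if parrot.colour == "blue":
        return (cons(parrot, blue), green, red)
    if parrot.colour == "green":
        return (blue, cons(parrot, green), red)
    if parrot.colour == "red":
        return (blue, green, cons(parrot, red))
    raise ValueError("parrot %r has unknown colour %r" % (parrot, parrot.colour))


def collect(parrots):
    """Fold the parrots into three linked lists, starting from three empty ones."""
    return reduce(add_parrot, parrots, (None, None, None))


def to_tuple(cell):
    """Walk a linked list (most recently added parrot first) into a plain tuple."""
    return () if cell is None else (cell[0],) + to_tuple(cell[1])


parrots = (
    Parrot("blue", "Norwegian", "tired and shagged out"),
    Parrot("green", "Italian", "excited"),
    Parrot("blue", "French", "tres bon"),
)
blue, green, red = collect(parrots)
```

Each linked list comes out in reverse order of arrival, because new parrots are linked at the head. The fold is still a single O(n) pass, with no sorting step.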
Of course, for loops mutate the iterator. Replace that with a tail recursive function. To make this easy, the input 'parrots' should be a linked list, which is a recursive data structure. Assignment is also form of mutation (of a namespace dict), so to be really strict, all the assignments should be replaced by function calls and returns. That really requires the use of recursion. Since this is a thought experiment we can make this easier: Make accumulator a linked list, perhaps (('blue':None),(('green',None),(( 'red', None), None))) Specify that parrots is also a linked list. Now stipulate the we are using func_py, which compiles Python syntax such as for parrot in parrots: accumulator[parrot.colour].append(parrot) into a set of recursive functional functions, including tail recursive 'for' such that for(parrots, accumulator) returns a new linked-list. Note that real functional language compilers do the opposite of this, compiling tail-recursive syntax into while loops. -- Terry Jan Reedy From alain at universite-de-strasbourg.fr.invalid Tue Jul 11 06:41:24 2017 From: alain at universite-de-strasbourg.fr.invalid (Alain Ketterlin) Date: Tue, 11 Jul 2017 12:41:24 +0200 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: <87vamzqmqz.fsf@universite-de-strasbourg.fr.invalid> Steven D'Aprano writes: > I have a colleague who is allergic to mutating data structures. Yeah, I > know, he needs to just HTFU but I thought I'd humour him. > > Suppose I have an iterator that yields named tuples: > > Parrot(colour='blue', species='Norwegian', status='tired and shagged out') > > and I want to collect them by colour: > > accumulator = {'blue': [], 'green': [], 'red': []} > for parrot in parrots: > accumulator[parrot.colour].append(parrot) > > > That's pretty compact and understandable, but it require mutating a bunch > of pre-allocated lists inside an accumulator. Can we re-write this in a > functional style? 
Here is a sketch in OCaml-style (incomplete of course):

    type color = Blue | Green | Red;;
    type parrot = { c: color; ... };;

    let rec collect list_of_parrots =
      match list_of_parrots with
      | [] -> ([],[],[])
      | h :: q ->
          let b,g,r = collect q in
          match h with
          | {c=Blue} -> (h::b,g,r)
          | {c=Green} -> (b,h::g,r)
          | {c=Red} -> (b,g,h::r)
    ;;

The function returns a triple of lists in this case. The first match drives list-traversal, the second selects between colors. Both can be (sub-optimally) turned into cascades of "if", i.e., in Python:

    def collect(list_of_parrots):
        if not list_of_parrots:
            return [],[],[]
        else:
            b,g,r = collect(list_of_parrots[1:])
            h = list_of_parrots[0]
            if h.color == 'blue':
                return ([h]+b,g,r)
            elif h.color == 'green':
                ... and so on

-- Alain.

From viktor.hagstrom at outlook.com Tue Jul 11 07:58:48 2017
From: viktor.hagstrom at outlook.com (Viktor Hagström)
Date: Tue, 11 Jul 2017 11:58:48 +0000
Subject: Compiling Python 3.6.1 on macOS 10.12.5
In-Reply-To:
References:
Message-ID:

>Why are you trying to compile Python manually? You should use Homebrew to
>install Python in 99% of cases. (The package is python3)

I'm not the person you answered, but I can explain why I do things that are not "optimal" or "easy" or "best". I am interested, I want to learn something, I think it's fun to do things the hard way. I want to learn how it *really* works. I want to modify it, break it, and fix it again. And I think it would be good for this community to encourage that (as long as it doesn't hinder their progress).

2017-07-11 8:48 GMT+02:00 Chris Warrick >: Why are you trying to compile Python manually? You should use Homebrew to install Python in 99% of cases.
(The package is python3) -- Chris Warrick -- https://mail.python.org/mailman/listinfo/python-list
From ksatish.dtc at gmail.com Tue Jul 11 08:21:55 2017 From: ksatish.dtc at gmail.com (ksatish.dtc at gmail.com) Date: Tue, 11 Jul 2017 05:21:55 -0700 (PDT) Subject: Can anybody help me retrieve how to retrieve output from this Python code below! Message-ID: try: import unicodecsv as csv except ImportError: import csv import json import operator import os from collections import OrderedDict import logging logging.basicConfig(level=logging.DEBUG) class Json2Csv(object): """Process a JSON object to a CSV file""" collection = None # Better for single-nested dictionaries SEP_CHAR = ', ' KEY_VAL_CHAR = ': ' DICT_SEP_CHAR = '\r' DICT_OPEN = '' DICT_CLOSE = '' # Better for deep-nested dictionaries # SEP_CHAR = ', ' # KEY_VAL_CHAR = ': ' # DICT_SEP_CHAR = '; ' # DICT_OPEN = '{ ' # DICT_CLOSE = '} ' def __init__(self, outline): self.rows = [] if not isinstance(outline, dict): raise ValueError('You must pass in an outline for JSON2CSV to follow') elif 'map' not in outline or len(outline['map']) < 1: raise ValueError('You must specify at least one value for "map"') key_map = OrderedDict() for header, key in outline['map']: splits = key.split('.') splits = [int(s) if s.isdigit() else s for s in splits] key_map[header] = splits self.key_map = key_map if 'collection' in outline: self.collection = outline['collection'] def load(self, json_file): self.process_each(json.load(json_file)) def process_each(self, data): """Process each item of a json-loaded dict """ if self.collection and self.collection in data: data = data[self.collection] for d in data: logging.info(d) self.rows.append(self.process_row(d)) def process_row(self, item): """Process a row of json data against the key map """ row = {} for header, keys in self.key_map.items():
try: row[header] = reduce(operator.getitem, keys, item) except (KeyError, IndexError, TypeError): row[header] = None return row def make_strings(self): str_rows = [] for row in self.rows: str_rows.append({k: self.make_string(val) for k, val in row.items()}) return str_rows def make_string(self, item): if isinstance(item, list) or isinstance(item, set) or isinstance(item, tuple): return self.SEP_CHAR.join([self.make_string(subitem) for subitem in item]) elif isinstance(item, dict): return self.DICT_OPEN + self.DICT_SEP_CHAR.join([self.KEY_VAL_CHAR.join([k, self.make_string(val)]) for k, val in item.items()]) + self.DICT_CLOSE else: return unicode(item) def write_csv(self, filename='output.csv', make_strings=False): """Write the processed rows to the given filename """ if (len(self.rows) <= 0): raise AttributeError('No rows were loaded') if make_strings: out = self.make_strings() else: out = self.rows with open(filename, 'wb+') as f: writer = csv.DictWriter(f, self.key_map.keys()) writer.writeheader() writer.writerows(out) class MultiLineJson2Csv(Json2Csv): def load(self, json_file): self.process_each(json_file) def process_each(self, data, collection=None): """Load each line of an iterable collection (ie. 
file)""" for line in data: d = json.loads(line) if self.collection in d: d = d[self.collection] self.rows.append(self.process_row(d)) def init_parser(): import argparse parser = argparse.ArgumentParser(description="Converts JSON to CSV") parser.add_argument('json_file', type=argparse.FileType('r'), help="Path to JSON data file to load") parser.add_argument('key_map', type=argparse.FileType('r'), help="File containing JSON key-mapping file to load") parser.add_argument('-e', '--each-line', action="store_true", default=False, help="Process each line of JSON file separately") parser.add_argument('-o', '--output-csv', type=str, default=None, help="Path to csv file to output") parser.add_argument( '--strings', help="Convert lists, sets, and dictionaries fully to comma-separated strings.", action="store_true", default=True) return parser json_file = input("Type Json input file name: ") key_map = input("Type Key value : ") MultiLineJson2Csv(Json2Csv).init_parser() Json2Csv.load(json_file) if __name__ == '__main__': parser = init_parser() args = parser.parse_args() key_map = json.load(args.key_map) loader = None if args.each_line: loader = MultiLineJson2Csv(key_map) else: loader = Json2Csv(key_map) loader.load(args.json_file) outfile = args.output_csv if outfile is None: fileName, fileExtension = os.path.splitext(args.json_file.name) outfile = fileName + '.csv' loader.write_csv(filename=outfile, make_strings=args.strings) From sjeik_appie at hotmail.com Tue Jul 11 09:16:40 2017 From: sjeik_appie at hotmail.com (Albert-Jan Roskam) Date: Tue, 11 Jul 2017 13:16:40 +0000 Subject: Test 0 and false since false is 0 In-Reply-To: References: , Message-ID: From: Python-list on behalf of Dan Sommers Sent: Friday, July 7, 2017 2:46 AM To: python-list at python.org Subject: Re: Test 0 and false since false is 0 ? On Thu, 06 Jul 2017 19:29:00 -0700, Sayth Renshaw wrote: > I have tried or conditions of v == False etc but then the 0's being > false also aren't moved. 
How can you check this at once?

Maybe this will help:

    Python 3.5.3+ (default, Jun  7 2017, 23:23:48)
    [GCC 6.3.0 20170516] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> False == 0
    True
    >>> False is 0
    False

=====> Just wondering: Is this 'is' test depending on an implementation detail of cPython (small ints, I forgot how small 0-255 maybe, are singletons)?

From __peter__ at web.de Tue Jul 11 09:26:03 2017
From: __peter__ at web.de (Peter Otten)
Date: Tue, 11 Jul 2017 15:26:03 +0200
Subject: Can anybody help me retrieve how to retrieve output from this Python code below!
References:
Message-ID:

ksatish.dtc at gmail.com wrote:

[snip code]

Wasn't there any documentation to go with that script? That's the preferable method to use software written by someone else ;)

Anyway -- First you have to undo what was probably changed by yourself:

$ diff -u json2csv_orig.py json2csv.py
--- json2csv_orig.py 2017-07-11 15:15:06.527571509 +0200
+++ json2csv.py 2017-07-11 15:14:17.878514787 +0200
@@ -132,14 +132,6 @@
     return parser
-json_file = input("Type Json input file name: ")
-
-key_map = input("Type Key value : ")
-
-MultiLineJson2Csv(Json2Csv).init_parser()
-
-Json2Csv.load(json_file)
-
 if __name__ == '__main__':
     parser = init_parser()
@@ -159,4 +151,4 @@
         fileName, fileExtension = os.path.splitext(args.json_file.name)
         outfile = fileName + '.csv'
-loader.write_csv(filename=outfile, make_strings=args.strings)
+    loader.write_csv(filename=outfile, make_strings=args.strings)

Then you have to create a file containing the data in json format, e. g.
$ cat data.json [ ["alpha", "beta", {"one": {"two": "gamma"}}], ["zeta", "eta", {"one": {"two": "theta"}}] ] ...and a file describing the conversion, also in json, like $ cat key_map.json { "map": [ ["foo", "0"], ["bar", "1"], ["baz", "2.one.two"] ] } Now you can run the script, first to look at the command line help $ python json2csv.py -h usage: json2csv.py [-h] [-e] [-o OUTPUT_CSV] [--strings] json_file key_map Converts JSON to CSV positional arguments: json_file Path to JSON data file to load key_map File containing JSON key-mapping file to load optional arguments: -h, --help show this help message and exit -e, --each-line Process each line of JSON file separately -o OUTPUT_CSV, --output-csv OUTPUT_CSV Path to csv file to output --strings Convert lists, sets, and dictionaries fully to comma-separated strings. and then to process your data: $ python json2csv.py data.json key_map.json INFO:root:[u'alpha', u'beta', {u'one': {u'two': u'gamma'}}] INFO:root:[u'zeta', u'eta', {u'one': {u'two': u'theta'}}] Finally, let's have a look at the resulting csv file: $ cat data.csv foo,bar,baz alpha,beta,gamma zeta,eta,theta From nigel at cresset-group.com Tue Jul 11 09:46:05 2017 From: nigel at cresset-group.com (Nigel Palmer) Date: Tue, 11 Jul 2017 13:46:05 +0000 Subject: Compiling Python 3.6.1 on macOS 10.12.5 In-Reply-To: References: Message-ID: Hi Chris I am planning on embedding Python into a C++ application and I wanted to have my own build of Python to do that. I know that eventually I will need to use --enable-shared or --enable-framework but for now I am trying to get a the simpler static build to compile first. Thanks, Nigel From: Chris Warrick [mailto:kwpolska at gmail.com] Sent: 11 July 2017 07:48 To: Nigel Palmer Cc: python-list at python.org Subject: Re: Compiling Python 3.6.1 on macOS 10.12.5 Why are you trying to compile Python manually? You should use Homebrew to install Python in 99% of cases. 
(The package is python3)

--
Chris Warrick

From nigel at cresset-group.com Tue Jul 11 09:53:17 2017
From: nigel at cresset-group.com (Nigel Palmer)
Date: Tue, 11 Jul 2017 13:53:17 +0000
Subject: Compiling Python 3.6.1 on macOS 10.12.5
In-Reply-To:
References:
Message-ID:

Hi

The python process only goes to around 4.8 MB before it dies and the machine has 4GB of RAM so I do not think it's a memory issue.

Thanks,
Nigel

-----Original Message-----
From: INADA Naoki [mailto:songofacandy at gmail.com]
Sent: 11 July 2017 02:57
To: Nigel Palmer
Cc: python-list at python.org
Subject: Re: Compiling Python 3.6.1 on macOS 10.12.5

> Killed: 9

That does not look like a segmentation fault. Maybe a RAM shortage?

INADA Naoki

On Mon, Jul 10, 2017 at 10:24 PM, Nigel Palmer wrote:
> Hi
>
> I am trying to compile Python 3.6.1 on macOS 10.12.5 with xcode 8.8.3
> using the instructions at
> https://docs.python.org/devguide/setup.html#build-dependencies but I
> am getting the error
>
> ./python.exe -E -S -m sysconfig --generate-posix-vars ;\
> if test $? -ne 0 ; then \
> echo "generate-posix-vars failed" ; \
> rm -f ./pybuilddir.txt ; \
> exit 1 ; \
> fi
> /bin/sh: line 1: 96973 Killed: 9 ./python.exe -E -S -m sysconfig --generate-posix-vars
> generate-posix-vars failed
> make: *** [pybuilddir.txt] Error 1
>
> When I manually run that command using lldb I get:
>
> lldb ./python.exe -- -E -S -m sysconfig --generate-posix-vars
> (lldb) target create "./python.exe"
> Current executable set to './python.exe' (x86_64).
> (lldb) settings set -- target.run-args "-E" "-S" "-m" "sysconfig" "--generate-posix-vars" > (lldb) r > Process 96978 launched: './python.exe' (x86_64) Could not find > platform dependent libraries Consider setting > $PYTHONHOME to [:] Process 96978 exited with > status = 0 (0x00000000) > (lldb) > > > The commands I ran to configure and build python are: > brew install openssl xz > CPPFLAGS="-I$(brew --prefix openssl)/include" LDFLAGS="-L$(brew > --prefix openssl)/lib" ./configure --prefix=`pwd`/../build make > > Any ideas on what I am doing wrong? > > Many Thanks > Nigel > -- > https://mail.python.org/mailman/listinfo/python-list From rhodri at kynesim.co.uk Tue Jul 11 10:05:18 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Tue, 11 Jul 2017 15:05:18 +0100 Subject: Test 0 and false since false is 0 In-Reply-To: References: Message-ID: On 11/07/17 14:16, Albert-Jan Roskam wrote: > From: Python-list on behalf of Dan Sommers > Sent: Friday, July 7, 2017 2:46 AM > To: python-list at python.org > Subject: Re: Test 0 and false since false is 0 > > On Thu, 06 Jul 2017 19:29:00 -0700, Sayth Renshaw wrote: > >> I have tried or conditions of v == False etc but then the 0's being >> false also aren't moved. How can you check this at once? > > Maybe this will help: > > Python 3.5.3+ (default, Jun 7 2017, 23:23:48) > [GCC 6.3.0 20170516] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> False == 0 > True > >>> False is 0 > False > > > =====> Just wondering: Is this 'is' test depending on an implementation detail of cPython (small ints, I forgot how small 0-255 maybe, are singletons)? No, it's not an implementation detail. True and False are singletons, and are different objects from 1 and 0 (they behave differently when converted to strings, for example). Now, relying on "0 is 0" being true would be relying on an implementation detail, but you weren't going to do that, were you? 
--
Rhodri James *-* Kynesim Ltd

From grant.b.edwards at gmail.com Tue Jul 11 10:11:36 2017
From: grant.b.edwards at gmail.com (Grant Edwards)
Date: Tue, 11 Jul 2017 14:11:36 +0000 (UTC)
Subject: Test 0 and false since false is 0
References:
Message-ID:

On 2017-07-11, Albert-Jan Roskam wrote:
> From: Python-list on behalf of Dan Sommers
> Sent: Friday, July 7, 2017 2:46 AM
> To: python-list at python.org
> Subject: Re: Test 0 and false since false is 0
>
> On Thu, 06 Jul 2017 19:29:00 -0700, Sayth Renshaw wrote:
>
>> I have tried or conditions of v == False etc but then the 0's being
>> false also aren't moved. How can you check this at once?
>
> Maybe this will help:
>
>     Python 3.5.3+ (default, Jun  7 2017, 23:23:48)
>     [GCC 6.3.0 20170516] on linux
>     Type "help", "copyright", "credits" or "license" for more information.
>     >>> False == 0
>     True
>     >>> False is 0
>     False
>
>
> Just wondering: Is this 'is' test depending on an implementation
> detail of cPython (small ints, I forgot how small 0-255 maybe, are
> singletons)?

No.

False is required to be a singleton. Therefore, if you want to know if x is the boolean object False, you can use 'x is False' with predictable results.

Integer values are not required to be singletons, so you cannot depend on the value of 'x is 0', or 'x is 12345678'. As you mention, in the current version(s) of CPython, small integer values are cached, but larger ones are not:

    $ python
    Python 2.7.12 (default, Jan 3 2017, 10:08:10)
    [GCC 4.9.4] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = 0
    >>> x is 0
    True
    >>> x = 12345678
    >>> x is 12345678
    False
    >>>

That could change tomorrow.

--
Grant Edwards               grant.b.edwards        Yow! PEGGY FLEMMING is
                                  at               stealing BASKET BALLS to
                              gmail.com            feed the babies in VERMONT.
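The difference between equality and identity discussed in the replies above can be seen in a few lines. This is a small illustrative sketch, not code from the thread; the names zero, one and is_false_object are made up here. It relies only on behaviour the language guarantees (bool is a subclass of int, and False is a single object), and it deliberately does not test 'is' between two integers, since that result depends on the implementation.

```python
zero = 0
one = 1

# Equality compares values. bool is a subclass of int, so False == 0 and True == 1.
values_equal = (False == zero, True == one)

# Identity compares objects. False is never the same object as the int 0.
same_object = (False is zero, True is one)

# The two types also render differently as strings.
as_text = (str(False), str(zero), str(True), str(one))


def is_false_object(value):
    """True only for the singleton False, not for 0, 0.0, '' or None."""
    return value is False
```

With this, is_false_object(False) is true while is_false_object(0) is not, which is the check the original poster seemed to be looking for.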
From ganesh1pal at gmail.com Tue Jul 11 12:31:53 2017 From: ganesh1pal at gmail.com (Ganesh Pal) Date: Tue, 11 Jul 2017 22:01:53 +0530 Subject: Better Regex and exception handling for this small code Message-ID: Dear Python friends I am trying to open a file and check if there is a pattern has changed after the task got completed? file data: ........................................................ #tail -f /file.txt .......................................... Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, journal_crc = d2097b00 Note: Task completed successfully. Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, journal_crc = a176af10 I have the below piece of code but would like to make this better more pythonic , I found regex pattern and exception handling poor here , any quick suggestion in your spare time is welcome. #open the existing file if the flag is set and check if there is a match log_file='/file.txt' flag_is_on=1 data = None with open(log_file, 'r') as f: data = f.readlines() if flag_is_on: logdata = '\n'.join(data) reg = "initiator_crc =(?P[\s\S]*?), journal_crc" crc = re.findall(re.compile(reg), logdata) if not crc: raise Exception("Pattern not found in logfile") checksumbefore = crc[0].strip() checksumafter = crc[1].strip() logging.info("checksumbefore :%s and checksumafter:%s" % (checksumbefore, checksumafter)) if checksumbefore == checksumafter: raise Exception("checksum not macthing") I am on Linux and Python 2.7 Regards, Ganesh From ganesh1pal at gmail.com Tue Jul 11 12:33:04 2017 From: ganesh1pal at gmail.com (Ganesh Pal) Date: Tue, 11 Jul 2017 22:03:04 +0530 Subject: Better Regex and exception handling for this small code In-Reply-To: References: Message-ID: I am trying to open a file and check if the pattern i.e initiator_crc has changed after the task got completed? 
* On Tue, Jul 11, 2017 at 10:01 PM, Ganesh Pal wrote: > Dear Python friends > > I am trying to open a file and check if there is a pattern has changed > after the task got completed? > > file data: > ........................................................ > > #tail -f /file.txt > .......................................... > Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, > journal_crc = d2097b00 > Note: Task completed successfully. > Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, > journal_crc = a176af10 > > > I have the below piece of code but would like to make this better more > pythonic , I found regex pattern and exception handling poor here , any > quick suggestion in your spare time is welcome. > > > #open the existing file if the flag is set and check if there is a match > > log_file='/file.txt' > flag_is_on=1 > > data = None > with open(log_file, 'r') as f: > data = f.readlines() > > > if flag_is_on: > logdata = '\n'.join(data) > reg = "initiator_crc =(?P[\s\S]*?), journal_crc" > crc = re.findall(re.compile(reg), logdata) > if not crc: > raise Exception("Pattern not found in logfile") > > checksumbefore = crc[0].strip() > checksumafter = crc[1].strip() > logging.info("checksumbefore :%s and checksumafter:%s" > % (checksumbefore, checksumafter)) > > if checksumbefore == checksumafter: > raise Exception("checksum not macthing") > > I am on Linux and Python 2.7 > > Regards, > Ganesh > > From ian.g.kelly at gmail.com Tue Jul 11 13:18:45 2017 From: ian.g.kelly at gmail.com (Ian Kelly) Date: Tue, 11 Jul 2017 11:18:45 -0600 Subject: Write this accumuator in a functional style In-Reply-To: References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: On Tue, Jul 11, 2017 at 12:47 AM, Wolfgang Maier wrote: > On 07/11/2017 08:11 AM, Steven D'Aprano wrote: >> >> I have a colleague who is allergic to mutating data structures. Yeah, I >> know, he needs to just HTFU but I thought I'd humour him. 
>> >> Suppose I have an iterator that yields named tuples: >> >> Parrot(colour='blue', species='Norwegian', status='tired and shagged out') >> >> and I want to collect them by colour: >> >> accumulator = {'blue': [], 'green': [], 'red': []} >> for parrot in parrots: >> accumulator[parrot.colour].append(parrot) >> >> >> That's pretty compact and understandable, but it require mutating a bunch >> of pre-allocated lists inside an accumulator. Can we re-write this in a >> functional style? >> >> The obvious answer is "put it inside a function, then pretend it works by >> magic" but my colleague's reply to that is "Yes, but I'll know that its >> actually doing mutation inside the function". >> >> >> Help me humour my colleague. >> >> >> > > Hmm, isn't this just asking for itertools.groupby on the parrots sorted by > colour? That's one solution, but the sorting makes it O(n log n) for a task that should really just be O(n). From pavol.lisy at gmail.com Tue Jul 11 15:59:38 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Tue, 11 Jul 2017 21:59:38 +0200 Subject: ezdxf type of spline In-Reply-To: <75808a80-10bc-486c-b7bd-a3b4e7ca68fc@googlegroups.com> References: <75808a80-10bc-486c-b7bd-a3b4e7ca68fc@googlegroups.com> Message-ID: It seems to be: http://pythonhosted.org/ezdxf/entities.html?highlight=spline#Spline -> https://knowledge.autodesk.com/support/autocad/learn-explore/caas/CloudHelp/cloudhelp/2015/ENU/AutoCAD-Core/files/GUID-58316136-30EB-499C-ACAD-31D0C653B2B2-htm.html -> https://en.wikipedia.org/wiki/Non-uniform_rational_B-spline On 7/10/17, amka1791 at gmail.com wrote: > Hi, > > Can someone says please to me which kind are the splines of the ezdxf python > module ? Is it bezier curves ? 
> > Thanks, > > dylan > > -- > https://mail.python.org/mailman/listinfo/python-list > From cs at zip.com.au Tue Jul 11 19:36:40 2017 From: cs at zip.com.au (Cameron Simpson) Date: Wed, 12 Jul 2017 09:36:40 +1000 Subject: Better Regex and exception handling for this small code In-Reply-To: References: Message-ID: <20170711233640.GA22219@cskk.homeip.net> On 11Jul2017 22:01, Ganesh Pal wrote: >I am trying to open a file and check if there is a pattern has changed >after the task got completed? > >file data: >........................................................ > >#tail -f /file.txt >.......................................... >Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, >journal_crc = d2097b00 >Note: Task completed successfully. >Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, >journal_crc = a176af10 > > > I have the below piece of code but would like to make this better more >pythonic , I found regex pattern and exception handling poor here , any >quick suggestion in your spare time is welcome. > > >#open the existing file if the flag is set and check if there is a match > >log_file='/file.txt' >flag_is_on=1 Use "True" instead of "1". A flag is a Boolean thing, and should use a Boolean value. This lets you literally speak "true" and 'false" rather than imoplicitly saying that "0 means false and nonzero means true". >data = None There is no need to initialise data here because you immediately overwrite it below. >with open(log_file, 'r') as f: > data = f.readlines() > >if flag_is_on: Oh yes. Just name this variable "flag". "_is_on" is kind of implicit. > logdata = '\n'.join(data) Do other parts of your programme deal with the file data as lines? If not, there is little point to reading the file and breaking it up into lines above, then joining them together against here. 
Just go: with open(log_file) as f: log_data = f.read() > reg = "initiator_crc =(?P[\s\S]*?), journal_crc" Normally we write regular expressions as "raw" python strings, thus: reg = r'initiator_crc =(?P[\s\S]*?), journal_crc' because backslashes etc are punctuation inside normal strings. Within a "raw" string started with r' nothing is special until the closing ' character. This makes writing regular expressions more reliable. Also, why the character range "[\s\S]"? That says whitespace or nonwhitespace i.e. any character. If you want any character, just say ".". > crc = re.findall(re.compile(reg), logdata) It is better to compile a regexp just the once, getting a Regexp object, and then you just use the compiled object. > if not crc: > raise Exception("Pattern not found in logfile") ValueError would be a more appropriate exception here; plain old "Exception" is pretty vague. > checksumbefore = crc[0].strip() > checksumafter = crc[1].strip() Your regexp cannot start or end with whitespace. Those .strip calls are not doing anything for you. This reads like you expect there to be exactly 2 matches in the file. What if there are more or fewer? > logging.info("checksumbefore :%s and checksumafter:%s" > % (checksumbefore, checksumafter)) > > if checksumbefore == checksumafter: > raise Exception("checksum not macthing") Don't you mean != here? I wouldn't be raising exceptions in this code. Personally I would make this a function that returns True or False. Exceptions are a poor way of returning "status" or other values. They're really for "things that should not have happened", hence their name. It looks like you're scanning a log file for multiple lines and wanting to know if successive ones change. 
Why not write a function like this (untested):

    import re

    RE_CRC_LINE = re.compile(r'initiator_crc =([\s\S]*?), journal_crc')

    def check_for_crc_changes(logfile):
        old_crc_text = None
        with open(logfile) as f:
            for line in f:
                # search, not match: the text is in the middle of the line
                m = RE_CRC_LINE.search(line)
                if not m:
                    # uninteresting line
                    continue
                crc_text = m.group(1).strip()
                if old_crc_text is not None and crc_text != old_crc_text:
                    # found a change
                    return True
                old_crc_text = crc_text
        if old_crc_text is None:
            # if this is really an error, you might raise this exception
            # but maybe no such lines is just normal but boring
            raise ValueError("no CRC lines seen in logfile %r" % (logfile,))
        # found no changes
        return False

See that there is very little sanity checking. In an exception supporting language like Python you can often write code as if it will always succeed by using things which will raise exceptions if things go wrong. Then _outside_ the function you can catch any exceptions that occur (such as being unable to open the log file).

Cheers,
Cameron Simpson

From steve+python at pearwood.info Tue Jul 11 20:50:32 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Wed, 12 Jul 2017 10:50:32 +1000
Subject: Test 0 and false since false is 0
References:
Message-ID: <5965725a$0$1616$c3e8da3$5496439d@news.astraweb.com>

On Tue, 11 Jul 2017 11:16 pm, Albert-Jan Roskam wrote:

> >>> False == 0
> True
> >>> False is 0
> False
>
>
> =====> Just wondering: Is this 'is' test depending on an implementation detail
> of cPython (small ints, I forgot how small 0-255 maybe, are singletons)?

No. But the test 0 is 0 will be.

True and False are guaranteed singletons: there will only ever be a single builtin False. The small ints including 0 *may* be cached, so that they will be singletons. The definition of "small" will vary from version to version.

--
Steve
"Cheer up," they said, "things could be worse." So I cheered up, and sure enough, things got worse.
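Going by the fragment of the original question quoted in this thread, the task appears to have been to move zeros to the end of a list while leaving False where it is. That reading is an assumption, and the function names below are hypothetical. Since False is a guaranteed singleton and an instance of bool, a type check separates it from a numeric zero without relying on integer caching:

```python
def is_zero_not_bool(value):
    """True for a numeric zero (0 or 0.0), False for the bool False and anything else."""
    return not isinstance(value, bool) and value == 0


def move_zeros(items):
    """Return a new list with numeric zeros moved to the end, order otherwise kept."""
    kept = [v for v in items if not is_zero_not_bool(v)]
    zeros = [v for v in items if is_zero_not_bool(v)]
    return kept + zeros


result = move_zeros([False, 1, 0, 1, 2, 0, 1, 3, "a"])
# result is [False, 1, 1, 2, 1, 3, 'a', 0, 0]
```

The isinstance check comes first so that False is rejected before the value comparison, where it would otherwise compare equal to 0.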
From pavol.lisy at gmail.com Tue Jul 11 23:34:38 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Wed, 12 Jul 2017 05:34:38 +0200 Subject: Write this accumuator in a functional style In-Reply-To: <59646c01$0$11093$c3e8da3@news.astraweb.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: On 7/11/17, Steven D'Aprano wrote: > I have a colleague who is allergic to mutating data structures. Yeah, I > know, he needs to just HTFU but I thought I'd humour him. > > Suppose I have an iterator that yields named tuples: > > Parrot(colour='blue', species='Norwegian', status='tired and shagged out') > > and I want to collect them by colour: > > accumulator = {'blue': [], 'green': [], 'red': []} > for parrot in parrots: > accumulator[parrot.colour].append(parrot) > > > That's pretty compact and understandable, but it require mutating a bunch > of pre-allocated lists inside an accumulator. Can we re-write this in a > functional style? > > The obvious answer is "put it inside a function, then pretend it works by > magic" but my colleague's reply to that is "Yes, but I'll know that its > actually doing mutation inside the function". > > > Help me humour my colleague. This seems to be philosophical question to me: How to (or "what colleague could accept" or "what do you mean by") collect elements if not adding them to lists? From pavol.lisy at gmail.com Tue Jul 11 23:47:35 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Wed, 12 Jul 2017 05:47:35 +0200 Subject: Compiling Python 3.6.1 on macOS 10.12.5 In-Reply-To: References: Message-ID: On 7/10/17, Nigel Palmer wrote: > Hi > > I am trying to compile Python 3.6.1 on macOS 10.12.5 with xcode 8.8.3 using > the instructions at > https://docs.python.org/devguide/setup.html#build-dependencies but I am > getting the error > > ./python.exe -E -S -m sysconfig --generate-posix-vars ;\ > if test $? 
-ne 0 ; then \ > echo "generate-posix-vars failed" ; \ > rm -f ./pybuilddir.txt ; \ > exit 1 ; \ > fi > /bin/sh: line 1: 96973 Killed: 9 ./python.exe -E -S -m > sysconfig --generate-posix-vars > generate-posix-vars failed > make: *** [pybuilddir.txt] Error 1 > > When I manually run that command using dbll I get: > > lldb ./python.exe -- -E -S -m sysconfig --generate-posix-vars > (lldb) target create "./python.exe" > Current executable set to './python.exe' (x86_64). > (lldb) settings set -- target.run-args "-E" "-S" "-m" "sysconfig" > "--generate-posix-vars" > (lldb) r > Process 96978 launched: './python.exe' (x86_64) > Could not find platform dependent libraries > Consider setting $PYTHONHOME to [:] > Process 96978 exited with status = 0 (0x00000000) > (lldb) > > > The commands I ran to configure and build python are: > brew install openssl xz > CPPFLAGS="-I$(brew --prefix openssl)/include" LDFLAGS="-L$(brew --prefix > openssl)/lib" ./configure --prefix=`pwd`/../build > make > > Any ideas on what I am doing wrong? > > Many Thanks > Nigel > -- > https://mail.python.org/mailman/listinfo/python-list > Something like this is working? 
python -c "from distutils.sysconfig import get_config_var as gv; px = 'prefix' ;epx = 'exec_prefix' ;LIBDEST = 'LIBDEST' ;LD = 'LIBDIR' ;print(f'LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:{gv(LD)} PYTHONPATH=\$PYTHONPATH:{gv(LIBDEST)} PYTHONHOME=\$PYTHONHOME:{gv(px)}:{gv(epx)} ')" or maybe python.exe in your case -> python.exe -c "from distutils.sysconfig import get_config_var as gv; px = 'prefix' ;epx = 'exec_prefix' ;LIBDEST = 'LIBDEST' ;LD = 'LIBDIR' ;print(f'LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:{gv(LD)} PYTHONPATH=\$PYTHONPATH:{gv(LIBDEST)} PYTHONHOME=\$PYTHONHOME:{gv(px)}:{gv(epx)} ')" From steve+python at pearwood.info Wed Jul 12 00:18:00 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 12 Jul 2017 14:18:00 +1000 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> Message-ID: <5965a2fa$0$1622$c3e8da3$5496439d@news.astraweb.com> On Tue, 11 Jul 2017 04:58 pm, Chris Angelico wrote: > On Tue, Jul 11, 2017 at 4:11 PM, Steven D'Aprano wrote: [...] >> accumulator = {'blue': [], 'green': [], 'red': []} >> for parrot in parrots: >> accumulator[parrot.colour].append(parrot) [...] > It's a partitioning filter. (Three way, not the usual two, but same > same.) I've actually often wanted a quick way to write that - where > you divide a list into two according to "passes predicate" vs "fails > predicate". So if you find a really nice solution, I'm interested. That is one of my colleague's complaints too: he says this is a common task and there ought to be a built-in or at least std lib functional solution for it, akin to map/filter/itertools. Although I often say "not every two or three line function needs to be a built-in", in this case I'm inclined to agree with him. Now that I have a name for it, I am even more inclined to agree. It's an N-way partitioning filter. My example shows only three, but the real code we're using has more than that. 

    def partition(iterable, keyfunc=bool):
        accumulator = {}
        for item in iterable:
            accumulator.setdefault(keyfunc(item), []).append(item)
        return accumulator

Alternatives:

- itertools.groupby requires you to sort the entire input stream before
  starting; that's expensive, O(N log N) rather than just O(N).

- Greg's dict comprehension version requires N+1 passes through the data,
  one to convert to a list, and 1 per each possible key.

- Terry's solution scares me :-)

- Alain's solution appears to require list concatenation, which implies
  that in the worst case this will be O(N**2).

Any other thoughts?

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.

From greg.ewing at canterbury.ac.nz Wed Jul 12 01:47:50 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Wed, 12 Jul 2017 17:47:50 +1200
Subject: Write this accumuator in a functional style
In-Reply-To: <5965a2fa$0$1622$c3e8da3$5496439d@news.astraweb.com>
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <5965a2fa$0$1622$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

Steve D'Aprano wrote:
> - Greg's dict comprehension version requires N+1 passes through the data,
> one to convert to a list, and 1 per each possible key.

Just to be clear, my solution was a response to the requirement that it be written in a purely functional style. It's not how I would actually recommend doing it!

While a purely functional single-pass solution is possible, in Python it would probably be just as inefficient, maybe even worse.
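
For illustration only, here is one way such a purely functional single-pass version might look, written as a fold with functools.reduce. It is a sketch, not anyone's recommended solution: the name partition_functional and the sample parrots are assumptions, and the {**acc, ...} syntax needs Python 3.5 or later. Each step copies the dict and rebuilds one tuple, which is where the inefficiency mentioned above comes from.

```python
from collections import namedtuple
from functools import reduce

Parrot = namedtuple('Parrot', 'colour species status')


def partition_functional(iterable, keyfunc=bool):
    """Group items by keyfunc in one pass without mutating any structure."""
    def step(acc, item):
        key = keyfunc(item)
        # A new dict and a new tuple are built on every step; acc is untouched.
        return {**acc, key: acc.get(key, ()) + (item,)}
    return reduce(step, iterable, {})


# Assumed sample data for the demonstration.
parrots = [
    Parrot('blue', 'Norwegian', 'resting'),
    Parrot('green', 'Amazon', 'awake'),
    Parrot('blue', 'Macaw', 'asleep'),
]

groups = partition_functional(parrots, keyfunc=lambda p: p.colour)
print(sorted(groups))                        # ['blue', 'green']
print([p.species for p in groups['blue']])   # ['Norwegian', 'Macaw']
```

The repeated copying makes this quadratic in the worst case, compared with the linear cost of the mutating setdefault version.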
-- Greg From rustompmody at gmail.com Wed Jul 12 03:32:16 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Wed, 12 Jul 2017 00:32:16 -0700 (PDT) Subject: Write this accumuator in a functional style In-Reply-To: <87vamzqmqz.fsf@universite-de-strasbourg.fr.invalid> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87vamzqmqz.fsf@universite-de-strasbourg.fr.invalid> Message-ID: On Tuesday, July 11, 2017 at 4:11:50 PM UTC+5:30, Alain Ketterlin wrote: > Steven D'Aprano writes: > > > I have a colleague who is allergic to mutating data structures. Yeah, I > > know, he needs to just HTFU but I thought I'd humour him. > > > > Suppose I have an iterator that yields named tuples: > > > > Parrot(colour='blue', species='Norwegian', status='tired and shagged out') > > > > and I want to collect them by colour: > > > > accumulator = {'blue': [], 'green': [], 'red': []} > > for parrot in parrots: > > accumulator[parrot.colour].append(parrot) > > > > > > That's pretty compact and understandable, but it require mutating a bunch > > of pre-allocated lists inside an accumulator. Can we re-write this in a > > functional style? > > Here is a sketch in OCaml-style (incomplete of course): > > type color = Blue | Green | Red;; > type parrot = { c: color; ... 
};; > > let rec collect list_of_parrots = > match list_of_parrots with > | nil -> (nil,nil,nil) > | h :: q -> > let b,g,r = collect q in > match h with > | {c=Blue} -> (h::b,g,r) > | {c=Green} -> (b,h::g,r) > | {c=Red} -> (b,g,h::r) > ;; Separating the recursion from the pattern-match-to-discriminate [Also its in haskell since I dont have an *ML handy] Code ------------- data Color = Red|Blue|Green deriving (Show) type Species = String type Status = String type Parrot = (Color, Species, Status) -- discriminating cons discons :: Parrot -> ([Parrot], [Parrot], [Parrot]) -> ([Parrot], [Parrot], [Parrot]) discons p@(Red,_,_) (r,g,b) = (p:r, g, b) discons p@(Green,_,_) (r,g,b) = (r, p:g, b) discons p@(Blue,_,_) (r,g,b) = (r, g, p:b) -- Loop disc :: [Parrot] -> ([Parrot], [Parrot], [Parrot]) disc = foldr discons ([],[],[]) ------------- Run: ------------- ?> let parrotlist = [(Blue, "norwe", "happy"), (Green, "austral", "tired"), (Red, "amer", "god-knows")] ?> disc parrotlist ([(Red,"amer","god-knows")],[(Green,"austral","tired")],[(Blue,"norwe","happy")]) ?> ----------------- Getting it into python would need a foldr (python's reduce is a foldl) There is an identity foldl op id l = foldr (flip op) id (reverse l) However for this we need the list to be a real (finite) list; not an iterator/infinite etc OTOH I suspect the spec as returning a bunch of lists is more likely to be a bunch of bags (Counter in python); in which case foldr can be replaced by foldl(reduce) From steve at pearwood.info Wed Jul 12 05:32:01 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 12 Jul 2017 09:32:01 GMT Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <5965a2fa$0$1622$c3e8da3$5496439d@news.astraweb.com> Message-ID: <5965ec90$0$11093$c3e8da3@news.astraweb.com> On Wed, 12 Jul 2017 17:47:50 +1200, Gregory Ewing wrote: > Steve D'Aprano wrote: >> - Greg's dict comprehension version requires N+1 passes through the >> data, >> 
one to convert to a list, and 1 per each possible key.
>
> Just to be clear, my solution was a response to the requirement that it
> be written in a purely functional style. It's not how I would actually
> recommend doing it!

I never imagined anything else!

--
Steve

From no.email at nospam.invalid Wed Jul 12 05:52:22 2017
From: no.email at nospam.invalid (Paul Rubin)
Date: Wed, 12 Jul 2017 02:52:22 -0700
Subject: Write this accumuator in a functional style
References: <59646c01$0$11093$c3e8da3@news.astraweb.com>
Message-ID: <87d196dlt5.fsf@nightsong.com>

Steven D'Aprano writes:
> for parrot in parrots:
>     accumulator[parrot.colour].append(parrot)
>
> That's pretty compact and understandable, but it require mutating a bunch
> of pre-allocated lists inside an accumulator. Can we re-write this in a
> functional style?

Not so easily in Python, since the built-in list and dict types are designed for update by mutation. In Haskell, the list type is a linked list and the dictionary type is a balanced tree. So, you can make a new list consisting of a new item consed to the front of the old list, and you can make a new ("updated") dictionary by building O(log n) new nodes.

You might like Chris Okasaki's wonderful book "Purely Functional Data Structures" that explains all this and more.

From christoph.macho at gmx.at Wed Jul 12 06:14:25 2017
From: christoph.macho at gmx.at (christoph)
Date: Wed, 12 Jul 2017 11:14:25 +0100
Subject: pythonpath in ipython console
Message-ID: <5dcb1546-63cb-6fab-6933-b97224501a1b@gmx.at>

Hello,

I am a bit confused. I use Spyder. When I run my program in the IPython console, startup fails with the message 'Attribute error'. When I start the same program via the Python console everything works fine, and starting it from a terminal works fine too.

It seems that IPython does not load PYTHONPATH, although wdir= is set to the execution path and all the modules are stored in that directory.

How can I avoid this error?

cheers

From rosuav at gmail.com Wed Jul 12 06:19:34 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 12 Jul 2017 20:19:34 +1000
Subject: Write this accumuator in a functional style
In-Reply-To: <87d196dlt5.fsf@nightsong.com>
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com>
Message-ID:

On Wed, Jul 12, 2017 at 7:52 PM, Paul Rubin wrote:
> Not so easily in Python since the built-in list and dict types are
> designed for mutation update. In Haskell, the list type is a linked
> list and the dictionary type is a balanced tree. So, you can make a new
> list consisting of a new item consed to the front of the old list, and
> you can make a new ("updated") dictionary by building O(log n) new
> nodes.

Is that actual O(log n), or amortized? If you build a tree by forever inserting larger values (which can happen easily - imagine using a dict to record hourly statistics, using time.time()//3600 as the key), at some point it'll need to be rebalanced, which could at worst case be O(n). But I could believe that it's amortized logarithmic, although I can't myself prove that it is.

ChrisA

From rhodri at kynesim.co.uk Wed Jul 12 07:35:06 2017
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Wed, 12 Jul 2017 12:35:06 +0100
Subject: Test 0 and false since false is 0
In-Reply-To:
References:
Message-ID: <8df7a2c0-dd67-8476-aa6e-84af1c3148c0@kynesim.co.uk>

On 12/07/17 03:29, Stefan Ram wrote:
> Grant Edwards writes:
>> False is required to be a singleton.
>
> "singleton" usually means "the sole object of its class".
>
> "Ensure a class only has one instance, and provide a
> global point of access to it." - Gamma et al.

We are using the term differently. We are vast, we contain multitudes, etc, etc.

> >>> type( False )
>
> >>> type( True )
>
> It seems, "False" is not a singleton under the
> implementation of Python I used.
The point that was being made is that there are no other bools than True and False, and they are distinct from the objects 1 and 0. -- Rhodri James *-* Kynesim Ltd From lunkambamuk at gmail.com Wed Jul 12 08:31:10 2017 From: lunkambamuk at gmail.com (lunkambamuk at gmail.com) Date: Wed, 12 Jul 2017 05:31:10 -0700 (PDT) Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define Message-ID: class Person: def __init__(self, name, job=None, pay=0): self.name = name self.job = job self.pay = pay def lastName(self): return self.name.split()[-1] def giveRaise(self, percent): self.pay = int(self.pay * (1 + percent)) def __repr__(self): return '[Person: %s, %s]' % (self.name, self.pay) class Manager(Person): def giveraise(self, percent, bonus=.10): Person.giveRaise(self, percent + bonus) if __name__ == '__main__': #self-test code bob = Person('Bob Smith') sue = Person('Sue Jones', job='dev', pay=100000) print(bob) print(sue) print(bob.lastName(), sue.lastName()) sue.giveRaise(.10) print(sue.pay) tom = Manager('Tom Jones', 'mgr', 50000) tom.giveRaise(.10) print(tom.lastName()) print(tom) From steve+python at pearwood.info Wed Jul 12 08:50:10 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 12 Jul 2017 22:50:10 +1000 Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define References: Message-ID: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> Please COPY AND PASTE the FULL error, starting with the line "Traceback". The code you show below looks fine, and you don't need an import, so I don't know what error you are getting. 
On Wed, 12 Jul 2017 10:31 pm, lunkambamuk at gmail.com wrote: > class Person: > def __init__(self, name, job=None, pay=0): > self.name = name > self.job = job > self.pay = pay > def lastName(self): > return self.name.split()[-1] > def giveRaise(self, percent): > self.pay = int(self.pay * (1 + percent)) > def __repr__(self): > return '[Person: %s, %s]' % (self.name, self.pay) > class Manager(Person): > def giveraise(self, percent, bonus=.10): > Person.giveRaise(self, percent + bonus) > > if __name__ == '__main__': > #self-test code > bob = Person('Bob Smith') > sue = Person('Sue Jones', job='dev', pay=100000) > print(bob) > print(sue) > print(bob.lastName(), sue.lastName()) > sue.giveRaise(.10) > print(sue.pay) > tom = Manager('Tom Jones', 'mgr', 50000) > tom.giveRaise(.10) > print(tom.lastName()) > print(tom) -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From lunkambamuk at gmail.com Wed Jul 12 09:08:42 2017 From: lunkambamuk at gmail.com (WoFy The 95s) Date: Wed, 12 Jul 2017 06:08:42 -0700 (PDT) Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define In-Reply-To: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> References: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> Message-ID: <06c1a06f-25f5-46a0-92cf-37b45ea2e864@googlegroups.com> On Wednesday, 12 July 2017 18:20:32 UTC+5:30, Steve D'Aprano wrote: > Please COPY AND PASTE the FULL error, starting with the line "Traceback". > > The code you show below looks fine, and you don't need an import, so I don't > know what error you are getting. 
> > > On Wed, 12 Jul 2017 10:31 pm, lunkambamuk at gmail.com wrote: > > > class Person: > > def __init__(self, name, job=None, pay=0): > > self.name = name > > self.job = job > > self.pay = pay > > def lastName(self): > > return self.name.split()[-1] > > def giveRaise(self, percent): > > self.pay = int(self.pay * (1 + percent)) > > def __repr__(self): > > return '[Person: %s, %s]' % (self.name, self.pay) > > class Manager(Person): > > def giveraise(self, percent, bonus=.10): > > Person.giveRaise(self, percent + bonus) > > > > if __name__ == '__main__': > > #self-test code > > bob = Person('Bob Smith') > > sue = Person('Sue Jones', job='dev', pay=100000) > > print(bob) > > print(sue) > > print(bob.lastName(), sue.lastName()) > > sue.giveRaise(.10) > > print(sue.pay) > > tom = Manager('Tom Jones', 'mgr', 50000) > > tom.giveRaise(.10) > > print(tom.lastName()) > > print(tom) > > -- > Steve > ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure > enough, things got worse. 
i tried from idle interpreter from person import Manager >>> from person import Manager Traceback (most recent call last): File "", line 1, in from person import Manager ImportError: cannot import name 'Manager' and this also >>> import person >>> tom = Manager('parias lunkamaba', 'mgr', 500000) then i get this Traceback (most recent call last): File "", line 1, in tom=Manager('Tome Jones', 'mgr', 50000) NameError: name 'Manager' is not defined From lunkambamuk at gmail.com Wed Jul 12 09:10:14 2017 From: lunkambamuk at gmail.com (WoFy The 95s) Date: Wed, 12 Jul 2017 06:10:14 -0700 (PDT) Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define In-Reply-To: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> References: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> Message-ID: <95169a0c-a6a7-44fe-87e6-a1a86ab32951@googlegroups.com> On Wednesday, 12 July 2017 18:20:32 UTC+5:30, Steve D'Aprano wrote: > Please COPY AND PASTE the FULL error, starting with the line "Traceback". > > The code you show below looks fine, and you don't need an import, so I don't > know what error you are getting. 
> > > On Wed, 12 Jul 2017 10:31 pm, lunkambamuk at gmail.com wrote: > > > class Person: > > def __init__(self, name, job=None, pay=0): > > self.name = name > > self.job = job > > self.pay = pay > > def lastName(self): > > return self.name.split()[-1] > > def giveRaise(self, percent): > > self.pay = int(self.pay * (1 + percent)) > > def __repr__(self): > > return '[Person: %s, %s]' % (self.name, self.pay) > > class Manager(Person): > > def giveraise(self, percent, bonus=.10): > > Person.giveRaise(self, percent + bonus) > > > > if __name__ == '__main__': > > #self-test code > > bob = Person('Bob Smith') > > sue = Person('Sue Jones', job='dev', pay=100000) > > print(bob) > > print(sue) > > print(bob.lastName(), sue.lastName()) > > sue.giveRaise(.10) > > print(sue.pay) > > tom = Manager('Tom Jones', 'mgr', 50000) > > tom.giveRaise(.10) > > print(tom.lastName()) > > print(tom) > > -- > Steve > ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure > enough, things got worse. only in python 3.5.3 From thatebart at gmail.com Wed Jul 12 09:18:51 2017 From: thatebart at gmail.com (Bart Thate) Date: Wed, 12 Jul 2017 06:18:51 -0700 (PDT) Subject: botlib - framework to program bots Message-ID: <5e54af8b-ccf4-4fb7-a9b4-f6665264261e@googlegroups.com> BOTLIB - Framework to program bots is released in the Public Domain - https://lnkd.in/ginB49K #publicdomain #python3 #xmpp #irc #bot Framework to program bots. 
From __peter__ at web.de Wed Jul 12 09:26:28 2017 From: __peter__ at web.de (Peter Otten) Date: Wed, 12 Jul 2017 15:26:28 +0200 Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define References: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> <06c1a06f-25f5-46a0-92cf-37b45ea2e864@googlegroups.com> Message-ID: WoFy The 95s wrote: > i tried from idle interpreter > > from person import Manager > > > >>>> from person import Manager > Traceback (most recent call last): > File "", line 1, in > from person import Manager > ImportError: cannot import name 'Manager' Enter import person person.__file__ in idle's shell. There may be another file called person.py which is imported instead of the one you intended. > and this also >>>> import person > >>>> tom = Manager('parias lunkamaba', 'mgr', 500000) > > then i get this > > > Traceback (most recent call last): > File "", line 1, in > tom=Manager('Tome Jones', 'mgr', 50000) > NameError: name 'Manager' is not defined This is standard behaviour. If after import person you want to access a name in the person module you have to use the qualified name, e. g. 
tom = person.Manager('parias lunkamaba', 'mgr', 500000) From lunkambamuk at gmail.com Wed Jul 12 11:19:01 2017 From: lunkambamuk at gmail.com (WoFy The 95s) Date: Wed, 12 Jul 2017 08:19:01 -0700 (PDT) Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define In-Reply-To: References: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> <06c1a06f-25f5-46a0-92cf-37b45ea2e864@googlegroups.com> Message-ID: <5e5cdce1-3341-4cdb-b6de-3cb09954a0d4@googlegroups.com> On Wednesday, 12 July 2017 18:57:11 UTC+5:30, Peter Otten wrote: > WoFy The 95s wrote: > > > i tried from idle interpreter > > > > from person import Manager > > > > > > > >>>> from person import Manager > > Traceback (most recent call last): > > File "", line 1, in > > from person import Manager > > ImportError: cannot import name 'Manager' > > > Enter > > import person > person.__file__ > > in idle's shell. There may be another file called person.py which is > imported instead of the one you intended. > > > > and this also > >>>> import person > > > >>>> tom = Manager('parias lunkamaba', 'mgr', 500000) > > > > then i get this > > > > > > Traceback (most recent call last): > > File "", line 1, in > > tom=Manager('Tome Jones', 'mgr', 50000) > > NameError: name 'Manager' is not defined > > This is standard behaviour. If after > > import person > > you want to access a name in the person module you have to use the qualified > name, e. g. > > tom = person.Manager('parias lunkamaba', 'mgr', 500000) >>>import person >>>tom = person.Manager('Parias lunkamba', 'mgr', 500000) >>>Traceback (most recent call last): File "", line 1, in tom = person.Manager('parias lunkamba', 'mgr', 500000) AttributeError: module 'person' has no attribute 'Manager' why the module in python 3.5 doesn't recognize the Manager class? 
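
A small sketch of the two checks suggested earlier in the thread, namely finding out which file an import actually loaded and using the qualified name after a plain import. The standard library's json module is used here purely as a stand-in for person, since the poster's file is not available; with the real module the same steps would apply to person and Manager.

```python
import inspect
import json  # stand-in for the thread's "import person"

# 1. Which file did the import really pick up? If this is not the file you
#    have been editing, another module with the same name is shadowing it.
print(json.__file__)

# 2. Does the imported module define the name at all?
print(hasattr(json, 'JSONDecoder'))      # True
print(hasattr(json, 'NoSuchClass'))      # False

# 3. After a plain "import json", names need the module prefix.
decoder_cls = json.JSONDecoder

# 4. A bare name only becomes available through "from ... import ...".
from json import JSONDecoder
print(decoder_cls is JSONDecoder)        # True: same class object either way

# 5. The source of what was imported can be inspected directly.
source = inspect.getsource(json)
print(len(source) > 0)                   # True
```

If step 1 points at an unexpected path, or step 2 prints False for a class you can see in your editor, the interpreter is most likely importing a different or stale copy of the file.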
From __peter__ at web.de Wed Jul 12 11:55:54 2017 From: __peter__ at web.de (Peter Otten) Date: Wed, 12 Jul 2017 17:55:54 +0200 Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define References: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> <06c1a06f-25f5-46a0-92cf-37b45ea2e864@googlegroups.com> <5e5cdce1-3341-4cdb-b6de-3cb09954a0d4@googlegroups.com> Message-ID: WoFy The 95s wrote: > On Wednesday, 12 July 2017 18:57:11 UTC+5:30, Peter Otten wrote: >> WoFy The 95s wrote: >> >> > i tried from idle interpreter >> > >> > from person import Manager >> > >> > >> > >> >>>> from person import Manager >> > Traceback (most recent call last): >> > File "", line 1, in >> > from person import Manager >> > ImportError: cannot import name 'Manager' >> >> >> Enter >> >> import person >> person.__file__ >> >> in idle's shell. There may be another file called person.py which is >> imported instead of the one you intended. >> >> >> > and this also >> >>>> import person >> > >> >>>> tom = Manager('parias lunkamaba', 'mgr', 500000) >> > >> > then i get this >> > >> > >> > Traceback (most recent call last): >> > File "", line 1, in >> > tom=Manager('Tome Jones', 'mgr', 50000) >> > NameError: name 'Manager' is not defined >> >> This is standard behaviour. If after >> >> import person >> >> you want to access a name in the person module you have to use the >> qualified name, e. g. >> >> tom = person.Manager('parias lunkamaba', 'mgr', 500000) > > > >>>>import person >>>>tom = person.Manager('Parias lunkamba', 'mgr', 500000) >>>>Traceback (most recent call last): > File "", line 1, in > tom = person.Manager('parias lunkamba', 'mgr', 500000) > AttributeError: module 'person' has no attribute 'Manager' > > why the module in python 3.5 doesn't recognize the Manager class? Again, you probably have two files called person.py, and the one you import does not contain a Manager class. 
(Different directories in sys.path can explain why different person.py files will be imported in differing python versions) As I said, you can find out the file's location with >>> person.__file__ or have a look at the module content with >>> import inspect >>> import person >>> print(inspect.getsource(person)) From lunkambamuk at gmail.com Wed Jul 12 12:13:54 2017 From: lunkambamuk at gmail.com (WoFy The 95s) Date: Wed, 12 Jul 2017 09:13:54 -0700 (PDT) Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define In-Reply-To: References: Message-ID: <0d11b224-b005-4f7b-8999-34035dde5d14@googlegroups.com> On Wednesday, 12 July 2017 18:01:35 UTC+5:30, WoFy The 95s wrote: > class Person: > def __init__(self, name, job=None, pay=0): > self.name = name > self.job = job > self.pay = pay > def lastName(self): > return self.name.split()[-1] > def giveRaise(self, percent): > self.pay = int(self.pay * (1 + percent)) > def __repr__(self): > return '[Person: %s, %s]' % (self.name, self.pay) > class Manager(Person): > def giveraise(self, percent, bonus=.10): > Person.giveRaise(self, percent + bonus) > > if __name__ == '__main__': > #self-test code > bob = Person('Bob Smith') > sue = Person('Sue Jones', job='dev', pay=100000) > print(bob) > print(sue) > print(bob.lastName(), sue.lastName()) > sue.giveRaise(.10) > print(sue.pay) > tom = Manager('Tom Jones', 'mgr', 50000) > tom.giveRaise(.10) > print(tom.lastName()) > print(tom) i removed some space and it worked thanks a lot From tjreedy at udel.edu Wed Jul 12 16:35:10 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 12 Jul 2017 16:35:10 -0400 Subject: Test 0 and false since false is 0 In-Reply-To: <8df7a2c0-dd67-8476-aa6e-84af1c3148c0@kynesim.co.uk> References: <8df7a2c0-dd67-8476-aa6e-84af1c3148c0@kynesim.co.uk> Message-ID: On 7/12/2017 7:35 AM, Rhodri James wrote: > On 12/07/17 03:29, Stefan Ram wrote: >> Grant Edwards writes: >>> False is required to be a 
singleton. >> >> ?singleton? usually means ?the sole object of its class?. >> >> ?Ensure a class only has one instance, and provide a >> global point of access to it.? - Gamma et al. >>>>> type( False ) >> >> >>>>> type( True ) >> >> >> It seems, ?False? is not a singleton under the >> implementation of Python I used. > > The point that was being made is that there are no other bools than True > and False, and they are distinct from the objects 1 and 0. By analogy with 'singleton', True and False constitute a 'doubleton' in the sense of being the sole 2 objects of class Bool. -- Terry Jan Reedy From thebalancepro at gmail.com Wed Jul 12 18:49:52 2017 From: thebalancepro at gmail.com (Nick Mellor) Date: Wed, 12 Jul 2017 15:49:52 -0700 (PDT) Subject: Better Regex and exception handling for this small code In-Reply-To: References: Message-ID: On Wednesday, 12 July 2017 02:32:29 UTC+10, Ganesh Pal wrote: > Dear Python friends > > I am trying to open a file and check if there is a pattern has changed > after the task got completed? > > file data: > ........................................................ > > #tail -f /file.txt > .......................................... > Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, > journal_crc = d2097b00 > Note: Task completed successfully. > Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, > journal_crc = a176af10 > > > I have the below piece of code but would like to make this better more > pythonic , I found regex pattern and exception handling poor here , any > quick suggestion in your spare time is welcome. 
> > > #open the existing file if the flag is set and check if there is a match > > log_file='/file.txt' > flag_is_on=1 > > data = None > with open(log_file, 'r') as f: > data = f.readlines() > > > if flag_is_on: > logdata = '\n'.join(data) > reg = "initiator_crc =(?P[\s\S]*?), journal_crc" > crc = re.findall(re.compile(reg), logdata) > if not crc: > raise Exception("Pattern not found in logfile") > > checksumbefore = crc[0].strip() > checksumafter = crc[1].strip() > logging.info("checksumbefore :%s and checksumafter:%s" > % (checksumbefore, checksumafter)) > > if checksumbefore == checksumafter: > raise Exception("checksum not macthing") > > I am on Linux and Python 2.7 > > Regards, > Ganesh There's not much need to compile regexes unless you've got *a lot* of them in your code. The first ones are automatically compiled and cached: https://stackoverflow.com/questions/452104/is-it-worth-using-pythons-re-compile Cheers, Nick From python at mrabarnett.plus.com Wed Jul 12 20:49:31 2017 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 13 Jul 2017 01:49:31 +0100 Subject: Better Regex and exception handling for this small code In-Reply-To: References: Message-ID: <9987ca07-1eca-6f46-ce4e-86a5fdf997be@mrabarnett.plus.com> On 2017-07-12 23:49, Nick Mellor wrote: > On Wednesday, 12 July 2017 02:32:29 UTC+10, Ganesh Pal wrote: >> Dear Python friends >> >> I am trying to open a file and check if there is a pattern has changed >> after the task got completed? >> >> file data: >> ........................................................ >> >> #tail -f /file.txt >> .......................................... >> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, >> journal_crc = d2097b00 >> Note: Task completed successfully. 
>> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, >> journal_crc = a176af10 >> >> >> I have the below piece of code but would like to make this better more >> pythonic , I found regex pattern and exception handling poor here , any >> quick suggestion in your spare time is welcome. >> >> >> #open the existing file if the flag is set and check if there is a match >> >> log_file='/file.txt' >> flag_is_on=1 >> >> data = None >> with open(log_file, 'r') as f: >> data = f.readlines() >> >> >> if flag_is_on: >> logdata = '\n'.join(data) >> reg = "initiator_crc =(?P[\s\S]*?), journal_crc" >> crc = re.findall(re.compile(reg), logdata) >> if not crc: >> raise Exception("Pattern not found in logfile") >> >> checksumbefore = crc[0].strip() >> checksumafter = crc[1].strip() >> logging.info("checksumbefore :%s and checksumafter:%s" >> % (checksumbefore, checksumafter)) >> >> if checksumbefore == checksumafter: >> raise Exception("checksum not macthing") >> >> I am on Linux and Python 2.7 >> >> Regards, >> Ganesh > > There's not much need to compile regexes unless you've got *a lot* of them in your code. The first ones are automatically compiled and cached: > > https://stackoverflow.com/questions/452104/is-it-worth-using-pythons-re-compile > I think this is the first time that I've seen someone pass a compiled pattern into re.findall. 
The usual way is to pass the pattern as a string:

    crc = re.findall(reg, logdata)

If you have a lot of them, or it's in a loop that'll iterate many times,
it'll be quicker if you compile it first (outside the loop):

    pattern = re.compile(reg)

and then use the compiled pattern's .findall method:

    crc = pattern.findall(logdata)

From Dahui.Jiang at veritas.com Wed Jul 12 23:12:23 2017
From: Dahui.Jiang at veritas.com (Dahui Jiang)
Date: Thu, 13 Jul 2017 03:12:23 +0000
Subject: No pip3.5 bin after install python3.5.1 from source code
Message-ID: <029fbe66350c40c6bd7a9f5290e838ae@vrtsxchclupin16.community.veritas.com>

Hi all:

I installed python3.5.1 from source code, but noticed some strange behaviour.

On RHEL6, if I run "yum -y install openssl openssl-devel" before "make install"
(even though the two libraries are already installed), the pip binary gets
installed; otherwise it does not.

On SLES11 SP3, even though I run "rpm -i libopenssl-devel-0.9.8j-2.1.x86_64.rpm"
before "make install", the pip binary still isn't installed. But after
installing python3, I downloaded the source code of pip 7.1.2 and installed pip
from source, and that worked.

Why? Has anyone met the same problem?

Thanks
Dahui

From no.email at nospam.invalid Thu Jul 13 04:15:17 2017
From: no.email at nospam.invalid (Paul Rubin)
Date: Thu, 13 Jul 2017 01:15:17 -0700
Subject: Write this accumuator in a functional style
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com>
Message-ID: <87zic8pxbe.fsf@nightsong.com>

Chris Angelico writes:
> some point it'll need to be rebalanced, which could at worst case
> be O(n).

No, you use a structure like an AVL tree or red-black tree, so it's
within a constant factor of balanced after each insertion. You rewrite
O(log n) of the nodes, and juggle around a constant number of them at
the top of the tree. The Wikipedia articles about those data structures
are pretty good. C++ std::map is also implemented that way, I think.
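To make the "rewrite O(log n) of the nodes" point concrete, here is a minimal sketch of a mutation-free AVL insertion in Python, assuming integer keys. The names Node, make, balance, insert and in_order are invented for this illustration and do not come from any library. Each insert returns a new root; nodes off the search path are shared with the previous version of the tree, which stays valid.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """Immutable tree node; height is cached so balancing stays cheap."""
    key: int
    left: Optional["Node"]
    right: Optional["Node"]
    height: int


def height(node):
    return node.height if node is not None else 0


def make(key, left, right):
    # Build a fresh node; existing nodes are never modified.
    return Node(key, left, right, 1 + max(height(left), height(right)))


def balance(key, left, right):
    """Return a balanced node for (key, left, right), rotating if needed."""
    if height(left) > height(right) + 1:
        if height(left.left) < height(left.right):
            # Left-right case: first rotate the left child to the left.
            lr = left.right
            left = make(lr.key, make(left.key, left.left, lr.left), lr.right)
        # Rotate right.
        return make(left.key, left.left, make(key, left.right, right))
    if height(right) > height(left) + 1:
        if height(right.right) < height(right.left):
            # Right-left case: first rotate the right child to the right.
            rl = right.left
            right = make(rl.key, rl.left, make(right.key, rl.right, right.right))
        # Rotate left.
        return make(right.key, make(key, left, right.left), right.right)
    return make(key, left, right)


def insert(node, key):
    """Return a new tree containing key; only the search path is rebuilt."""
    if node is None:
        return make(key, None, None)
    if key < node.key:
        return balance(node.key, insert(node.left, key), node.right)
    if key > node.key:
        return balance(node.key, node.left, insert(node.right, key))
    return node  # key already present: reuse the whole subtree


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)
```

Because old roots are never touched, keeping a reference to an earlier version costs nothing beyond the nodes that were on the rebuilt path.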
From rosuav at gmail.com Thu Jul 13 04:46:51 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 13 Jul 2017 18:46:51 +1000 Subject: Write this accumuator in a functional style In-Reply-To: <87zic8pxbe.fsf@nightsong.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> Message-ID: On Thu, Jul 13, 2017 at 6:15 PM, Paul Rubin wrote: > Chris Angelico writes: >> some point it'll need to be rebalanced, which could at worst case >> be O(n). > > No, you use a structure like an AVL tree or red-black tree, so it's > within a constant factor of balanced after each insertion. You rewrite > O(log n) of the nodes, and juggle around a constant number of them at > the top of the tree. The Wikipedia articles about those data structures > are pretty good. C++ std::map is also implemented that way, I think. Sure, that deals with the algorithmic complexity, but that would entail a lot more rebalancing work, and if everything's immutable, that means reconstructing the tree more often, right? Maybe I'm completely on the wrong track here, but the last time I implemented a self-balancing tree, it usually involved a fair amount of mutation. ChrisA From no.email at nospam.invalid Thu Jul 13 05:48:22 2017 From: no.email at nospam.invalid (Paul Rubin) Date: Thu, 13 Jul 2017 02:48:22 -0700 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> Message-ID: <87pod4fz15.fsf@nightsong.com> Chris Angelico writes: > Maybe I'm completely on the wrong track here, but the last time I > implemented a self-balancing tree, it usually involved a fair amount > of mutation. AVL trees are fairly simple to implement without mutation. Red-black trees are traditionally implemented with mutation, inserting by making nodes mis-colored, then going and re-coloring them. But they can be done mutation-free as well. 
Here's an amazing Haskell implementation where the tree invariants are encoded in the datatype: https://gist.github.com/rampion/2659812 Reddit discussion of above: https://redd.it/ti5il More recent versions of GHC make the type signatures even nicer, since you can put numbers directly into types without that nested type encoding. From marko at pacujo.net Thu Jul 13 06:25:58 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 13 Jul 2017 13:25:58 +0300 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> Message-ID: <8760ewws3t.fsf@elektro.pacujo.net> Paul Rubin : > Chris Angelico writes: >> Maybe I'm completely on the wrong track here, but the last time I >> implemented a self-balancing tree, it usually involved a fair amount >> of mutation. > > AVL trees are fairly simple to implement without mutation. Red-black > trees are traditionally implemented with mutation, inserting by making > nodes mis-colored, then going and re-coloring them. But they can be > done mutation-free as well. Simple, yes, but is the worst case insertion/deletion time still within O(log n)? Marko From rustompmody at gmail.com Thu Jul 13 08:10:45 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Thu, 13 Jul 2017 05:10:45 -0700 (PDT) Subject: Write this accumuator in a functional style In-Reply-To: <8760ewws3t.fsf@elektro.pacujo.net> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> Message-ID: <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> Marko wrote: > Simple, yes, but is the worst case > insertion/deletion time still within > O(log n)? Good point; and needs to be applied to Steven's append-using OP as well Yeah I know append method is supposedly O(1). I find that surprising... 
More so when the article https://wiki.python.org/moin/TimeComplexity talks of
average case Vs amortized-worst case(!) Whatever does that mean?

From steve+python at pearwood.info Thu Jul 13 09:06:36 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Thu, 13 Jul 2017 23:06:36 +1000
Subject: Write this accumuator in a functional style
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com>
Message-ID: <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com>

On Thu, 13 Jul 2017 10:10 pm, Rustom Mody wrote:

> Yeah I know append method is supposedly O(1).
> I find that surprising...
> More so when the article
> https://wiki.python.org/moin/TimeComplexity
> talks of average case Vs amortized-worst case(!) Whatever does that mean?

"Average case" refers to the average over the possible data values you are
dealing with. For example, if you ask for the average cost of inserting into a
list:

    mylist.insert(i, obj)

you average over i=0, i=1, i=2, ..., i=len(mylist).

"Worst case" refers to the most expensive case, which in the case of inserting
into a list, will be mylist.insert(0, obj).

"Amortized" refers to averaging over many repeated operations. For example,
start with an empty list, and keep inserting over and over again:

    mylist = []
    for i in range(10000000):  # ideally, N -> ∞
        mylist.insert(0, obj)

That's equivalent to averaging over:

    [].insert(0, obj)
    [1].insert(0, obj)
    [1, 1].insert(0, obj)
    [1, 1, 1].insert(0, obj)

etc. The average cost of each of those insertions is the amortized cost.

Amortized costs are relevant where the cost of an operation is usually cheap,
but occasionally you hit a special case which makes it expensive.
In the case of list.append, lists are over-allocated:

    mylist = [1, 2, 3]
    # allocates (let's say) 50 slots, but only 3 are in use

Appending to an over-allocated list takes constant time, because that's just
writing to a slot in an array. So we can append to mylist 47 times before the
array is full, then we get an expensive resize:

    mylist 50 slots -> 100 slots

That's very costly, proportional to the size of the list, but it doesn't happen
every append. After the resize, we can do another 50 appends before needing to
do it again.

CPython's allocation strategy for lists doubles[1] the list on each resize, so
the worst case is that your list is 50% (plus one item) full, and you're using
twice as much memory as needed. But the advantage is that resizes happen less
and less frequently: after each resize, it takes twice as many appends before
you need to do the next.

If you do the calculations, each resize is twice as expensive as the last, but
happens half as frequently, so the cost of those resizes is amortized out over
all the appends to be a constant (and relatively small) cost.

[1] Actually, CPython's lists initially quadruple the size of the array, up to a
certain point, and then switch to doubling. This ensures that small lists have
even fewer expensive resizes, at the cost of wasting a bit more memory, but it's
only a small array so who cares?

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
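The over-allocation described above can be observed from Python. This is a rough sketch that assumes CPython, where sys.getsizeof() on a list includes the spare slots; the helper name count_resizes is invented for the illustration. It only counts how often the allocated size changes and makes no claim about the exact growth factor, which is an implementation detail.

```python
import sys


def count_resizes(n):
    """Append n items to an empty list and count how often its size changes."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying array was reallocated with room to spare.
            resizes += 1
            last_size = size
    return resizes


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, "appends ->", count_resizes(n), "resizes")
```

The number of resizes grows far more slowly than the number of appends, which is what makes the amortized cost per append constant.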
From rhodri at kynesim.co.uk Thu Jul 13 09:23:33 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 13 Jul 2017 14:23:33 +0100 Subject: python 3.5 raiaing an error when import the class Manager in this module sayning name Manager is not define In-Reply-To: <5e5cdce1-3341-4cdb-b6de-3cb09954a0d4@googlegroups.com> References: <59661b04$0$1597$c3e8da3$5496439d@news.astraweb.com> <06c1a06f-25f5-46a0-92cf-37b45ea2e864@googlegroups.com> <5e5cdce1-3341-4cdb-b6de-3cb09954a0d4@googlegroups.com> Message-ID: On 12/07/17 16:19, WoFy The 95s wrote: > On Wednesday, 12 July 2017 18:57:11 UTC+5:30, Peter Otten wrote: >> WoFy The 95s wrote: >> >>> i tried from idle interpreter >>> >>> from person import Manager >>> >>> >>> >>>>>> from person import Manager >>> Traceback (most recent call last): >>> File "", line 1, in >>> from person import Manager >>> ImportError: cannot import name 'Manager' >> >> >> Enter >> >> import person >> person.__file__ >> >> in idle's shell. There may be another file called person.py which is >> imported instead of the one you intended. [snip] > >>>> import person >>>> tom = person.Manager('Parias lunkamba', 'mgr', 500000) >>>> Traceback (most recent call last): > File "", line 1, in > tom = person.Manager('parias lunkamba', 'mgr', 500000) > AttributeError: module 'person' has no attribute 'Manager' > > why the module in python 3.5 doesn't recognize the Manager class? It does for me. Have you tried Peter's suggestion of typing >>> import person >>> person.__file__ into IDLE? It is very likely that you are not picking up the "person.py" that you think you are. 
-- Rhodri James *-* Kynesim Ltd From pavol.lisy at gmail.com Thu Jul 13 10:59:21 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Thu, 13 Jul 2017 16:59:21 +0200 Subject: Write this accumuator in a functional style In-Reply-To: <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 7/13/17, Steve D'Aprano wrote: > [1] Actually, CPython's lists initially quadruple the size of the array, up > to a > certain point, and then switch to doubling. This ensures that small lists > have > even fewer expensive resizes, at the cost of wasting a bit more memory, but > its > only a small array so who cares? IMHO problem is doubling size for huge lists. Or waste big memory for huge frozensets. I mean resize it to 2*N if its size is just N+1. From rustompmody at gmail.com Thu Jul 13 11:09:42 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Thu, 13 Jul 2017 08:09:42 -0700 (PDT) Subject: Write this accumuator in a functional style In-Reply-To: References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> Message-ID: <3ce1cef0-7bd4-433d-9835-bbfb6df691f8@googlegroups.com> Pavol Lisy wrote: > IMHO problem is doubling size for huge lists. > Or waste big memory for huge frozensets. I mean resize it to 2*N if > its size is just N+1. Couple that with the fact that space-time are not unrelated on any modern VM based OS + cache based hw. Doubly so for "managed" languages where gc buys space for time. 
From sanky8793 at gmail.com Thu Jul 13 11:42:04 2017 From: sanky8793 at gmail.com (sanky8793 at gmail.com) Date: Thu, 13 Jul 2017 08:42:04 -0700 (PDT) Subject: Process in not get killed using subprocess.call() in python thread Message-ID: I have created one thread in python, and that thread is running in infinite loop, but when I was trying to kill a process by making use of subprocess.call("my ps command") Its not actually working Here is the code, import threading import subprocess def B(): while True: cmd="ps -ef | grep 'shell.py --server' | awk '{print $2}' | xargs kill -9" subprocess.call(cmd, shell=True) def A(): th = threading.Thread(target=B) th.start() In above example, subprocess.call() getting executed but not actually killing the process that I want. If I executed command manually then its working fine, but in thread its not. From rhodri at kynesim.co.uk Thu Jul 13 11:59:41 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 13 Jul 2017 16:59:41 +0100 Subject: Write this accumuator in a functional style In-Reply-To: <3ce1cef0-7bd4-433d-9835-bbfb6df691f8@googlegroups.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> <3ce1cef0-7bd4-433d-9835-bbfb6df691f8@googlegroups.com> Message-ID: <71e4d45b-feff-8521-2510-1995c3029428@kynesim.co.uk> On 13/07/17 16:09, Rustom Mody wrote: > Pavol Lisy wrote: >> IMHO problem is doubling size for huge lists. > >> Or waste big memory for huge frozensets. I mean resize it to 2*N if >> its size is just N+1. > > Couple that with the fact that space-time are not unrelated on any modern VM based OS + cache based hw. Doubly so for "managed" languages where gc buys space for time. > You might want to do some benchmarks to sound out that idea. 
I believe conventional wisdom is that the time cost of allocating more memory
and extending the list outweighs the space cost of wasted memory.

--
Rhodri James *-* Kynesim Ltd

From rhodri at kynesim.co.uk Thu Jul 13 12:09:29 2017
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Thu, 13 Jul 2017 17:09:29 +0100
Subject: Process in not get killed using subprocess.call() in python thread
In-Reply-To:
References:
Message-ID:

On 13/07/17 16:42, sanky8793 at gmail.com wrote:
>
> I have created one thread in python, and that thread is running in infinite
> loop, but when I was trying to kill a process by making use of
> subprocess.call("my ps command") Its not actually working
>
> Here is the code,
>
> import threading
> import subprocess
>
> def B():
>     while True:
>         cmd="ps -ef | grep 'shell.py --server' | awk '{print $2}' | xargs kill -9"
>         subprocess.call(cmd, shell=True)
>
>
> def A():
>     th = threading.Thread(target=B)
>     th.start()
>
>
> In above example, subprocess.call() getting executed but not actually killing
> the process that I want. If I executed command manually then its working
> fine, but in thread its not.

Have you tried checking the return code from subprocess.call(), or sending the
subprocess's stdout and stderr to file? It's probably best if you don't run the
thread in an infinite loop when you do that. Why are you doing that anyway?
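One way to follow that suggestion is sketched below: run the command once, outside any loop or thread, and look at the exit status and whatever it wrote to stdout and stderr. The helper name run_and_report is invented here, and the "exit 3" command is only a harmless stand-in for the real pipeline.

```python
import subprocess


def run_and_report(cmd):
    """Run a shell command once and return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()  # waits for the command to finish
    return proc.returncode, out, err


if __name__ == "__main__":
    # Stand-in command; substitute the real pipeline when debugging.
    code, out, err = run_and_report("exit 3")
    print("return code:", code)
    print("stdout:", out)
    print("stderr:", err)
```

A non-zero return code or a message on stderr usually says more about why the pipeline did nothing than the bare subprocess.call() does.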
--
Rhodri James *-* Kynesim Ltd

From steve+python at pearwood.info Thu Jul 13 12:26:36 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 14 Jul 2017 02:26:36 +1000
Subject: Write this accumuator in a functional style
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> <3ce1cef0-7bd4-433d-9835-bbfb6df691f8@googlegroups.com>
Message-ID: <59679f3e$0$1593$c3e8da3$5496439d@news.astraweb.com>

On Fri, 14 Jul 2017 01:09 am, Rustom Mody wrote:

> Couple that with the fact that space-time are not unrelated on any modern VM
> based OS + cache based hw. Doubly so for "managed" languages where gc buys
> space for time.

I don't understand that comment. Space/time have *never* been unrelated.

Trading off space for time (you can save memory by doing extra work, which takes
time, or save time by using more memory) is an old, old trick. It applies to
1950s mainframes just as much as 2010s smart phones, and everything in between.
It isn't even specific to computers: what do you think a book index is, except
a way to spend extra space (more pages) to save time when looking up a topic?

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
From steve+python at pearwood.info Thu Jul 13 12:31:53 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 14 Jul 2017 02:31:53 +1000
Subject: Write this accumuator in a functional style
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <5967a07b$0$1609$c3e8da3$5496439d@news.astraweb.com>

On Fri, 14 Jul 2017 12:59 am, Pavol Lisy wrote:

> On 7/13/17, Steve D'Aprano wrote:
>
>> [1] Actually, CPython's lists initially quadruple the size of the array, up
>> to a certain point, and then switch to doubling. This ensures that small
>> lists have even fewer expensive resizes, at the cost of wasting a bit more
>> memory, but its only a small array so who cares?
>
> IMHO problem is doubling size for huge lists.

If your list is so huge that a factor of two makes a difference, then you should
be using a different data structure. Something that doesn't live in memory all
at once. And by huge, I mean hundreds of millions or billions of items.

But under normal circumstances, a factor of two is nothing.

(Unless you're working on an embedded device, where memory is really
constrained. But then you shouldn't be using CPython. Try MicroPython.)

It's not 1970 where 64K is more memory than anyone can imagine using and we have
to count every byte of memory. It's 2017 and memory is cheap, we can afford to
use some memory to speed up our algorithms.

> Or waste big memory for huge frozensets. I mean resize it to 2*N if
> its size is just N+1.

Frozensets aren't mutable, so they never resize.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
From pavol.lisy at gmail.com Thu Jul 13 13:16:34 2017 From: pavol.lisy at gmail.com (Pavol Lisy) Date: Thu, 13 Jul 2017 19:16:34 +0200 Subject: Write this accumuator in a functional style In-Reply-To: <5967a07b$0$1609$c3e8da3$5496439d@news.astraweb.com> References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> <5967a07b$0$1609$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 7/13/17, Steve D'Aprano wrote: > On Fri, 14 Jul 2017 12:59 am, Pavol Lisy wrote: > >> On 7/13/17, Steve D'Aprano wrote: >> >>> [1] Actually, CPython's lists initially quadruple the size of the array, >>> up >>> to a >>> certain point, and then switch to doubling. This ensures that small lists >>> have >>> even fewer expensive resizes, at the cost of wasting a bit more memory, >>> but >>> its >>> only a small array so who cares? >> >> IMHO problem is doubling size for huge lists. > > If your list is so huge that a factor of two makes a difference, then you > should > be using a different data structure. Something that doesn't live in memory > all > at once. And by huge, I mean hundreds of millions or billions of items. > > But under normal circumstances, a factor of two is nothing. > > (Unless you're working on an embedded device, where memory is really > constrained. But then you shouldn't be using CPython. Try MicroPython.) > > It's not 1970 where 64K is more memory than anyone can imagine using and we > have > to count every byte of memory. Its 2017 and memory is cheap, we can afford > to > use some memory to speed up our algorithms. 
Maybe I don't remember it well, but there was a thread on python-list (or
python-ideas) where somebody had a memory problem after updating to python3.5,
which was followed by this issue -> https://bugs.python.org/issue29949

He had enough memory (maybe 64GB or 128GB? my brain memory has limits too) but
his models were very slow after upgrading Python (due to swapping).

>> Or waste big memory for huge frozensets. I mean resize it to 2*N if
>> its size is just N+1.
>
> Frozensets aren't mutable, so they never resize.

Yes, so it seems it would be easy to allocate precisely the right amount of
memory. But they are created using the same quadrupling/doubling machinery, as
you can see:

    from sys import getsizeof
    print( [getsizeof(frozenset(range(n))) for n in range(20)] )

    [224, 224, 224, 224, 224, 736, 736, 736, 736, 736, 736, 736, 736, 736,
    736, 736, 736, 736, 736, 736]

I was thinking that this could be a good starting point for trying to make a
patch. Is there any tutorial for a complete beginner on how to get started with
CPython patches? :)

From rosuav at gmail.com Thu Jul 13 15:23:54 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 14 Jul 2017 05:23:54 +1000
Subject: Write this accumuator in a functional style
In-Reply-To: <59679f3e$0$1593$c3e8da3$5496439d@news.astraweb.com>
References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> <3ce1cef0-7bd4-433d-9835-bbfb6df691f8@googlegroups.com> <59679f3e$0$1593$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Fri, Jul 14, 2017 at 2:26 AM, Steve D'Aprano wrote:
> On Fri, 14 Jul 2017 01:09 am, Rustom Mody wrote:
>
>> Couple that with the fact that space-time are not unrelated on any modern VM
>> based OS + cache based hw. Doubly so for "managed" languages where gc buys
>> space for time.
> > I don't understand that comment. Space/time have *never* been unrelated. > > Trading off space for time (you can save memory by doing extra work, which takes > time, or save time by using more memory) is an old, old trick. It applies to > 1950s mainframes just as much as 2010s smart phones, and everything in between. > It isn't even specific to computers: what do you think a book index is, except > a way to spend extra space (more pages) to save time when looking up a topic? I think he meant the opposite correlation. Of course we know how to use space to save time, and vice versa - anyone who's tried to organize a RL bookshelf knows that - but thanks to CPU cache lines and such, the inverse correlation is very true too. (Not that it was ever completely false.) That's why, for instance, an ASCII-only string can be processed faster in Python 3.3 than in 3.2 (and yes, I know that citing this example is going to bring a certain someone out of lurking, but I don't care - it's a great example anyway). Comparing two strings for equality, assuming that they're distinct string objects, requires scanning them for a difference. If they take up half as much space, you can do the scan in less time. The upshot is that wasting space MAY slow your program down, or conversely that a more compact data structure MAY improve performance. We're getting some similar research now with the new compact dict representation. However, this is referring to *compactness*, which is not the same thing as the size of the object. Wasted space at the end of an array of pointers (as in the CPython list implementation's spare capacity) is unlikely to make a difference. But it's a question of performance, so don't trust your gut feeling (or mine) - measure it. There are a million and one considerations, including the use of free lists, the likelihood of expansion, etc, etc, etc. 
ChrisA From ned at nedbatchelder.com Thu Jul 13 19:06:06 2017 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 13 Jul 2017 16:06:06 -0700 (PDT) Subject: Write this accumuator in a functional style In-Reply-To: References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Thursday, July 13, 2017 at 10:59:52 AM UTC-4, Pavol Lisy wrote: > On 7/13/17, Steve D'Aprano wrote: > > > [1] Actually, CPython's lists initially quadruple the size of the array, up > > to a > > certain point, and then switch to doubling. This ensures that small lists > > have > > even fewer expensive resizes, at the cost of wasting a bit more memory, but > > its > > only a small array so who cares? > > IMHO problem is doubling size for huge lists. > > Or waste big memory for huge frozensets. I mean resize it to 2*N if > its size is just N+1. Steve's summary is qualitatively right, but a little off on the quantitative details. Lists don't resize to 2*N, they resize to ~1.125*N: new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6); (https://github.com/python/cpython/blob/master/Objects/listobject.c#L49-L58) --Ned. 
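The C expression quoted above can be transcribed into Python to see which capacities it produces. This is only a sketch of that one line as it appears in the message; the function names are invented here, and the linked listobject.c may have changed since, so the numbers describe the quoted formula rather than any particular CPython release.

```python
def new_allocated(newsize):
    """Python transcription of the growth expression quoted from listobject.c."""
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


def growth_steps(n):
    """Capacities chosen while appending n items one at a time to an empty list."""
    allocated = 0
    steps = []
    for size in range(1, n + 1):
        if size > allocated:
            # The array is full: pick a new capacity with some headroom.
            allocated = new_allocated(size)
            steps.append(allocated)
    return steps


if __name__ == "__main__":
    print(growth_steps(100))
```

For large sizes the headroom is about one eighth of the new size, which is where the ~1.125*N figure comes from.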
From steve+python at pearwood.info Thu Jul 13 20:35:41 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 14 Jul 2017 10:35:41 +1000
Subject: Grapheme clusters, a.k.a.real characters
Message-ID: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com>

From time to time, people discover that Python's string algorithms work on code
points rather than "real characters", which can lead to anomalies like the
following:

    s = 'xäex'
    s = unicodedata.normalize('NFD', s)
    print(s)
    print(s[::-1])

which results in:

    xäex
    xëax

If you're interested in this issue, there's an issue on the bug tracker about
it, which is seeing some activity.

http://bugs.python.org/issue30717

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From ben+python at benfinney.id.au Thu Jul 13 22:18:20 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Fri, 14 Jul 2017 12:18:20 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <85bmonahhv.fsf@benfinney.id.au>

Steve D'Aprano writes:

> From time to time, people discover that Python's string algorithms work on code
> points rather than "real characters", which can lead to anomalies like the
> following:
>
> s = 'xäex'
> s = unicodedata.normalize('NFD', s)
> print(s)
> print(s[::-1])
>
>
> which results in:
>
> xäex
> xëax
> If you're interested in this issue

Note that it depends on the difference between two apparently identical
strings::

    >>> s1 = 'xäex'
    >>> s2 = unicodedata.normalize('NFD', s1)
    >>> s1, s2
    ('xäex', 'xäex')

The strings are different, and the items you get when iterating them are
different::

    >>> len(s1), len(s2)
    (4, 5)
    >>> [unicodedata.name(c) for c in s1]
    ['LATIN SMALL LETTER X',
     'LATIN SMALL LETTER A WITH DIAERESIS',
     'LATIN SMALL LETTER E',
     'LATIN SMALL LETTER X']
    >>> [unicodedata.name(c) for c in s2]
    ['LATIN SMALL LETTER X',
     'LATIN SMALL LETTER A',
     'COMBINING DIAERESIS',
     'LATIN SMALL LETTER E',
     'LATIN SMALL LETTER X']

which explains why they're different when reversed::

    >>> [unicodedata.name(c) for c in reversed(s1)]
    ['LATIN SMALL LETTER X',
     'LATIN SMALL LETTER E',
     'LATIN SMALL LETTER A WITH DIAERESIS',
     'LATIN SMALL LETTER X']
    >>> "".join(reversed(s1))
    'xeäx'
    >>> [unicodedata.name(c) for c in reversed(s2)]
    ['LATIN SMALL LETTER X',
     'LATIN SMALL LETTER E',
     'COMBINING DIAERESIS',
     'LATIN SMALL LETTER A',
     'LATIN SMALL LETTER X']
    >>> "".join(reversed(s2))
    'xëax'

--
 \      “I know that we can never get rid of religion …. But that |
  `\    doesn’t mean I shouldn’t hate the lie of faith consistently and |
_o__)   without apology.” -- Paul Z.
Myers, 2011-12-28 |
Ben Finney

From marko at pacujo.net Fri Jul 14 02:30:30 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Fri, 14 Jul 2017 09:30:30 +0300
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au>
Message-ID: <87a847jzsp.fsf@elektro.pacujo.net>

Ben Finney :

> Steve D'Aprano writes:
>> From time to time, people discover that Python's string algorithms
>> work on code points rather than "real characters", which can lead to
>> anomalies
>
> [...]
> >>> [unicodedata.name(c) for c in reversed(s1)]
> ['LATIN SMALL LETTER X',
>  'LATIN SMALL LETTER E',
>  'LATIN SMALL LETTER A WITH DIAERESIS',
>  'LATIN SMALL LETTER X']
> >>> "".join(reversed(s1))
> 'xeäx'
> >>> [unicodedata.name(c) for c in reversed(s2)]
> ['LATIN SMALL LETTER X',
>  'LATIN SMALL LETTER E',
>  'COMBINING DIAERESIS',
>  'LATIN SMALL LETTER A',
>  'LATIN SMALL LETTER X']
> >>> "".join(reversed(s2))
> 'xëax'

Unicode was supposed to get us out of the 8-bit locale hole. Now it
seems the Unicode hole is far deeper and we haven't reached the bottom
of it yet. I wonder if the hole even has a bottom.

We now have:

 - an encoding: a sequence of bytes

 - a string: a sequence of integers (code points)

 - "a snippet of text": a sequence of characters

Assuming "a sequence of characters" is the final word, and Python wants
to be involved in that business, one must question the usefulness of
strings, which are neither here nor there.

When people use Unicode, they are expecting to be able to deal in real
characters. I would expect:

   len(text)               to give me the length in characters
   text[-1]                to evaluate to the last character
   re.match("a.c", text)   to match a character between a and c

So the question is, should we have a third type for text. Or should the
semantics of strings be changed to be based on characters?
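As a rough illustration of what "length in characters" could mean for the NFD example quoted above, the sketch below glues each combining mark onto the preceding base character using unicodedata.combining(). It is a simplification and not full grapheme cluster segmentation as defined by Unicode (it ignores zero-width joiners, Hangul jamo, emoji sequences and so on); the function name simple_clusters is invented for this example.

```python
import unicodedata


def simple_clusters(text):
    """Split text into base characters with their combining marks attached.

    Simplified: only code points with a non-zero canonical combining class
    are attached to the previous cluster.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # combining mark: stays with its base
        else:
            clusters.append(ch)
    return clusters


if __name__ == "__main__":
    s = unicodedata.normalize("NFD", "x\u00e4ex")
    parts = simple_clusters(s)
    print(len(s), len(parts))          # code points vs. clusters
    print("".join(reversed(parts)))    # the diaeresis stays on the 'a'
```

Reversing the list of clusters rather than the string keeps the diaeresis attached to the letter it belongs to.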
Marko From rosuav at gmail.com Fri Jul 14 03:40:06 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 14 Jul 2017 17:40:06 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87a847jzsp.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote: > Unicode was supposed to get us out of the 8-bit locale hole. Now it > seems the Unicode hole is far deeper and we haven't reached the bottom > of it yet. I wonder if the hole even has a bottom. > > We now have: > > - an encoding: a sequence a bytes > > - a string: a sequence of integers (code points) > > - "a snippet of text": a sequence of characters Before Unicode, we had exactly the same thing, only with more encodings. > Assuming "a sequence of characters" is the final word, and Python wants > to be involved in that business, one must question the usefulness of > strings, which are neither here nor there. > > When people use Unicode, they are expecting to be able to deal in real > characters. I would expect: > > len(text) to give me the length in characters > text[-1] to evaluate to the last character > re.match("a.c", text) to match a character between a and c > > So the question is, should we have a third type for text. Or should the > semantics of strings be changed to be based on characters? What is the length of a string? How often do you actually care about the number of grapheme clusters - and not, for example, about the pixel width? (To columnate text, for instance, you need to know about its width in pixels or millimeters, not the number of characters in the line.) And if you're going to group code points together because some of them are combining characters, would you also group them together because there's a zero-width joiner in the middle? The answer will sometimes be "yes of course" and sometimes "of course not". 
These kinds of linguistic considerations shouldn't be codified into the core of the language. IMO the Python str type is adequate as a core data type. What we may need, though, is additional utility functions, eg: * unicodedata.grapheme_clusters(str) - split str into a sequence of grapheme clusters * pango.get_text_extents(str) - measure the pixel dimensions of a line of text * platform.punish_user() - issue a platform-dependent response (such as an electric shock, a whack with a 2x4, or a dropped anvil) on someone who has just misunderstood Unicode again * socket.punish_user() - as above, but to the user at the opposite end of a socket ChrisA From neel5481 at gmail.com Fri Jul 14 03:52:53 2017 From: neel5481 at gmail.com (neel patel) Date: Fri, 14 Jul 2017 13:22:53 +0530 Subject: Read Application python logs Message-ID: Hi, I wrote a simple C program and embedded the Python interpreter in it. I am using the Python C API to run the Python command. The code below uses the Python C API inside a .c file. PyObject* PyFileObject = PyFile_FromString("test.py", (char *)"r"); int ret = PyRun_SimpleFile(PyFile_AsFile(PyFileObject), "test.py"); if (ret != 0) printf("Error\n"); The above code works fine. It runs "test.py", but "test.py" contains some print statements, so how can I read those messages in this .c file from the console? Thanks in advance. From marko at pacujo.net Fri Jul 14 04:15:54 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 11:15:54 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: <87zic7v3gl.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote: >> When people use Unicode, they are expecting to be able to deal in real >> characters.
I would expect: >> >> len(text) to give me the length in characters >> text[-1] to evaluate to the last character >> re.match("a.c", text) to match a character between a and c >> >> So the question is, should we have a third type for text. Or should the >> semantics of strings be changed to be based on characters? > > What is the length of a string? How often do you actually care about > the number of grapheme clusters - and not, for example, about the > pixel width? A good question. I have in the past argued that the string should be a special data type for the specialist text processing needs. However, I happen to have fooled around with a character-graphics based game in recent days, and even professionally, I use character-based alignment quite often. Consider, for example, a Python source code editor where you want to limit the length of the line based on the number of characters more typically than based on the number of pixels. Furthermore, you only dismissed my question about len(text) What about text[-1] re.match("a.c", text) Marko From rosuav at gmail.com Fri Jul 14 04:30:01 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 14 Jul 2017 18:30:01 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87zic7v3gl.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote: >>> When people use Unicode, they are expecting to be able to deal in real >>> characters. I would expect: >>> >>> len(text) to give me the length in characters >>> text[-1] to evaluate to the last character >>> re.match("a.c", text) to match a character between a and c >>> >>> So the question is, should we have a third type for text. 
Or should the >>> semantics of strings be changed to be based on characters? >> >> What is the length of a string? How often do you actually care about >> the number of grapheme clusters - and not, for example, about the >> pixel width? > > A good question. I have in the past argued that the string should be a > special data type for the specialist text processing needs. > > However, I happen to have fooled around with a character-graphics based > game in recent days, and even professionally, I use character-based > alignment quite often. Consider, for example, a Python source code > editor where you want to limit the length of the line based on the > number of characters more typically than based on the number of pixels. > > Furthermore, you only dismissed my question about > > len(text) > > What about > > text[-1] > re.match("a.c", text) The considerations and concerns in the second half of my paragraph - the bit you didn't quote - directly address these two. ChrisA From marko at pacujo.net Fri Jul 14 04:53:26 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 11:53:26 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> Message-ID: <87vamvv1q1.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote: >> Furthermore, you only dismissed my question about >> >> len(text) >> >> What about >> >> text[-1] >> re.match("a.c", text) > > The considerations and concerns in the second half of my paragraph - > the bit you didn't quote - directly address these two. I guess you refer to: These kinds of linguistic considerations shouldn't be codified into the core of the language. Then, why bother with Unicode to begin with? Why not just use bytes? 
After all, Python3's strings have the very same pitfalls: - you don't know the length of a text in characters - chr(n) doesn't return a character - you can't easily find the 7th character in a piece of text - you can't compare the equality of two pieces of text - you can't use a piece of text as a reliable dict key etc. Marko From rosuav at gmail.com Fri Jul 14 05:32:35 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 14 Jul 2017 19:32:35 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87vamvv1q1.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote: >>> Furthermore, you only dismissed my question about >>> >>> len(text) >>> >>> What about >>> >>> text[-1] >>> re.match("a.c", text) >> >> The considerations and concerns in the second half of my paragraph - >> the bit you didn't quote - directly address these two. > > I guess you refer to: > > These kinds of linguistic considerations shouldn't be codified into > the core of the language. No, I don't. I refer to the second half of the paragraph you quoted the first half of. > Then, why bother with Unicode to begin with? Why not just use bytes? > After all, Python3's strings have the very same pitfalls: > > - you don't know the length of a text in characters > > - chr(n) doesn't return a character > > - you can't easily find the 7th character in a piece of text First you have to define "character". There are enough different definitions of "character" (for the purposes of counting/iteration/subscripting) that at least some of them have to be separate functions or methods. 
> - you can't compare the equality of two pieces of text > > - you can't use a piece of text as a reliable dict key (Dict key usage is defined in terms of equality, so these two are the same concern.) Yes, you can. For most purposes, textual equality should be defined in terms of NFC or NFD normalization. Python already gives you that. You could argue that a string should always be stored NFC (or NFD, take your pick), and then the equality operator would handle this; but I'm not sure the benefit is worth it. And you can't define equality by whether two strings would display identically, because then you lose semantic information (for instance, the difference between U+0020 and U+00A0, or between U+2004 and a pair of U+2006, or between U+004B and U+041A), not to mention the way that some fonts introduce confusing similarities that other fonts don't. If you're trying to use strings as identifiers in any way (say, file names, or document lookup references), using the NFC/NFD normalized form of the string should be sufficient. ChrisA From marko at pacujo.net Fri Jul 14 06:59:32 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 13:59:32 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> Message-ID: <87pod3uvvv.fsf@elektro.pacujo.net> Chris Angelico : > On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote: >> Chris Angelico : >> Then, why bother with Unicode to begin with? Why not just use bytes? >> After all, Python3's strings have the very same pitfalls: >> >> - you don't know the length of a text in characters >> - chr(n) doesn't return a character >> - you can't easily find the 7th character in a piece of text > > First you have to define "character". 
I'm referring to the Grapheme clusters, a.k.a.real characters >> - you can't compare the equality of two pieces of text >> - you can't use a piece of text as a reliable dict key > > (Dict key usage is defined in terms of equality, so these two are the > same concern.) Ideally, yes. However, someone might say, "don't use == to compare equality; use unicode.textually_equal() instead". That advice might satisfy the first requirement but not the second. > Yes, you can. For most purposes, textual equality should be defined in > terms of NFC or NFD normalization. Python already gives you that. You > could argue that a string should always be stored NFC (or NFD, take > your pick), and then the equality operator would handle this; but I'm > not sure the benefit is worth it. As I said, Python3's strings are neither here nor there. They don't quite solve the problem Python2's strings had. They will push the internationalization problems a bit farther out but fall short of the mark. The developer still has to worry a lot. Unicode seemingly solved one problem only to present the developer with a bagful of new problems. And if Python3's strings are a half-measure, why not stick to bytes? > If you're trying to use strings as identifiers in any way (say, file > names, or document lookup references), using the NFC/NFD normalized > form of the string should be sufficient. Show me ten Python3 database applications, and I'll show you ten Python3 database applications that don't normalize their primary keys.
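A minimal sketch of what normalizing identifiers before using them as keys could look like, assuming NFC is the chosen form. The helper name nfc_key and the sample strings are hypothetical, introduced only for illustration; unicodedata.normalize is the standard library call:

```python
import unicodedata


def nfc_key(text):
    """Return text in NFC so canonically equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", text)


composed = "caf\u00e9"      # e-acute as a single code point
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Without normalization the two spellings are different dict keys.
raw = {composed: 1}
found_raw = decomposed in raw

# Normalizing on the way in and on lookup makes them the same key.
normalized = {nfc_key(composed): 1}
found_normalized = nfc_key(decomposed) in normalized

print(found_raw, found_normalized)
```

The normalization has to be applied at every point where a key enters the system, which is the discipline the message above doubts most applications follow.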
Marko From marko at pacujo.net Fri Jul 14 08:05:48 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 15:05:48 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> Message-ID: <87k23bustf.fsf@elektro.pacujo.net> Marko Rauhamaa : > Chris Angelico : >> If you're trying to use strings as identifiers in any way (say, file >> names, or document lookup references), using the NFC/NFD normalized >> form of the string should be sufficient. > > Show me ten Python3 database applications, and I'll show you ten Python3 > database applications that don't normalize their primary keys. Besides, the normal forms don't help you do text processing (no regular expression matching, no simple way to get a real character). Marko From steve+python at pearwood.info Fri Jul 14 08:07:03 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 14 Jul 2017 22:07:03 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> On Fri, 14 Jul 2017 04:30 pm, Marko Rauhamaa wrote: > Unicode was supposed to get us out of the 8-bit locale hole. Which it has done. Apart from use for backwards compatibility, there is no good reason to use the masses of legacy extensions to ASCII or the technically fragile non-Unicode multibyte encodings from China and Japan. Backwards compatibility is important, but for new content we should all support Unicode. > Now it > seems the Unicode hole is far deeper and we haven't reached the bottom > of it yet. I wonder if the hole even has a bottom. This is not a Unicode hole.
This is a human languages hole, compounded by the need for backwards compatibility with legacy encodings. > We now have: > > - an encoding: a sequence of bytes > > - a string: a sequence of integers (code points) > > - "a snippet of text": a sequence of characters I'm afraid that's wrong, and much too simplified. What we have had, ever since computers started having standards for the storage and representation of text (i.e. since EBCDIC at the very least, possibly even earlier), is: (1) A **character set** made up of some collection of: - alphabetical letters, characters, syllabograms, ideographs or logographs - digits and other numeric symbols - punctuation marks - other textual marks, including diacritics ("accent marks") - assorted symbols, icons, pictograms or hieroglyphics - control and formatting codes - white space and other text separators - and any other entities that have text-like semantics. The character set is the collection of entities we would like to represent as computer data. But of course computers can't store "the letter Aye" A or "the letter Zhe" Ж so we also need: (2) A (possibly implicit) mapping between the entities in the character set and some contiguous range of abstract numeric values ("code points"). (3) The **encoding**, an explicit mapping between those abstract code points and some concrete representation suitable for use as storage or transmission by computers. That usually means a sequence of "code units", where each code unit is typically one, two or four bytes. Note that a single character set could have multiple encodings. In pre-Unicode encodings such as ASCII, the difference between (1) and (2) was frequently (always?) glossed over. For example, in ASCII: - the character set was made up of 128 control characters, American English letters, digits and punctuation marks; - there is an implicit mapping between (say) "character A is code point 65"; - there is also an explicit mapping between "character A (i.e.
code point 65) is byte 0x41 (decimal 65)". So the legacy character set and encoding standards helped cause confusion, by implying that "characters are bytes" instead of making the difference explicit. In addition, we have: (4) Strings, ropes and other data structures suitable for the storage of **sequences of code points** (characters, codes, symbols etc); strings being the simplest implementation (a simple array of code units), but they're not the only one. We also have: (5) Human-meaningful chunks of text: characters, graphemes, words, sentences, symbols, paragraphs, pages, sections, chapters, snippets or what have you. There's no direct one-to-one correspondence between (5) and (4). A string can just as easily contain half a word "aard" as a full word "aardvark". And let's not forget: (6) The **glyphs** of each letter, symbol, etc, encompassing the visual shape and design of those chunks of text, which can depend on the context. For example, the Greek letter sigma looks different depending on whether it is at the end of a word or not. > Assuming "a sequence of characters" is the final word, Why would you assume that? Let's start with, what's a character? > and Python wants > to be involved in that business, one must question the usefulness of > strings, which are neither here nor there. Sure, you can question anything you like, it's a free country[1], but unless you have a concrete plan for something better and are willing to implement it, the chances are very high that nothing will happen. The vast majority of programming languages provide only a set of low-level primitives for manipulating strings, with no semantic meaning enforced. If you want to give *human meaning* to your strings, you need something more than just the string-handling primitives your computer language provides.
This was just as true in the old days of ASCII as it is today with Unicode: your computer language is just as happy making a string containing the nonsense word "vxtsEpdlu" as the real word "Norwegian". > When people use Unicode, they are expecting to be able to deal in real > characters. Then their expectations are too high and they are misinformed. Unicode is not a standard for implementing human-meaningful text (although it takes a few steps towards such a standard). Unicode doesn't even have a concept of "character". Indeed, as I hinted above by asking you what is a character, such a concept isn't well defined. Unicode prefers to use the technical term "grapheme", which (usually) encompasses what "ordinary people consider a character in their native language". If this strikes you as complicated, well, yes, it is complicated. The writing systems of the world ARE complicated, and they clash. > I would expect: > > len(text) to give me the length in characters > text[-1] to evaluate to the last character > re.match("a.c", text) to match a character between a and c Until we have agreement on what is a character, we can't judge whether or not this is meaningful. For example: - do '?' and '?' count as the same character? - is '%' one symbol or three? How about '?'? If I write it as '1/2' does it make a difference? - are ligatures like '?' one or two letters? - when should '?' uppercase to 'SS' and when to '?'? - do you lowercase 'SS' to 'ss' or '?'? - do you uppercase 'i' to 'I' or '?'? - can we distinguish between 'I' as in me and 'I' as in the Roman numeral 1? - should the English letter 'a' with a hook on the top be treated as different to the letter 'a' without the hook on top? - should English italic letters get their own code point? - how about Cyrillic italic letters? - does it make a difference if we're using them as mathematical symbols? - should we have a separate 'A' for English, French, German, Spanish, Portuguese, Norwegian, Dutch, Italian, etc? 
- how about a separate '?' for Chinese, Japanese and Korean? These are only a *few* of the *easy* questions that need to be answered before we can even consider your question: > So the question is, should we have a third type for text. Or should the > semantics of strings be changed to be based on characters? [1] For now. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From rosuav at gmail.com Fri Jul 14 08:32:15 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 14 Jul 2017 22:32:15 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87pod3uvvv.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 14, 2017 at 8:59 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote: >>> Chris Angelico : >>> Then, why bother with Unicode to begin with? Why not just use bytes? >>> After all, Python3's strings have the very same pitfalls: >>> >>> - you don't know the length of a text in characters >>> - chr(n) doesn't return a character >>> - you can't easily find the 7th character in a piece of text >> >> First you have to define "character". > > I'm referring to the > > Grapheme clusters, a.k.a.real characters Okay. Just as long as you know that that's not the only valid definition. >> Yes, you can. For most purposes, textual equality should be defined in >> terms of NFC or NFD normalization. Python already gives you that. You >> could argue that a string should always be stored NFC (or NFD, take >> your pick), and then the equality operator would handle this; but I'm >> not sure the benefit is worth it. > > As I said, Python3's strings are neither here nor there. 
They don't > quite solve the problem Python2's strings had. They will push the > internationalization problems a bit farther out but fall short of the > mark. > > he developer still has to worry a lot. Unicode seemingly solved one > problem only to present the developer of a bagful of new problems. > > And if Python3's strings are a half-measure, why not stick to bytes? Python's float type can't represent all possible non-integer values. If it's such a half-measure, why not stick to integers and do all your own fraction handling? >> If you're trying to use strings as identifiers in any way (say, file >> names, or document lookup references), using the NFC/NFD normalized >> form of the string should be sufficient. > > Show me ten Python3 database applications, and I'll show you ten Python3 > database applications that don't normalize their primary keys. I don't have ten open source ones handy, but I can tell you for sure that I've worked with far more than ten that don't NEED to normalize their primary keys. Why? Because they are *by definition* normal already. Mostly because they use integers for keys. Tada! Normalization is unnecessary. ChrisA From rosuav at gmail.com Fri Jul 14 08:33:23 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 14 Jul 2017 22:33:23 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87k23bustf.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> Message-ID: On Fri, Jul 14, 2017 at 10:05 PM, Marko Rauhamaa wrote: > Marko Rauhamaa : > >> Chris Angelico : >>> If you're trying to use strings as identifiers in any way (say, file >>> names, or document lookup references), using the NFC/NFD normalized >>> form of the string should be sufficient. 
>> >> Show me ten Python3 database applications, and I'll show you ten Python3 >> database applications that don't normalize their primary keys. > > Besides the normal forms don't help you do text processing (no regular > expression matching, no simple way to get a real character). What do you mean about regular expressions? You can use REs with normalized strings. And if you have any valid definition of "real character", you can use it equally on an NFC-normalized or NFD-normalized string than any other. They're just strings, you know. ChrisA From marko at pacujo.net Fri Jul 14 09:31:33 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 16:31:33 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87fudzuoui.fsf@elektro.pacujo.net> Steve D'Aprano : > These are only a *few* of the *easy* questions that need to be > answered before we can even consider your question: > >> So the question is, should we have a third type for text. Or should >> the semantics of strings be changed to be based on characters? Sure, but if they can't be answered, what good is there in having strings (as opposed to bytes). What problem do strings solve? What operation depends on (or is made simpler) by having strings (instead of bytes)? We are not even talking about some exotic languages, but the problem is right there in the middle of Latin-1. We can't even say what len("?") should return. And we may experience: >>> ord("?")Traceback (most recent call last): File "", line 1, in TypeError: ord() expected a character, but string of length 2 found Of course, UTF-8 in a bytes object doesn't make the situation any better, but does it make it any worse? As it stands, we have ? --[encode>-- Unicode --[reencode>-- UTF-8 Why is one encoding format better than the other? 
Marko From jorge.conrado at cptec.inpe.br Fri Jul 14 09:57:19 2017 From: jorge.conrado at cptec.inpe.br (jorge.conrado at cptec.inpe.br) Date: Fri, 14 Jul 2017 10:57:19 -0300 Subject: PYTHON GDAL Message-ID: <46ec7dd54738e48827bcc511041ce911@cptec.inpe.br> Hi, I installed the GDAL 2.2.1 using conda. Then I did: import gdal and I had: Traceback (most recent call last): File "", line 1, in File "/home/conrado/miniconda2/lib/python2.7/site-packages/gdal.py", line 2, in from osgeo.gdal import deprecation_warn File "/home/conrado/miniconda2/lib/python2.7/site-packages/osgeo/__init__.py", line 21, in _gdal = swig_import_helper() File "/home/conrado/miniconda2/lib/python2.7/site-packages/osgeo/__init__.py", line 17, in swig_import_helper _mod = imp.load_module('_gdal', fp, pathname, description) ImportError: libicui18n.so.56: cannot open shared object file: No such file or directory then I used the command find: find . -name 'libicui18n.so.56' -print and I had: ./usr/local/lib/python3.6/site-packages/PyQt5/Qt/lib/libicui18n.so.56 Please, what can I do to put (set) this lib for python2.7 recognize it. Thanks, Conrado From rhodri at kynesim.co.uk Fri Jul 14 10:05:37 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 14 Jul 2017 15:05:37 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87fudzuoui.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: On 14/07/17 14:31, Marko Rauhamaa wrote: > Of course, UTF-8 in a bytes object doesn't make the situation any > better, but does it make it any worse? Speaking as someone who has been up to his elbows in this recently, I would say emphatically that it does make things worse. It adds an extra layer of complexity to all of the questions you were asking, and more. 
A single codepoint is a meaningful thing, even if its meaning may be modified by combining. A single byte may or may not be meaningful. -- Rhodri James *-* Kynesim Ltd From marko at pacujo.net Fri Jul 14 10:14:39 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 17:14:39 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <87bmonumuo.fsf@elektro.pacujo.net> Rhodri James : > On 14/07/17 14:31, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > > Speaking as someone who has been up to his elbows in this recently, I > would say emphatically that it does make things worse. It adds an > extra layer of complexity to all of the questions you were asking, and > more. A single codepoint is a meaningful thing, even if its meaning > may be modified by combining. A single byte may or may not be > meaningful. I'd like to understand this better. Maybe you have a couple of examples to share? Marko From torriem at gmail.com Fri Jul 14 10:30:24 2017 From: torriem at gmail.com (Michael Torrie) Date: Fri, 14 Jul 2017 08:30:24 -0600 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87fudzuoui.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: > Of course, UTF-8 in a bytes object doesn't make the situation any > better, but does it make it any worse? > > As it stands, we have > > ? --[encode>-- Unicode --[reencode>-- UTF-8 > > Why is one encoding format better than the other? 
This is precisely the logic behind Google using UTF-8 for strings in Go, rather than having some O(1) abstract type like Python has. And many other languages do the same. The argument is that because of the very issues that you mention, having O(1) lookup in a string isn't that important, since looking up a particular index in a unicode string is rarely the right thing to do, so UTF-8 is just fine as a native, in-memory type. From torriem at gmail.com Fri Jul 14 10:32:27 2017 From: torriem at gmail.com (Michael Torrie) Date: Fri, 14 Jul 2017 08:32:27 -0600 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: On 07/14/2017 08:05 AM, Rhodri James wrote: > On 14/07/17 14:31, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > > Speaking as someone who has been up to his elbows in this recently, I > would say emphatically that it does make things worse. It adds an extra > layer of complexity to all of the questions you were asking, and more. > A single codepoint is a meaningful thing, even if its meaning may be > modified by combining. A single byte may or may not be meaningful. Are you saying that dealing with Unicode in Google Go, which uses UTF-8 in memory, is adding an extra layer of complexity and makes things worse than they might be in Python? 
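A minimal sketch of the trade-off mentioned above, written in Python rather than Go: with UTF-8 as the in-memory form, finding the n-th code point means scanning from the start, because sequences vary in length. The function name utf8_code_point_at is hypothetical, and the sketch assumes the input is well-formed UTF-8:

```python
def utf8_code_point_at(data, n):
    """Return the n-th code point (0-based) of well-formed UTF-8 by scanning."""
    count = -1
    start = None
    for i, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts
        # a new sequence (either ASCII or a multi-byte lead byte).
        if byte & 0xC0 != 0x80:
            if start is not None:
                # Reached the start of the next sequence: the wanted one is complete.
                return data[start:i].decode("utf-8")
            count += 1
            if count == n:
                start = i
    if start is not None:
        # The wanted code point was the last one in the buffer.
        return data[start:].decode("utf-8")
    raise IndexError(n)


sample = "a\u20acb".encode("utf-8")   # 5 bytes for 3 code points
print(utf8_code_point_at(sample, 1))
```

The scan is linear in the length of the buffer, whereas indexing a Python str is constant time; whether that matters depends on how often code point indexing is the right operation at all.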
From rhodri at kynesim.co.uk Fri Jul 14 10:48:56 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 14 Jul 2017 15:48:56 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87bmonumuo.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <87bmonumuo.fsf@elektro.pacujo.net> Message-ID: <68759124-fd77-669a-f0fc-99fd1c95e442@kynesim.co.uk> On 14/07/17 15:14, Marko Rauhamaa wrote: > Rhodri James : > >> On 14/07/17 14:31, Marko Rauhamaa wrote: >>> Of course, UTF-8 in a bytes object doesn't make the situation any >>> better, but does it make it any worse? >> >> Speaking as someone who has been up to his elbows in this recently, I >> would say emphatically that it does make things worse. It adds an >> extra layer of complexity to all of the questions you were asking, and >> more. A single codepoint is a meaningful thing, even if its meaning >> may be modified by combining. A single byte may or may not be >> meaningful. > > I'd like to understand this better. Maybe you have a couple of examples > to share? Sure. What I've mostly been looking at recently has been the Expat XML parser. XML chooses to deal with one of your problems by defining that it's not having anything to do with combining, sequences of codepoints are all you need to worry about when comparing strings. U+00E8 (LATIN SMALL LETTER E WITH GRAVE) is not the same as U+0065 (LATIN SMALL LETTER E) followed by U+0300 (COMBINING GRAVE ACCENT) for example. However Expat is written in C, and it reads in UTF-8 as a sequence of bytes. There are endless checks all over the code that complete UTF-8 byte sequences have been read in or passed across functional interfaces. 
When you are dealing with a bytestream like this, you cannot assume that you have complete codepoints, and you cannot find codepoint boundaries without searching along the string. It's only once you have reconstructed the codepoint that you can tell what sort of character you have, and whether or not it is valid in your parsing context. -- Rhodri James *-* Kynesim Ltd From rhodri at kynesim.co.uk Fri Jul 14 10:52:10 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Fri, 14 Jul 2017 15:52:10 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <557e6218-9d71-04be-376b-2aef240f0e36@kynesim.co.uk> On 14/07/17 15:32, Michael Torrie wrote: > On 07/14/2017 08:05 AM, Rhodri James wrote: >> On 14/07/17 14:31, Marko Rauhamaa wrote: >>> Of course, UTF-8 in a bytes object doesn't make the situation any >>> better, but does it make it any worse? >> >> Speaking as someone who has been up to his elbows in this recently, I >> would say emphatically that it does make things worse. It adds an extra >> layer of complexity to all of the questions you were asking, and more. >> A single codepoint is a meaningful thing, even if its meaning may be >> modified by combining. A single byte may or may not be meaningful. > > Are you saying that dealing with Unicode in Google Go, which uses UTF-8 > in memory, is adding an extra layer of complexity and makes things worse > than they might be in Python? I'm not familiar with Go. If the programmer has to be aware that she is using UTF-8 under the hood, then yes, it does add an extra layer of complexity. You have to remember the rules of UTF-8 as well as everything else.
-- Rhodri James *-* Kynesim Ltd From rosuav at gmail.com Fri Jul 14 10:54:48 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 15 Jul 2017 00:54:48 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: On Sat, Jul 15, 2017 at 12:32 AM, Michael Torrie wrote: > On 07/14/2017 08:05 AM, Rhodri James wrote: >> On 14/07/17 14:31, Marko Rauhamaa wrote: >>> Of course, UTF-8 in a bytes object doesn't make the situation any >>> better, but does it make it any worse? >> >> Speaking as someone who has been up to his elbows in this recently, I >> would say emphatically that it does make things worse. It adds an extra >> layer of complexity to all of the questions you were asking, and more. >> A single codepoint is a meaningful thing, even if its meaning may be >> modified by combining. A single byte may or may not be meaningful. > > Are you saying that dealing with Unicode in Google Go, which uses UTF-8 > in memory, is adding an extra layer of complexity and makes things worse > than they might be in Python? Can you reverse a string in Go? How do you do it? With Python, you can sometimes get tripped up, eg if you have: * combining characters * Arabic letters, which can look very different when reordered * explicit directionality markers But the semantics are at least easy to comprehend: you have a strict reversal of code unit order. So you can reverse a string for parsing purposes, and then re-reverse the subsections. If you have a UTF-8 bytestring, a naive reversal will trip you up if you have *any* non-ASCII values in there. You will have invalid UTF-8. So *at very least*, your "reverse string" code has to be UTF-8 aware - it has to keep continuation bytes with the correct start byte. 
And you *still* have all the concerns that Python has. Extra complexity. QED. ChrisA From no.email at nospam.invalid Fri Jul 14 12:23:04 2017 From: no.email at nospam.invalid (Paul Rubin) Date: Fri, 14 Jul 2017 09:23:04 -0700 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> Message-ID: <877ezb2djr.fsf@nightsong.com> Rustom Mody writes: > Yeah I know append method is supposedly O(1). It's amortized O(1). From steve+python at pearwood.info Fri Jul 14 12:50:33 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sat, 15 Jul 2017 02:50:33 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote: > Steve D'Aprano : > >> These are only a *few* of the *easy* questions that need to be >> answered before we can even consider your question: >> >>> So the question is, should we have a third type for text. Or should >>> the semantics of strings be changed to be based on characters? > > Sure, but if they can't be answered, what good is there in having > strings (as opposed to bytes). I didn't say they can't be answered. But however you answer them, you're going to make somebody angry. I notice you haven't given a definition for "character" yet. It's easy to be critical and complain that Unicode strings don't handle "characters", but if you can't suggest any improvements, then you're just bellyaching. Do you have some concrete improvements in mind? > What problem do strings solve? 
Well, to start with it's a lot nicer to be able to write: name = input("What is your name?") instead of: name = input("5768617420697320796f7572206e616d653f") don't you think? I think that alone makes strings worth it. And of course, I don't want to be limited to just US English, or one language at a time. So we need a universal character set. > What > operation depends on (or is made simpler) by having strings (instead of > bytes)? Code is written for people first, and to be executed by a computer only second. So we want human-readable text to look as much like human-readable text as possible. Although I suppose computer keyboards would be a lot smaller if they only needed 16 keys marked 0...9ABCDEF instead of what we have now. We could program by entering bytes: 6e616d65203d20696e70757428225768617420697320796f7572206e616d653f22290a7072696e742822596f7572206e616d652069732025722e222025206e616d6529 although debugging would be a tad more difficult, I expect. But the advantage is, we'd have one less data type! I mean, sure, *some* stick-in-the-mud old fashioned programmers would prefer to write: name = input("What is your name?") print("Your name is %r." % name) but I think your suggestion of eliminating strings and treating everything as bytes has its advantages. For starters, everything is a one-liner! Bytes, being a sequence of numbers, shouldn't define text operations like converting to uppercase, regular expressions, and so forth. Of course the Python 3 bytes data type does support some limited text operations, but that's for backward compatibility with pre-Unicode Python, and it's limited to ASCII. If we were designing Python from scratch, I'd argue strongly against adding text methods to a sequence of numbers. > We are not even talking about some exotic languages, but the problem is > right there in the middle of Latin-1. We can't even say what > > len("?") > > should return. Latin-1 predates Unicode, so this problem has existed for a long time.
It's not something that Unicode has introduced, it is inherent to the problem of dealing with human language in its full generality. Do you have a solution for this? How do you get WYSIWYG display of text without violating the expectation that we should be able to count the length of a string? Before you answer, does your answer apply to Arabic and Thai as well as Western European languages? > And we may experience: > > >>> ord("?") > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: ord() expected a character, but string of length 2 found You might, but only as a contrived example. You had to intentionally create a decomposed string of length two as a string literal, and then call ord(). But of course you knew that was going to happen -- it's not something likely to happen by accident. In practice, when you receive an arbitrary string, you test its length before calling ord(). Or you walk the string calling ord() on each code point. > Of course, UTF-8 in a bytes object doesn't make the situation any > better, but does it make it any worse? Sure it does. You want the human reader to be able to predict the number of graphemes ("characters") by sight. Okay, here's a string in UTF-8, in bytes: e288b4c39fcf89e289a0d096e280b0e282ac78e2889e How do you expect the human reader to predict the number of graphemes from a UTF-8 hex string? For the record, that's 44 hex digits or 22 bytes, to encode 9 graphemes. That's an average of 2.44 bytes per grapheme. Would you expect the average programmer to be able to predict where the grapheme breaks are? > As it stands, we have > > ? --[encode>-- Unicode --[reencode>-- UTF-8 I can't even work out what you're trying to say here. > Why is one encoding format better than the other? It depends on what you're trying to do.
If you want to minimize storage and transmission costs, and don't care about random access into the string, then UTF-8 is likely the best encoding, since it uses as little as one byte per code point, and in practice with real-world text (at least for Europeans) it is rarely more expensive than the alternatives. It also has the advantage of being backwards compatible with ASCII, so legacy applications that assume all characters are a single byte will work if you use UTF-8 and limit yourself to the ASCII-compatible subset of Unicode. The disadvantage is that each code point can be one, two, three or four bytes wide, and naively shuffling bytes around will invariably give you invalid UTF-8 and cause data loss. So UTF-8 is not so good as the in-memory representation of text strings. If you have lots of memory, then UTF-32 is the best for in-memory representation, because it's a fixed-width encoding and parsing it is simple. Every code point is just four bytes and you can easily implement random access into the string. If you want a reasonable compromise, UTF-16 is quite decent. If you're willing to limit yourself to the first 2**16 code points of Unicode, you can even pretend that it's a fixed width encoding like UTF-32. If you have to survive transmission through machines that require 7-bit clean bytes, then UTF-7 is the best encoding to use.
As for the legacy encodings: - they're not 7-bit clean, except for ASCII; - some of them are variable-width; - none of them support the full range of Unicode, so they aren't universal character sets; - in other words, you either resign yourself to being unable to exchange documents with other people, resign yourself to dealing with mojibake, or invent some complex and non-backwards-compatible in-band mechanism for switching charsets; - they suffer from the exact same problems as Unicode regarding the distinction between code points and graphemes; - so not only do they lack the advantages of Unicode, but they have even more disadvantages. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. From steve+python at pearwood.info Fri Jul 14 12:52:00 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sat, 15 Jul 2017 02:52:00 +1000 Subject: Write this accumuator in a functional style References: <59646c01$0$11093$c3e8da3@news.astraweb.com> <87d196dlt5.fsf@nightsong.com> <87zic8pxbe.fsf@nightsong.com> <87pod4fz15.fsf@nightsong.com> <8760ewws3t.fsf@elektro.pacujo.net> <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com> <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com> Message-ID: <5968f6b1$0$1585$c3e8da3$5496439d@news.astraweb.com> On Fri, 14 Jul 2017 09:06 am, Ned Batchelder wrote: > Steve's summary is qualitatively right, but a little off on the quantitative > details. Lists don't resize to 2*N, they resize to ~1.125*N: > > new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6); > > (https://github.com/python/cpython/blob/master/Objects/listobject.c#L49-L58) Ah, thanks for the correction. I was going off vague memories of long-ago discussion (perhaps even as long ago as Python 1.5!) when Tim Peters (I think it was) described how list overallocation worked. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
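The over-allocation expression quoted above can be transcribed into Python to see the growth pattern. This is only a sketch of the C expression as it was quoted in this thread: the helper name new_allocated_size is made up for illustration, and the formula used by current CPython may differ.

```python
def new_allocated_size(newsize):
    """Python transcription of the quoted C expression from listobject.c."""
    # newsize >> 3 is newsize // 8, i.e. roughly 12.5% extra room,
    # plus a small constant that depends on how small the list is.
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


for n in (1, 8, 9, 100, 1000):
    capacity = new_allocated_size(n)
    print(n, capacity, round(capacity / n, 3))
```

For large lists the ratio approaches 1.125, which matches the "~1.125*N" figure in the quoted message.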
From fabien.maussion at gmail.com Fri Jul 14 13:06:45 2017 From: fabien.maussion at gmail.com (Fabien) Date: Fri, 14 Jul 2017 19:06:45 +0200 Subject: PYTHON GDAL References: <46ec7dd54738e48827bcc511041ce911@cptec.inpe.br> Message-ID: On 07/14/2017 03:57 PM, jorge.conrado at cptec.inpe.br wrote: > > > Hi, > > I installed the GDAL 2.2.1 using conda. Then I did: > > import gdal > > and I had: > > > Traceback (most recent call last): > File "", line 1, in > File "/home/conrado/miniconda2/lib/python2.7/site-packages/gdal.py", > line 2, in > from osgeo.gdal import deprecation_warn > File > "/home/conrado/miniconda2/lib/python2.7/site-packages/osgeo/__init__.py", line > 21, in > _gdal = swig_import_helper() > File > "/home/conrado/miniconda2/lib/python2.7/site-packages/osgeo/__init__.py", line > 17, in swig_import_helper > _mod = imp.load_module('_gdal', fp, pathname, description) > ImportError: libicui18n.so.56: cannot open shared object file: No such > file or directory > > > then I used the command find: > > find . -name 'libicui18n.so.56' -print > > and I had: > > ./usr/local/lib/python3.6/site-packages/PyQt5/Qt/lib/libicui18n.so.56 > > > Please, what can I do to put (set) this lib for python2.7 recognize it. 
Since you are using conda I *strongly* recommend to use the conda-forge channel to install GDAL: conda install -c conda-forge gdal https://conda-forge.org/ > > > Thanks, > > > Conrado From marko at pacujo.net Fri Jul 14 14:10:38 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 21:10:38 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> Message-ID: <8760eukhy9.fsf@elektro.pacujo.net> Steve D'Aprano : > On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > > Sure it does. You want the human reader to be able to predict the > number of graphemes ("characters") by sight. Okay, here's a string in > UTF-8, in bytes: > > e288b4c39fcf89e289a0d096e280b0e282ac78e2889e > > How do you expect the human reader to predict the number of graphemes > from a UTF-8 hex string? > > For the record, that's 44 hex digits or 22 bytes, to encode 9 > graphemes. That's an average of 2.44 bytes per grapheme. Would you > expect the average programmer to be able to predict where the grapheme > breaks are? > >> As it stands, we have >> >> ? --[encode>-- Unicode --[reencode>-- UTF-8 > > I can't even work out what you're trying to say here. I can tell, yet that doesn't prevent you from dismissing what I'm saying. >> Why is one encoding format better than the other? > > It depends on what you're trying to do. 
> > If you want to minimize storage and transmission costs, and don't care > about random access into the string, then UTF-8 is likely the best > encoding, since it uses as little as one byte per code point, and in > practice with real-world text (at least for Europeans) it is rarely > more expensive than the alternatives. Python3's strings don't give me any better random access than UTF-8. Storage and transmission costs are not an issue. It's only that storage and transmission are still defined in terms of bytes. Python3's strings force you to encode/decode between strings and bytes for a yet-to-be-specified advantage. > It also has the advantage of being backwards compatible with ASCII, so > legacy applications that assume all characters are a single byte will > work if you use UTF-8 and limit yourself to the ASCII-compatible > subset of Unicode. UTF-8 is perfectly backward-compatible with ASCII. > The disadvantage is that each code point can be one, two, three or > four bytes wide, and naively shuffling bytes around will invariably > give you invalid UTF-8 and cause data loss. So UTF-8 is not so good as > the in-memory representation of text strings. The in-memory representation is not an issue. It's the abstract semantics that are the issue. At the abstract level, we have the text in a human language. Neither strings nor UTF-8 provide that so we have to settle for something cruder. I have yet to hear why a string does a better job than UTF-8. > If you have lots of memory, then UTF-32 is the best for in-memory > representation, because its a fixed-width encoding and parsing it is > simple. Every code point is just four bytes and you an easily > implement random access into the string. The in-memory representation is not an issue. It's the abstract semantics that are the issue. > If you want a reasonable compromise, UTF-16 is quite decent. 
If you're > willing to limit yourself to the first 2**16 code points of Unicode, > you can even pretend that it's a fixed width encoding like UTF-32. UTF-16 (used by Windows and Java, for example) is even worse than strings and UTF-8 because: ? --[encode>-- Unicode --[reencode>-- UTF-16 --[reencode>-- bytes > If you have to survive transmission through machines that require > 7-bit clean bytes, then UTF-7 is the best encoding to use. I don't know why that is coming into this discussion. So no raison d'être has yet been offered for strings. Marko From neilc at norwich.edu Fri Jul 14 14:22:46 2017 From: neilc at norwich.edu (Neil Cerutti) Date: Fri, 14 Jul 2017 18:22:46 +0000 (UTC) Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <557e6218-9d71-04be-376b-2aef240f0e36@kynesim.co.uk> Message-ID: On 2017-07-14, Rhodri James wrote: > On 14/07/17 15:32, Michael Torrie wrote: >> Are you saying that dealing with Unicode in Google Go, which >> uses UTF-8 in memory, is adding an extra layer of complexity >> and makes things worse than they might be in Python? > > I'm not familiar with Go. If the programmer has to be aware > that she is using UTF-8 under the hood, then yes, it does > add an extra layer of complexity. You have to remember the > rules of UTF-8 as well as everything else. Go represents strings as sequences of bytes. It provides separate APIs that allow you to regard those bytes as either plain old bytes, or as a sequence of runes (not-necessarily normalized codepoints). If your byte strings aren't in UTF-8, then Go Away.
https://blog.golang.org/strings -- Neil Cerutti From marko at pacujo.net Fri Jul 14 15:00:03 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 22:00:03 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <87bmonumuo.fsf@elektro.pacujo.net> <68759124-fd77-669a-f0fc-99fd1c95e442@kynesim.co.uk> Message-ID: <871spikfnw.fsf@elektro.pacujo.net> Rhodri James : > On 14/07/17 15:14, Marko Rauhamaa wrote: >> I'd like to understand this better. Maybe you have a couple of >> examples to share? > > Sure. > > What I've mostly been looking at recently has been the Expat XML parser. > XML chooses to deal with one of your problems by defining that it's not > having anything to do with combining, sequences of codepoints are all > you need to worry about when comparing strings. U+00E8 (LATIN SMALL > LETTER E WITH GRAVE) is not the same as U+0065 (LATIN SMALL LETTER E) > followed by U+0300 (COMBINING GRAVE ACCENT) for example. Very interesting. The relevant W3C spec confirms what you said: 5. Test the resulting sequences of code points bit-by-bit for identity. [...] This document therefore recommends, when possible, that all content be stored and exchanged in Unicode Normalization Form C (NFC). 
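The difference between the precomposed and decomposed forms of "e with grave", and the effect of normalizing to NFC as the quoted recommendation suggests, can be sketched in Python 3 with the standard unicodedata module. The variable names are chosen here for illustration only.

```python
import unicodedata

precomposed = "\u00e8"    # LATIN SMALL LETTER E WITH GRAVE
decomposed = "e\u0300"    # LATIN SMALL LETTER E + COMBINING GRAVE ACCENT

# Compared code point by code point, the two strings differ.
print(precomposed == decomposed)
print(len(precomposed), len(decomposed))

# After normalizing both to NFC they compare equal.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)
```

A parser that compares code point sequences directly treats the two spellings as different names, so normalizing content before it is stored or exchanged avoids that mismatch.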
Marko From marko at pacujo.net Fri Jul 14 15:09:18 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Fri, 14 Jul 2017 22:09:18 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <87tw2ej0o1.fsf@elektro.pacujo.net> Michael Torrie : > On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? >> >> As it stands, we have >> >> ? --[encode>-- Unicode --[reencode>-- UTF-8 >> >> Why is one encoding format better than the other? > > This is precisely the logic behind Google using UTF-8 for strings in > Go, rather than having some O(1) abstract type like Python has. And > many other languages do the same. The argument is that because of the > very issues that you mention, having O(1) lookup in a string isn't > that important, since looking up a particular index in a unicode > string is rarely the right thing to do, so UTF-8 is just fine as a > native, in-memory type. It pays to come in late. Windows NT and Java evaded the 8-bit localization nightmare by going UCS-2. Python3 managed not to repeat the earlier UCS-2 blunders by going all the way to UCS-4. Go saw the futility of UCS-4 as a separate data type and dropped down to UTF-8. Unfortunately, Guile is following in Python3's footsteps. 
Marko From tjreedy at udel.edu Fri Jul 14 17:12:10 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Jul 2017 17:12:10 -0400 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: On 7/14/2017 10:30 AM, Michael Torrie wrote: > On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > >> >> As it stands, we have >> >> ? --[encode>-- Unicode --[reencode>-- UTF-8 >> >> Why is one encoding format better than the other? All digital data are ultimately bits, usually collected together in groups of 8, called bytes. The point of python 3 is that text should normally be instances of a text class, separate from the raw bytes class, with a defined internal encoding. The actual internal encoding is secondary. And it changed in 3.3. Python ints are encoded bytes, so are floats, and everything else. When one prints a float, one certainly does not see a representation of the raw bytes in the float object. Instead, one sees a representation of the value it represents. There is a proposal to change the internal encoding of int, as least on 64-bit machines, which are now standard. However, because print(87987282738472387429748) prints 87987282738472387429748 and not the internal bytes, the change in the internal bytes will not affect the user view of ints. > This is precisely the logic behind Google using UTF-8 for strings in Go, > rather than having some O(1) abstract type like Python has. And many > other languages do the same. 
The argument is that because of the very > issues that you mention, having O(1) lookup in a string isn't that > important, since looking up a particular index in a unicode string is > rarely the right thing to do, so UTF-8 is just fine as a native, > in-memory type. Does go use bytes for text, like most people did in Python 2, a separate text string class, that hides the internal encoding format and implementation? In other words, if you do the equivalent of print(s) where s is a text string with a mixture of greek, cyrillic, hindi, chinese, japanese, and korean chars, do you see the characters, or some representation of the internal bytes? -- Terry Jan Reedy From marko at pacujo.net Fri Jul 14 17:51:19 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Sat, 15 Jul 2017 00:51:19 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <87fudyit60.fsf@elektro.pacujo.net> Terry Reedy : > On 7/14/2017 10:30 AM, Michael Torrie wrote: >> On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: >>> Of course, UTF-8 in a bytes object doesn't make the situation any >>> better, but does it make it any worse? >> >>> >>> As it stands, we have >>> >>> ? --[encode>-- Unicode --[reencode>-- UTF-8 >>> >>> Why is one encoding format better than the other? > > All digital data are ultimately bits, usually collected together in > groups of 8, called bytes. Naturally. > The point of python 3 is that text should normally be instances of a > text class, separate from the raw bytes class, with a defined internal > encoding. And I called its usefulness into question. >> This is precisely the logic behind Google using UTF-8 for strings in Go, >> rather than having some O(1) abstract type like Python has. And many >> other languages do the same. 
The argument is that because of the very >> issues that you mention, having O(1) lookup in a string isn't that >> important, since looking up a particular index in a unicode string is >> rarely the right thing to do, so UTF-8 is just fine as a native, >> in-memory type. > > Does go use bytes for text, like most people did in Python 2, Yes. Also, C and the GNU textutils do that. > a separate text string class, that hides the internal encoding format > and implementation? In other words, if you do the equivalent of > print(s) where s is a text string with a mixture of greek, cyrillic, > hindi, chinese, japanese, and korean chars, do you see the characters, > or some representation of the internal bytes? Yes, in Python2, Go, C and GNU textutils, when you print a text string containing a mixture of languages, you see characters. Why? Because that's what the terminal emulator chooses to do upon receiving those bytes. Marko From tjreedy at udel.edu Fri Jul 14 20:02:48 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Jul 2017 20:02:48 -0400 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87fudyit60.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <87fudyit60.fsf@elektro.pacujo.net> Message-ID: On 7/14/2017 5:51 PM, Marko Rauhamaa wrote: > Yes, in Python2, Go, C and GNU textutils, when you print a text string > containing a mixture of languages, you see characters. > > Why? > > Because that's what the terminal emulator chooses to do upon receiving > those bytes. >>> s = u'\u1171\u2222\u3333\u4444\u5555' >>> s u'\u1171\u2222\u3333\u4444\u5555' >>> print(s) ????? >>> b = s.encode('utf-8') >>> b '\xe1\x85\xb1\xe2\x88\xa2\xe3\x8c\xb3\xe4\x91\x84\xe5\x95\x95' >>> print(b) ??????????????? 
I prefer the accurate 5 char print of the text string to the print of the bytes. -- Terry Jan Reedy From sonnichs at gmail.com Fri Jul 14 21:04:59 2017 From: sonnichs at gmail.com (F S) Date: Fri, 14 Jul 2017 18:04:59 -0700 (PDT) Subject: pyserial and end-of-line specification Message-ID: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com> I just started using Python and I am writing code to access my serial port using pyserial. I have no problem with unix based text coming in the stream using a LF (0x0A) record separator. I also am using unblocked IO. However I have some sensor devices that use the windows CRLF (0x0A,0x0D) record separators and also a 0x03 and 0x02 (stx,etx) framing so I need to change the EOL (end of line) specfier in order to get the pyserial readline to so this. I read the doc page for pyserial and they allude to using TextIOWrapper: to accomplish this however the example is very unclear and I could not find better information on the IO page. I would appreciate any advice on how to block the records using "x0Ax0D" and "x03". Thanks Fritz From steve+python at pearwood.info Fri Jul 14 21:20:33 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sat, 15 Jul 2017 11:20:33 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <59696de2$0$1603$c3e8da3$5496439d@news.astraweb.com> On Sat, 15 Jul 2017 07:12 am, Terry Reedy wrote: > Does go use bytes for text, like most people did in Python 2, a separate > text string class, that hides the internal encoding format and > implementation? 
In other words, if you do the equivalent of print(s) > where s is a text string with a mixture of greek, cyrillic, hindi, > chinese, japanese, and korean chars, do you see the characters, or some > representation of the internal bytes? The answer is, it's complicated. Go has two string types: "strings", and "runes". Strings are equivalent to Python 3 byte-strings, except that the language is biased towards assuming they are UTF-8 instead of Python 3's decision to assume they are ASCII. In other words, if you display a Python 3 byte-string, it will display bytes that represent ASCII characters as ASCII, and everything else escaped as a hex byte: py> b'\x41\xcf\x80\x5a' b'A\xcf\x80Z' Go does the same, except it will display anything which is legal UTF-8 (which may be 1, 2, 3, or 4 bytes) as a Unicode character (actually code point). Assuming your environment is capable of displaying that character, otherwise you'll just see a square, or some other artifact. So if Python used the same rules as Go, the above byte-string would display as: b'AπZ' Most of the time, when processing strings, Go treats them as arbitrary bytes, although Go comes with libraries that help make it easier to work with them as UTF-8 byte strings. Runes, on the other hand, are a strict superset of Unicode. Runes are strings of 32-bit code units, so like UTF-32 except not limited to the Unicode range of \U00000000 through \U0010FFFF. Runes will accept any 32 bit values up to 0xFFFFFFFF. I presume that runes which fall within the UTF-32 range will be displayed as the Unicode character where possible, and those which fall outside of that range as some sort of hex display. So Go strings are like Python byte strings, biased towards UTF-8 but with no guarantees made, and Go runes are a superset of Python text strings. Does that answer your question sufficiently? https://blog.golang.org/strings -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
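The byte-string from the message above can be tried directly in Python 3. This small sketch shows the escaped display of the bytes and what the same four bytes look like once they are explicitly decoded as UTF-8.

```python
data = b'\x41\xcf\x80\x5a'

# The bytes repr shows ASCII bytes as characters and escapes the rest.
print(data)

# Decoding as UTF-8 turns the two-byte sequence cf 80 into one code point.
text = data.decode('utf-8')
print(text)
print(len(data), len(text))   # 4 bytes, 3 code points
```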
From steve+python at pearwood.info Fri Jul 14 22:33:02 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sat, 15 Jul 2017 12:33:02 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <8760eukhy9.fsf@elektro.pacujo.net> Message-ID: <59697ee0$0$22140$c3e8da3$5496439d@news.astraweb.com> On Sat, 15 Jul 2017 04:10 am, Marko Rauhamaa wrote: > Steve D'Aprano : >> On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote: [...] >>> As it stands, we have >>> >>> ? --[encode>-- Unicode --[reencode>-- UTF-8 >> >> I can't even work out what you're trying to say here. > > I can tell, yet that doesn't prevent you from dismissing what I'm > saying. How am I dismissing it? I didn't reply to it except to say I don't understand it! To me, it looks like gibberish, not even wrong, but rather than say so I thought I'd give you the opportunity to explain what you meant. As the person attempting to communicate, any failure to do so is *your* responsibility, not that of the reader. If you are discussing this in good faith, rather than as a cheap points-scoring exercise, then please try to explain what you mean. >>> Why is one encoding format better than the other? >> >> It depends on what you're trying to do. >> >> If you want to minimize storage and transmission costs, and don't care >> about random access into the string, then UTF-8 is likely the best >> encoding, since it uses as little as one byte per code point, and in >> practice with real-world text (at least for Europeans) it is rarely >> more expensive than the alternatives. > > Python3's strings don't give me any better random access than UTF-8. Say what? Of course they do. Python 3 strings (since 3.3) are a compact form of UTF-32. 
Without loss of generality, we can say that each string is an array of four-byte code units. (In practice, depending on the string, Python may be able to compact that to one- or two-byte code units.)

The critical thing is that slicing and indexing is a constant-time operation. string[i] can just jump straight to offset i code-units into the array. If the code-units are 4 bytes wide, that's just 4*i bytes.

UTF-8 is not: it is a variable-width encoding, so there's no way to tell how many bytes it takes to get to string[i]. You have to start at the beginning of the string and walk the bytes, counting code points, until you reach the i-th code point.

It may be possible to swap memory for time by building an augmented data structure that makes this easier. A naive example would be to have a separate array giving the offsets of each code point. But then it's not a string any more, it's a more complex data structure.

Go ignores this problem by simply not offering random access to code points in strings. Go simply says that strings are bytes, and if string[i] jumps into the middle of a character (code point), oh well, too bad, so sad.

On the other hand, Go also offers a second solution to the problem. It's essentially the same solution that Python offers: a dedicated fixed-width, 32-bit (four byte) Unicode text string type which they call "runes".

> Storage and transmission costs are not an issue.

I was giving a generic answer to a generic question. You asked a general question, "Why is one encoding format better than the other?" and the general answer to that is *it depends on what you are trying to do*.

> It's only that storage and transmission are still defined in terms of bytes.

Again, I don't see what point you think you are making here. Ultimately, all our data structures have to be implemented in memory which is addressable in bytes. *All of them* -- objects, linked lists, floats, BigInts, associative arrays, red-black trees, the lot.
All of those data structures are presented to the programmer in terms of higher level abstractions. You seem to think that text strings alone don't need that higher level abstraction, and that the programmer ought to think about text in terms of bytes. Why? You entered this discussion with a reasonable position: the text primitives offered to programmers fall short of what we'd like, which is to deal with language in terms of language units: characters specifically. (Let's assume we can decide what a character actually is.) I agree! If Python's text strings are supposed to be an abstraction for "strings of characters", its a leaky abstraction. It's actually "strings of code points". Some people might have said: "Since Python strings fall short of the abstraction we would like, we should build a better abstraction on top of it, using Unicode primitives, that deals with characters (once we decide what they are)." which is where I thought you were going with this. But instead, you've suggested that the solution to the problem: "Python strings don't come close enough to matching the programmer's expectations about characters" is to move *further away* from the programmer's expectations about characters and to have them reason about UTF-8 encoded bytes instead. And then to insult our intelligence even further, after raising the in-memory representation (UTF-8 versus some other encoding) to prominence, you then repeatedly said that the in-memory representation doesn't matter! If it doesn't matter, why do you care whether strings use UTF-8 or UTF-32 or something else? > Python3's strings > force you to encode/decode between strings and bytes for a > yet-to-be-specified advantage. That's simply wrong. You are never forced to encode/decode if you are dealing with strings alone, or bytes alone. You only need to encode/decode when converting between the two. You don't even need to explicitly decode when dealing with file I/O. 
Provided your files are correctly encoded, Python abstracts away the need to decode and you can just read text out of a file. So your statement is wrong.

>> It also has the advantage of being backwards compatible with ASCII, so
>> legacy applications that assume all characters are a single byte will
>> work if you use UTF-8 and limit yourself to the ASCII-compatible
>> subset of Unicode.
>
> UTF-8 is perfectly backward-compatible with ASCII.

No it isn't. ASCII is a 7-bit encoding. No valid ASCII data has the 8th bit set. UTF-8 uses 8 bits, e.g. π in UTF-8 uses two bytes: \xcf\x80 in hex, which are:

0b11001111 0b10000000

in binary. As you can see, the eighth bit is set in both of those bytes.

UTF-8 is only backwards compatible with ASCII if you limit yourself to the ASCII subset of Unicode, i.e. the 128 values between U+0000 and U+007F.

>> The disadvantage is that each code point can be one, two, three or
>> four bytes wide, and naively shuffling bytes around will invariably
>> give you invalid UTF-8 and cause data loss. So UTF-8 is not so good as
>> the in-memory representation of text strings.
>
> The in-memory representation is not an issue. It's the abstract
> semantics that are the issue.

What? You're asking about *encodings*. By definition, that means you're talking about the in-memory representation.

Dear gods man, this is like you asking "Which makes for a better car, gasoline, diesel, LPG, electric or hydrogen?" and then when I start to discuss the differences between the fuels you say "I don't care about the internal differences of the engines, I only care about controls on the dashboard".

Marko, it is times like this I think you are trolling, and come really close to just kill-filing you. You explicitly asked about encodings, so I answered your question about encodings. For you to now say that the encoding is irrelevant, well, just stop wasting my time. I don't think you are discussing this in good faith.
I think you are arguing to win, no matter how incoherent your argument becomes, so long as you "win" for some definition of winning. I don't have infinite patience for that sort of behaviour. > At the abstract level, we have the text in a human language. Neither > strings nor UTF-8 provide that so we have to settle for something > cruder. I have yet to hear why a string does a better job than UTF-8. This is not even wrong. You are comparing a data structure, string, with a mapping, UTF-8. They aren't alternatives that we get to choose between, like "strings versus ropes" or "UTF-8 versus ISO-8859-3". They are *complementary* not alternatives: we can have strings of UTF-8 encoding text, or strings of ISO-8859-3 bytes, or ropes of UTF-8 encoded text, or ropes of ISO-8859-3 bytes. To give an analogy, you're saying "I have yet to hear why cars do a better job than electric motors." > UTF-16 (used by Windows and Java, for example) is even worse than > strings and UTF-8 because: > > ? --[encode>-- Unicode --[reencode>-- UTF-16 --[reencode>-- bytes Taken at face value, this doesn't make sense. It's just gibberish. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
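Two of the technical claims in the message above can be checked directly in Python 3: that UTF-8 matches ASCII only for code points up to U+007F, and that finding the i-th code point in UTF-8 bytes means walking the data. The following is a rough sketch under stated assumptions: `code_point_at` is a hypothetical helper written for this illustration (not a standard library function), and it assumes its input is valid UTF-8.

```python
text = "aπz"
utf8 = text.encode("utf-8")

# ASCII code points encode to the same single byte in UTF-8 ...
same_for_ascii = "a".encode("utf-8") == "a".encode("ascii")

# ... but anything above U+007F becomes multi-byte, with the high bit set.
pi_bytes = "π".encode("utf-8")
high_bits = [b >> 7 for b in pi_bytes]

# Indexing a str yields a code point; indexing bytes yields one code unit.
second_code_point = text[1]
second_byte = utf8[1]


def code_point_at(data, i):
    """Hypothetical helper: return the i-th code point of valid UTF-8 bytes.

    It scans from the start, so the cost grows with i (linear time),
    unlike str indexing.
    """
    count = -1
    for offset, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a code point starts here
            count += 1
            if count == i:
                end = offset + 1
                # Extend over the continuation bytes belonging to this code point.
                while end < len(data) and data[end] & 0xC0 == 0x80:
                    end += 1
                return data[offset:end].decode("utf-8")
    raise IndexError(i)


print(pi_bytes, high_bits)
print(second_code_point, hex(second_byte))
print(code_point_at(utf8, 2))
```

Neither form of indexing yields grapheme clusters; the sketch only contrasts code points with UTF-8 code units.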
From tjreedy at udel.edu Sat Jul 15 02:27:22 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 15 Jul 2017 02:27:22 -0400 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <59696de2$0$1603$c3e8da3$5496439d@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <59696de2$0$1603$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 7/14/2017 9:20 PM, Steve D'Aprano wrote: > On Sat, 15 Jul 2017 07:12 am, Terry Reedy wrote: > >> Does go use bytes for text, like most people did in Python 2, a separate >> text string class, that hides the internal encoding format and >> implementation? In other words, if you do the equivalent of print(s) >> where s is a text string with a mixture of greek, cyrillic, hindi, >> chinese, japanese, and korean chars, do you see the characters, or some >> representation of the internal bytes? > > The answer is, its complicated. > > Go has two string types: "strings", and "runes". > > Strings are equivalent to Python 3 byte-strings, except that the language is > biased towards assuming they are UTF-8 instead of Python 3's decision to assume > they are ASCII. In other words, if you display a Python 3 byte-string, it will > display bytes that represent ASCII characters as ASCII, and everything else > escaped as a hex byte: > > py> b'\x41\xcf\x80\x5a' > b'A\xcf\x80Z' > > Go does the same, except it will display anything which is legal UTF-8 (which > may be 1, 2, 3, or 4 bytes) as a Unicode character (actually code point). > Assuming your environment is capable of displaying that character, otherwise > you'll just see a square, or some other artifact. 
> > So if Python used the same rules as Go, the above byte-string would display as: > > b'A?Z' > > Most of the time, when processing strings, Go treats them as arbitrary bytes, > although Go comes with libraries that help make it easier to work with them as > UTF-8 byte strings. > > Runes, on the other hand, are a strict superset of Unicode. Runes are strings of > 32-bit code units, so like UTF-32 except not limited to the Unicode range of > \U00000000 through \U0010FFFF. Runes will accept any 32 bit values up to > 0xFFFFFFFF. > > I presume that runes which fall within the UTF-32 range will be displayed as the > Unicode character where possible, and those which fall outside of that range as > some sort of hex display. > > So Go strings are like Python byte strings, biased towards UTF-8 but with no > guarantees made, and Go runes are a superset of Python text strings. > > Does that answer your question sufficiently? > > https://blog.golang.org/strings Yes, thank you. -- Terry Jan Reedy From p_s_d_a_s_i_l_v_a_ns at netcabo.pt Sat Jul 15 02:55:43 2017 From: p_s_d_a_s_i_l_v_a_ns at netcabo.pt (Paulo da Silva) Date: Sat, 15 Jul 2017 07:55:43 +0100 Subject: Cannot access PySide.__version__! Message-ID: Hi! The problem: import PySide print(PySide.__version__) AttributeError: 'module' object has no attribute '__version__' How can I fix this? Other PySide examples seem to work fine! Thanks for any help. Further information: /usr/lib64/python3.4/site-packages/PySide contains only .so files /usr/lib64/python3.4/site-packages/PySide-1.2 contains 2 files: __init__.py _utils.py __init__.py first lines: __all__ = ['QtCore', 'QtGui', 'QtNetwork', 'QtOpenGL', 'QtSql', 'QtSvg', 'QtTest', 'QtWebKit', 'QtScript'] __version__ = "1.2.4" __version_info__ = (1, 2, 4, "final", 0) From p_s_d_a_s_i_l_v_a_ns at netcabo.pt Sat Jul 15 03:00:16 2017 From: p_s_d_a_s_i_l_v_a_ns at netcabo.pt (Paulo da Silva) Date: Sat, 15 Jul 2017 08:00:16 +0100 Subject: Cannot access PySide.__version__! 
References: Message-ID:

At 07:55 on 15-07-2017, Paulo da Silva wrote:
> Hi!
>
> The problem:
>
> import PySide
> print(PySide.__version__)
>
> AttributeError: 'module' object has no attribute '__version__'
>
> How can I fix this?
>
> Other PySide examples seem to work fine!
>
> Thanks for any help.
>
> Further information:
> /usr/lib64/python3.4/site-packages/PySide contains only .so files
>
> /usr/lib64/python3.4/site-packages/PySide-1.2 contains 2 files:
> __init__.py _utils.py

Creating links to __init__.py _utils.py in /usr/lib64/python3.4/site-packages/PySide fixes the problem.

From marko at pacujo.net Sat Jul 15 03:50:54 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Sat, 15 Jul 2017 10:50:54 +0300
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <8760eukhy9.fsf@elektro.pacujo.net> <59697ee0$0$22140$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <871spii1ep.fsf@elektro.pacujo.net>

Steve D'Aprano :
> On Sat, 15 Jul 2017 04:10 am, Marko Rauhamaa wrote:
>> Python3's strings don't give me any better random access than UTF-8.
>
> Say what? Of course they do.
>
> Python 3 strings (since 3.3) are a compact form of UTF-32. Without loss of
> generality, we can say that each string is an array of four-byte code units.

Yes, and a UTF-8 byte array gives me random access to the UTF-8 single-byte code units.

Neither gives me random access to the "Grapheme clusters, a.k.a.real characters". For example, the HFS+ file system uses a variant of NFD for filenames, meaning both UTF-32 and UTF-8 give you random access to pure ASCII filenames only.

> UTF-8 is not: it is a variable-width encoding,

UTF-32 is a variable-width encoding as well.
For example, "baby: medium skin tone" is U+1F476 U+1F3FD: > Go ignores this problem by simply not offering random access to code > points in strings. Random access to code points is as uninteresting as random access to UTF-8 bytes. I might want random access to the "Grapheme clusters, a.k.a.real characters". As you have pointed out, that wish is impossible to grant unambiguously. Marko From rocky at gnu.org Sat Jul 15 06:35:47 2017 From: rocky at gnu.org (rocky) Date: Sat, 15 Jul 2017 03:35:47 -0700 (PDT) Subject: ANN: Python bytecode assembler, xasm Message-ID: <6cf5110e-ac91-4b77-9517-be3f1fae7b9a@googlegroups.com> I may regret this, but there is a very alpha Python bytecode assembler. https://pypi.python.org/pypi/xasm From steve+python at pearwood.info Sat Jul 15 07:08:38 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sat, 15 Jul 2017 21:08:38 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <8760eukhy9.fsf@elektro.pacujo.net> <59697ee0$0$22140$c3e8da3$5496439d@news.astraweb.com> <871spii1ep.fsf@elektro.pacujo.net> Message-ID: <5969f7b9$0$1620$c3e8da3$5496439d@news.astraweb.com> On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: > Steve D'Aprano : > >> On Sat, 15 Jul 2017 04:10 am, Marko Rauhamaa wrote: >>> Python3's strings don't give me any better random access than UTF-8. >> >> Say what? Of course they do. >> >> Python 3 strings (since 3.3) are a compact form of UTF-32. Without loss of >> generality, we can say that each string is an array of four-byte code units. > > Yes, and a UTF-8 byte array gives me random access to the UTF-8 > single-byte code units. Which is irrelevant. Single code units in UTF-8 aren't important. 
Nobody needs to start a slice in the middle byte of a three byte code point in UTF-8. It's not a useful operation, and allowing slices to occur at arbitrary positions inside UTF-8 sequences means you soon won't have valid UTF-8 any more. Now since I am interested in a good faith discussion, I can even point out something that supports your argument: perhaps we could introduce restrictions on where you can slice, and ensure that they only occur at code point boundaries. So if you try to slice string[100:120], say, what you actually get is string[98:119] because that's where the nearest code point boundaries fall. Or should it move forward? string[101:122], say. Perhaps the Zen of Python is better: when faced with ambiguity, avoid the temptation to guess. We should either prohibit slicing anywhere except on a code point boundary, or better still use a data structure that doesn't expose the internal implementation of code points. Whichever way we go, it doesn't get us any closer to our ultimate aim, which is a text data type based on graphemes rather than code points. All it does is give us what Python's unicode strings already give us: code points. So what does that extra complexity forced on us by UTF-8 give us, apart from a headache? Why use UTF-8? > Neither gives me random access to the "Grapheme clusters, a.k.a.real > characters". For example, the HFS+ file system stores uses a variant of > NFD for filenames meaning both UTF-32 and UTF-8 give you random access > to pure ASCII filenames only. And they're not graphemes either. Normalisation doesn't give you graphemes. It's ironic that you give the example of Apple using NFD, since that makes the problem you are railing against *worse* rather than better. Decomposition has its uses, but the specific problem this thread started with is made worse due to decomposition. >> UTF-8 is not: it is a variable-width encoding, > > UTF-32 is a variable-width encoding as well. No it isn't. 
All code points are exactly one four-byte code unit in size. > For example, "baby: medium skin tone" is U+1F476 U+1F3FD: That's two code points, not one. Variation selectors present the same issues as combining characters. > > >> Go ignores this problem by simply not offering random access to code >> points in strings. > > Random access to code points is as uninteresting as random access to > UTF-8 bytes. I have random access to code points in Python right now, and I use it all the time to extract code points and even build up new strings from slices. I wouldn't do that with UTF-8 bytes, it's too bloody hard. > I might want random access to the "Grapheme clusters, a.k.a.real > characters". That would be nice to have, but the truth is that for most coders, Unicode code points are the low-hanging fruit that get you 95% of the way, and for many applications that's "close enough". Support for the Unicode grapheme breaking algorithm would get you probably 90% of the rest of the way. And then some sort of configurable system where defaults were based on the locale would probably get you a fairly complete grapheme-based text library. I'm interested in such a thing. That's why I pointed out the issue on the bug tracker, to try to garner interest in it. As far as I can tell, you seem to be more interested in cheap point scoring, digs against Unicode, and an insistence that UTF-8 is better than strings (which doesn't even make sense). > As you have pointed out, that wish is impossible to grant > unambiguously. I never said that. Just because it is *difficult*, and that no one answer will satisfy everyone all of the time, doesn't mean we can't solve the problem. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From lepto.python at gmail.com Sat Jul 15 07:35:56 2017 From: lepto.python at gmail.com (oyster) Date: Sat, 15 Jul 2017 19:35:56 +0800 Subject: is @ operator popular now? Message-ID: as the title says. 
has @ been used in projects?

From m at funkyhat.org Sat Jul 15 08:05:46 2017
From: m at funkyhat.org (Matt Wheeler)
Date: Sat, 15 Jul 2017 12:05:46 +0000 (UTC)
Subject: is @ operator popular now?
In-Reply-To: References: Message-ID:

On Sat, 15 Jul 2017, 12:35 oyster, wrote:
> as the title says. has @ been used in projects?
>

Strictly speaking, @ is not an operator. It delimits a decorator statement (in python statements and operations are not the same thing).

However, to answer the question you actually asked, yes, all the time.

For specific examples, see:
pytest's fixtures
contextlib.contextmanager (makes creating context managers much simpler in most cases)
@property @classmethod etc. etc. (I sometimes see these used a bit too freely, when a plain attribute or a function at the module level would be more appropriate)

> --
--
Matt Wheeler
http://funkyh.at

From christian at python.org Sat Jul 15 08:49:06 2017
From: christian at python.org (Christian Heimes)
Date: Sat, 15 Jul 2017 14:49:06 +0200
Subject: is @ operator popular now?
In-Reply-To: References: Message-ID:

On 2017-07-15 14:05, Matt Wheeler wrote:
> On Sat, 15 Jul 2017, 12:35 oyster, wrote:
>
>> as the title says. has @ been used in projects?
>>
>
> Strictly speaking, @ is not an operator.
> It delimits a decorator statement (in python statements and operations are
> not the same thing).
> However, to answer the question you actually asked, yes, all the time.

@ is an actual operator in Python. It was added in Python 3.5 as an infix matrix multiplication operator, e.g.

m3 = m1 @ m2

The operator is defined in PEP 465, https://www.python.org/dev/peps/pep-0465/

Christian

From __peter__ at web.de Sat Jul 15 09:05:24 2017
From: __peter__ at web.de (Peter Otten)
Date: Sat, 15 Jul 2017 15:05:24 +0200
Subject: is @ operator popular now?
References: Message-ID:

Matt Wheeler wrote:

>> as the title says. has @ been used in projects?

numpy, probably?

> Strictly speaking, @ is not an operator.
In other words it's not popular, not even widely known. Compare: $ python3.4 -c '__peter__ at web.de' File "", line 1 __peter__ at web.de ^ SyntaxError: invalid syntax $ python3.5 -c '__peter__ at web.de' Traceback (most recent call last): File "", line 1, in NameError: name '__peter__' is not defined Starting with 3.5 my email address is valid Python syntax. Now I'm waiting for the __peter__ builtin ;) From rosuav at gmail.com Sat Jul 15 09:11:35 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 15 Jul 2017 23:11:35 +1000 Subject: is @ operator popular now? In-Reply-To: References: Message-ID: On Sat, Jul 15, 2017 at 11:05 PM, Peter Otten <__peter__ at web.de> wrote: > Matt Wheeler wrote: > >>> as the title says. has @ been used in projects? > > numpy, probably? > >> Strictly speaking, @ is not an operator. > > In other words it's not popular, not even widely known. > > Compare: > > $ python3.4 -c '__peter__ at web.de' > File "", line 1 > __peter__ at web.de > ^ > SyntaxError: invalid syntax > $ python3.5 -c '__peter__ at web.de' > Traceback (most recent call last): > File "", line 1, in > NameError: name '__peter__' is not defined > > Starting with 3.5 my email address is valid Python syntax. Now I'm waiting > for the __peter__ builtin ;) And you'll have to 'import web' too. I've no idea what 'web.de' would be and what happens when you matmul it by you. 
ChrisA From marko at pacujo.net Sat Jul 15 10:01:21 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Sat, 15 Jul 2017 17:01:21 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <8760eukhy9.fsf@elektro.pacujo.net> <59697ee0$0$22140$c3e8da3$5496439d@news.astraweb.com> <871spii1ep.fsf@elektro.pacujo.net> <5969f7b9$0$1620$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87shhxu7da.fsf@elektro.pacujo.net> Steve D'Aprano : > On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: >> I might want random access to the "Grapheme clusters, a.k.a.real >> characters". > > That would be nice to have, but the truth is that for most coders, > Unicode code points are the low-hanging fruit that get you 95% of the > way, and for many applications that's "close enough". I think "close enough" is actually dangerous. We shouldn't encourage that practice. > Support for the Unicode grapheme breaking algorithm would get you > probably 90% of the rest of the way. And then some sort of > configurable system where defaults were based on the locale would > probably get you a fairly complete grapheme-based text library. Yes, that kind of a text class would be useful. > I'm interested in such a thing. That's why I pointed out the issue on > the bug tracker, to try to garner interest in it. As far as I can > tell, you seem to be more interested in cheap point scoring, digs > against Unicode, and an insistence that UTF-8 is better than strings > (which doesn't even make sense). It does seem to me UTF-8 is a better waiting position than strings. Strings give you more trouble while not truly solving any problems. 
Marko From __peter__ at web.de Sat Jul 15 10:03:06 2017 From: __peter__ at web.de (Peter Otten) Date: Sat, 15 Jul 2017 16:03:06 +0200 Subject: is @ operator popular now? References: Message-ID: Chris Angelico wrote: > On Sat, Jul 15, 2017 at 11:05 PM, Peter Otten <__peter__ at web.de> wrote: >> Matt Wheeler wrote: >> >>>> as the title says. has @ been used in projects? >> >> numpy, probably? >> >>> Strictly speaking, @ is not an operator. >> >> In other words it's not popular, not even widely known. >> >> Compare: >> >> $ python3.4 -c '__peter__ at web.de' >> File "", line 1 >> __peter__ at web.de >> ^ >> SyntaxError: invalid syntax >> $ python3.5 -c '__peter__ at web.de' >> Traceback (most recent call last): >> File "", line 1, in >> NameError: name '__peter__' is not defined >> >> Starting with 3.5 my email address is valid Python syntax. Now I'm >> waiting for the __peter__ builtin ;) > > And you'll have to 'import web' too. > > I've no idea what 'web.de' would be and what happens when you matmul it by > you. > > ChrisA This is getting more complex than expected. Here's a prototype: import builtins def __peter__(): class Provider: def __init__(self, name): self.name = name def __getattr__(self, name): return Provider(f"{self.name}.{name}") def __rmatmul__(self, user): assert user.email.endswith("@" + self.name) return user class User: def __init__(self, email): self.email = email user, at, site = email.partition("@") name = site.partition(".")[0] setattr(builtins, name, Provider(name)) def __repr__(self): return self.email return User("__peter__ at web.de") builtins.__peter__ = __peter__() del __peter__ $ python3.7 -i web.py >>> __peter__ at web.de __peter__ at web.de I'm sure you won't question the feature's usefulness after this. Future versions may send me an email or wipe your hard disk at my discretion... 
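Setting the joke aside, the mechanism the prototype relies on is the same one PEP 465 defines: `a @ b` calls `a.__matmul__(b)`, falling back to `b.__rmatmul__(a)`. Below is a small sketch of the ordinary use, assuming Python 3.6 or later; `Matrix2` is a hypothetical toy class invented for this illustration, not part of any library.

```python
class Matrix2:
    """Hypothetical 2x2 matrix, only here to show the @ hook."""

    def __init__(self, a, b, c, d):
        self.rows = ((a, b), (c, d))

    def __matmul__(self, other):
        # Called for `self @ other`: plain 2x2 matrix multiplication.
        (a, b), (c, d) = self.rows
        (e, f), (g, h) = other.rows
        return Matrix2(a * e + b * g, a * f + b * h,
                       c * e + d * g, c * f + d * h)

    def __eq__(self, other):
        return isinstance(other, Matrix2) and self.rows == other.rows

    def __repr__(self):
        return f"Matrix2{self.rows}"


m1 = Matrix2(1, 2, 3, 4)
m2 = Matrix2(0, 1, 1, 0)   # swaps the two columns of the left operand
m3 = m1 @ m2
print(m3)
```

In real projects the operator is mostly seen with array libraries such as numpy, as suggested earlier in the thread.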
From rantingrickjohnson at gmail.com Sat Jul 15 10:04:21 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 07:04:21 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: <9db4f037-876b-45ce-b5c1-4a1a2231376c@googlegroups.com> On Friday, July 14, 2017 at 2:40:43 AM UTC-5, Chris Angelico wrote: [...] > IMO the Python str type is adequate as a core data type. What we may > need, though, is additional utility functions, eg: > > * unicodedata.grapheme_clusters(str) - split str into a sequence of > grapheme clusters > * pango.get_text_extents(str) - measure the pixel dimensions of a line of text > * platform.punish_user() - issue a platform-dependent response (such > as an electric shock, a whack with a 2x4, or a dropped anvil) on > someone who has just misunderstood Unicode again > * socket.punish_user() - as above, but to the user at the opposite end > of a socket Chris's violent nature is obviously due to him watching so many looney tunes episodes, that he believes an anvil to the head causes no damage. This is not a cartoon Chris! From rantingrickjohnson at gmail.com Sat Jul 15 10:08:12 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 07:08:12 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: On Friday, July 14, 2017 at 2:40:43 AM UTC-5, Chris Angelico wrote: > [...] > What is the length of a string? How often do you actually > care about the number of grapheme clusters - and not, for > example, about the pixel width? (To columnate text, for > instance, you need to know about its width in pixels or > millimeters, not the number of characters in the line.) 
Not in the case of a fixed width font!

> And if you're going to group code points together because
> some of them are combining characters, would you also group
> them together because there's a zero-width joiner in the
> middle? The answer will sometimes be "yes of course" and
> sometimes "of course not".

Consistency is the key. And we must remember that he who assembled such inconsistent strings can only blame herself.

From rantingrickjohnson at gmail.com Sat Jul 15 10:31:35 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Sat, 15 Jul 2017 07:31:35 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com>
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Friday, July 14, 2017 at 12:43:50 PM UTC-5, Steve D'Aprano wrote:

> Before you answer, does your answer apply to Arabic and
> Thai as well as Western European languages?

I find it interesting that those who bellyache the loudest about the "inclusivity of regional character encodings" never dabble much outside their _own_ basic English set. For instance: I never hear Chinese or eastern Europeans bellyaching about how ASCII forced them to use a standard keyboard and denied them the "gawd given right" to become an amateur space cadet[1]! Nope, they just learn English and move on.

> [...]
> > As for the legacy encodings: > > - they're not 7-bit clean, except for ASCII; > > - some of them are variable-width; > > - none of them support the full range of Unicode, so they > aren't universal character sets; > > - in other words, you either resign yourself to being > unable to exchange documents with other people, resign > yourself to dealing with moji-bake, or invent some complex > and non-backwards-compatible in-band mechanism for > switching charsets; > > - they suffer from the exact same problems as Unicode > regarding the distinction between code points and > graphemes; > > - so not only do they lack the advantages of Unicode, but > they have even more disadvantages. Thanks for finally admitting that Unicode is not the cure all that you unicode cultist make it out to be. [1] Possibly with the exception of Xan Lee. ;-). BTW, what happened to the old chap? From rosuav at gmail.com Sat Jul 15 10:31:38 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 00:31:38 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87shhxu7da.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <8760eukhy9.fsf@elektro.pacujo.net> <59697ee0$0$22140$c3e8da3$5496439d@news.astraweb.com> <871spii1ep.fsf@elektro.pacujo.net> <5969f7b9$0$1620$c3e8da3$5496439d@news.astraweb.com> <87shhxu7da.fsf@elektro.pacujo.net> Message-ID: On Sun, Jul 16, 2017 at 12:01 AM, Marko Rauhamaa wrote: > Steve D'Aprano : > >> On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: >>> I might want random access to the "Grapheme clusters, a.k.a.real >>> characters". 
>>
>> That would be nice to have, but the truth is that for most coders,
>> Unicode code points are the low-hanging fruit that get you 95% of the
>> way, and for many applications that's "close enough".
>
> I think "close enough" is actually dangerous. We shouldn't encourage
> that practice.
>
>> Support for the Unicode grapheme breaking algorithm would get you
>> probably 90% of the rest of the way. And then some sort of
>> configurable system where defaults were based on the locale would
>> probably get you a fairly complete grapheme-based text library.

Okay. So here's your challenge: don't get "close enough", get perfect. Divide the following strings into "characters" by your definition; give me a list of one-character strings. Make sure you are perfect and consistent. I'll start with an easy one.

1) "Giờ\u00A0ra\u00A0đi, một\u00A0mình\u00A0ta"
2) "לעזוב, לעזוב"
3) "اطلقي سرك"
4) "「別讓他們進來看見」"
5) "다 잊어 다 잊어"

Your locale, should this matter, is your choice of en_AU.utf8, en_US.utf8, tr_TR.utf8, or sv_SE.utf8.

In case the information is lost in transmission, here are the same strings, as sequences of codepoints.

1) U+0047 U+0069 U+1EDD U+00A0 U+0072 U+0061 U+00A0 U+0111 U+0069 U+002C U+0020 U+006D U+1ED9 U+0074 U+00A0 U+006D U+00EC U+006E U+0068 U+00A0 U+0074 U+0061
2) U+05DC U+05E2 U+05D6 U+05D5 U+05D1 U+002C U+0020 U+05DC U+05E2 U+05D6 U+05D5 U+05D1
3) U+0627 U+0637 U+0644 U+0642 U+064A U+0020 U+0633 U+0631 U+0643
4) U+300C U+5225 U+8B93 U+4ED6 U+5011 U+9032 U+4F86 U+770B U+898B U+300D
5) U+B2E4 U+0020 U+C78A U+C5B4 U+0020 U+1103 U+1161 U+0020 U+110B U+1175 U+11BD U+110B U+1165

Once this is solved, you can propose adding an iteration function that follows these rules. Probably to the unicodedata module, although it'd most likely have to go via PyPI first.
ChrisA From rosuav at gmail.com Sat Jul 15 10:49:57 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 00:49:57 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: On Sun, Jul 16, 2017 at 12:08 AM, Rick Johnson wrote: > On Friday, July 14, 2017 at 2:40:43 AM UTC-5, Chris Angelico wrote: >> [...] >> What is the length of a string? How often do you actually >> care about the number of grapheme clusters - and not, for >> example, about the pixel width? (To columnate text, for >> instance, you need to know about its width in pixels or >> millimeters, not the number of characters in the line.) > > Not in the case of a fixed width font! Yes, of course. How silly of me. Hold on, let me just grab my MUD client, which is already using a fixed width font... Here's a piece of text, copied and pasted straight from the client. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ??????? ????? ?? ?????? ??????? ???? ?? ????? ??, ?? ???? U+1680 is "?" U+200B is "" U+180E is "?" ? ?? ?? ????? -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- And here's how it renders. http://imgur.com/1xTT1s0 It's so easy! Monospaced fonts solve everything. Every single character gets the exact same number of pixels of width, because that's how the standard stipulates it. >> And if you're going to group code points together because >> some of them are combining characters, would you also group >> them together because there's a zero-width joiner in the >> middle? The answer will sometimes be "yes of course" and >> sometimes "of course not". > > Consistency is the key. And we must remember that he who > assembled such inconsistent strings can only blame herself. Except that it's the same string in different contexts. There is no inconsistency in the string. 
ChrisA From steve+python at pearwood.info Sat Jul 15 11:04:12 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sun, 16 Jul 2017 01:04:12 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <8760eukhy9.fsf@elektro.pacujo.net> <59697ee0$0$22140$c3e8da3$5496439d@news.astraweb.com> <871spii1ep.fsf@elektro.pacujo.net> <5969f7b9$0$1620$c3e8da3$5496439d@news.astraweb.com> <87shhxu7da.fsf@elektro.pacujo.net> Message-ID: <596a2eee$0$1597$c3e8da3$5496439d@news.astraweb.com> On Sun, 16 Jul 2017 12:01 am, Marko Rauhamaa wrote: > It does seem to me UTF-8 is a better waiting position than strings. > Strings give you more trouble while not truly solving any problems. /face-palm Okay, that's it, this conversation is over. You have no clue what you are talking about. http://rationalwiki.org/wiki/Not_even_wrong http://rationalwiki.org/wiki/Category_mistake -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From jonathan.blanck89 at gmail.com Sat Jul 15 12:40:24 2017 From: jonathan.blanck89 at gmail.com (jonathan.blanck89 at gmail.com) Date: Sat, 15 Jul 2017 09:40:24 -0700 (PDT) Subject: "Edit with IDLE" doesn't work any more ? In-Reply-To: References: Message-ID: <92d7c734-d81e-4427-8160-113f82a92068@googlegroups.com> Am Freitag, 28. April 2017 14:48:22 UTC+2 schrieb Yip, Kin: > Hi, > > I've finally known why .... By chance, I went to the installation directory : C:\Program Files\Python36\Lib\tkinter > > to check on files. I did "EDIT with IDLE" on any files there. It all works ! Then, I went back to my directory > where I put all my personal .py codes. It didn't work there. 
Finally, I've guessed and realized/tested that > "EDIT with IDLE" doesn't work in my python directory because I have just recently made a file called : > > tkinter.py > > > Somehow, this stops "EDIT with IDLE" from working if I try to "EDIT with IDLE" on any files in that directory/folder. > > After I rename it to mytkinter.py , things work normally now ! > > Weird ! Don't know exactly why ...?! > > Sorry to bother you guys ... > > Kin you da real MVP! From m at funkyhat.org Sat Jul 15 13:09:05 2017 From: m at funkyhat.org (Matt Wheeler) Date: Sat, 15 Jul 2017 17:09:05 +0000 (UTC) Subject: is @ operator popular now? In-Reply-To: References: Message-ID: On Sat, 15 Jul 2017, 13:49 Christian Heimes, wrote: > @ is an actual operator in Python. It was added in Python 3.5 as infix > matrix multiplication operator, e.g. > > m3 = m1 @ m2 > TIL The operator is defined in PEP 465, > https://www.python.org/dev/peps/pep-0465/ Perhaps it should also be listed at https://docs.python.org/3.6/genindex-Symbols.html -- -- Matt Wheeler http://funkyh.at From gbs.deadeye at gmail.com Sat Jul 15 13:59:33 2017 From: gbs.deadeye at gmail.com (Andre Müller) Date: Sat, 15 Jul 2017 17:59:33 +0000 Subject: pyserial and end-of-line specification In-Reply-To: References: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com> Message-ID: Just take a look into the documentation: https://docs.python.org/3/library/io.html#io.TextIOWrapper And in the example of Pyserial: http://pyserial.readthedocs.io/en/latest/shortintro.html#eol I think it should be: sio = io.TextIOWrapper(io.BufferedRWPair(ser, ser), newline='yourline_ending') But the documentation of Python says: Warning BufferedRWPair does not attempt to synchronize accesses to its underlying raw streams. You should not pass it the same object as reader and writer; use BufferedRandom instead.
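A quick way to check what the newline argument does, without a serial port attached, is to wrap an in-memory stream. In this minimal sketch io.BytesIO stands in for the `ser` object (an assumption made purely for illustration, not how pyserial is used in practice), and `read_lines` is a hypothetical helper. Note that newline accepts only None, '', '\n', '\r' or '\r\n'; any other string is rejected with ValueError.

```python
import io


def read_lines(data, eol):
    """Decode bytes and split them on the given line ending.

    io.BytesIO is only a stand-in for a real serial port object.
    """
    raw = io.BytesIO(data)
    sio = io.TextIOWrapper(io.BufferedReader(raw),
                           encoding="ascii", newline=eol)
    # Each line keeps its terminator, so strip it off here.
    return [line.rstrip(eol) for line in sio]


print(read_lines(b"first\rsecond\rthird", "\r"))

try:
    io.TextIOWrapper(io.BytesIO(b""), newline="yourline_ending")
except ValueError as exc:
    print("rejected:", exc)
```

For a device that ends lines with something other than these values, the splitting would have to be done by hand on the bytes read from the port.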
Maybe you should also try: sio = io.TextIOWrapper(io.BufferedRandom(ser), newline='yourline_ending') If it's readonly: sio = io.TextIOWrapper(io.BufferedReader(ser), newline='yourline_ending') I never tried it, but your question leads me to take a look into these cool features of the io module. Greetings Andre From mikhailwas at gmail.com Sat Jul 15 18:38:42 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Sun, 16 Jul 2017 00:38:42 +0200 Subject: Grapheme clusters, a.k.a.real characters Message-ID: On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: > Random access to code points is as uninteresting as random access to > UTF-8 bytes. > I might want random access to the "Grapheme clusters, a.k.a.real > characters". What _real_ characters are you referring to? If your data has "á" (U00E1), then it is one real character, if you have "a" (U0061) and "ˊ" (U02CA) then it is _two_ real characters. So in both cases you have access to code points = real characters. For metaphysical discussion - in _my_ definition there is no such "real" character as "á", since it is the "a" glyph with some dirt, so according to my definition, it should be two separate characters, both semantically and technically seen. And, in my definition, the whole Unicode is a huge junkyard, to start with. But opinions may vary, and in case you prefer or are forced to write "á", then it can be impractical to store it as two characters, regardless of encoding. Mikhail From greg.ewing at canterbury.ac.nz Sat Jul 15 19:50:06 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Sun, 16 Jul 2017 11:50:06 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: Chris Angelico wrote: > Hold on, let me just grab my MUD > client, which is already using a fixed width font...
> > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- > ??????? > ????? ?? ?????? ??????? > ???? ?? ????? ??, ?? ???? > U+1680 is "?" > U+200B is "" > U+180E is "?" > ? ?? ?? ????? > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- I suspect that different lines in that example are actually being rendered in different fonts. Characters within the *same* monospaced font should have the same width (otherwise it's not really a monospaced font!), but there are no guarantees between different fonts. Perhaps the meta-problem here is that Unicode being so big has made it impractical to have a single font that encompasses all the characters you might ever want to render, so you often have to make do with a hodgepodge of fonts that don't play well together. -- Greg From steve+python at pearwood.info Sat Jul 15 20:26:39 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sun, 16 Jul 2017 10:26:39 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> Message-ID: <596ab2c1$0$1589$c3e8da3$5496439d@news.astraweb.com> On Sun, 16 Jul 2017 12:31 am, Rick Johnson wrote: > I never hear Chinese or eastern Europeans > bellyaching Do you speak much to Chinese and Eastern Europeans who don't speak or write English? How would you know what they say? "All toup?es are bad. I've never seen a good one that looked real." http://rationalwiki.org/wiki/Toupee_fallacy -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
From rosuav at gmail.com Sat Jul 15 20:28:40 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 10:28:40 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: On Sun, Jul 16, 2017 at 9:50 AM, Gregory Ewing wrote: > Chris Angelico wrote: >> >> Hold on, let me just grab my MUD >> client, which is already using a fixed width font... >> >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >> ??????? >> ????? ?? ?????? ??????? >> ???? ?? ????? ??, ?? ???? >> U+1680 is "?" >> U+200B is "" >> U+180E is "?" >> ? ?? ?? ????? >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- > > > I suspect that different lines in that example are actually > being rendered in different fonts. Characters within the *same* > monospaced font should have the same width (otherwise it's not > really a monospaced font!), but there are no guarantees between > different fonts. > > Perhaps the meta-problem here is that Unicode being so big has > made it impractical to have a single font that encompasses all > the characters you might ever want to render, so you often have > to make do with a hodgepodge of fonts that don't play well > together. That could explain some of it. However, Chinese characters have a well-defined space which is significantly wider than most monospaced fonts would use for Latin characters, so it would look ugly for most text in Western European languages. Also, that doesn't deal with U+200B or U+180E, which have well-defined widths *smaller* than typical Latin letters. (200B is a zero-width space. Is it a character?) Hebrew text is rendered right-to-left, which makes columnar alignment *very* interesting. 
Arabic text, in addition to being RTL, is written in a joined/running style, so individual letters aren't rendered the same way that an entire word is. And in the Korean example, half the glyphs are represented as composed syllables (U+B2E4 HANGUL SYLLABLE DA) and half are decomposed letters (U+1103 HANGUL CHOSEONG TIKEUT followed by U+1161 HANGUL JUNGSEONG A). These are not combining characters - they are legitimate characters in their own right. (At least, I can't find anything in the Unicode data files that indicates that they aren't letters. I can use them individually in Python identifiers, for instance.) So even if someone were to create a single font with every Unicode character represented, it couldn't actually give every character the same width, because that would result in incorrect rendering for many scripts. ChrisA From rantingrickjohnson at gmail.com Sat Jul 15 21:20:30 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 18:20:30 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> Message-ID: <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> On Saturday, July 15, 2017 at 7:29:14 PM UTC-5, Chris Angelico wrote: > [...] Also, that doesn't deal with > U+200B or U+180E, which have well-defined widths *smaller* than > typical Latin letters. (200B is a zero-width space. Is it a > character?) Of *COURSE* it's a character. Would you also consider 0 not to be a number? Sheesh! When call the `len()` function on a string containing only three "zero-width unicode chars", i want `len` to return the integer 3 not 0! In what upside-down/inside-out universe would you prefer that `len` lie to you and return 0? You can't be serious... Doth not a string containing three characters have a length of 3? And if not, what other length could it have? 
Doth not a knapsack containing 3 items have a quantity of 3? And if not, what other quantity could it have? You seem to want this fine group to believe that if the 3 items in the knapsack are _visible_ to the naked eye (say, three apples), then they are relevant to the quantity. But what if the three objects in the knapsack are, say, radiowaves -- yep, three radiowaves bouncing around inside a knapsack -- are we to believe that the knapsack is empty? And if we are, then every scientist and mathematician since antiquity shall be rolling over in their graves. Furthermore, why should the storage API and the display API give a monkey's toss about the other, when they are obviously "two sides of a mountain". From rosuav at gmail.com Sat Jul 15 21:32:16 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 11:32:16 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> Message-ID: On Sun, Jul 16, 2017 at 11:20 AM, Rick Johnson wrote: > On Saturday, July 15, 2017 at 7:29:14 PM UTC-5, Chris Angelico wrote: >> [...] Also, that doesn't deal with >> U+200B or U+180E, which have well-defined widths *smaller* than >> typical Latin letters. (200B is a zero-width space. Is it a >> character?) > > Of *COURSE* it's a character. > > Would you also consider 0 not to be a number? > > Sheesh! Exactly. That's my point. Even in a monospaced font, U+200B is a character, yet it is by rule a zero-width character. So even in a monospaced font, some characters must vary in width. 
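The gap between "number of code points" and "number of columns" can be sketched in a few lines. This is only a rough approximation under stated assumptions: the helper `naive_columns` is hypothetical, it treats nonspacing marks, enclosing marks and format codes as zero-width and East Asian wide or fullwidth code points as two columns, and a real terminal or font may disagree.

```python
import unicodedata


def naive_columns(text):
    """Rough column count for monospaced output (an approximation)."""
    total = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue                # marks and format codes: no width
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2              # wide or fullwidth
        else:
            total += 1
    return total


with_zwsp = "a\u200bb"              # 'a', ZERO WIDTH SPACE, 'b'
print(len(with_zwsp), naive_columns(with_zwsp))   # 3 code points, 2 columns

cjk = "\u4e2d"                      # a single CJK ideograph
print(len(cjk), naive_columns(cjk))               # 1 code point, 2 columns
```

Both answers are "correct"; they simply answer different questions, which is the distinction between storage and display being argued over here.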
ChrisA From python at mrabarnett.plus.com Sat Jul 15 21:54:06 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 16 Jul 2017 02:54:06 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> Message-ID: <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> On 2017-07-16 02:20, Rick Johnson wrote: > On Saturday, July 15, 2017 at 7:29:14 PM UTC-5, Chris Angelico wrote: >> [...] Also, that doesn't deal with >> U+200B or U+180E, which have well-defined widths *smaller* than >> typical Latin letters. (200B is a zero-width space. Is it a >> character?) > > Of *COURSE* it's a character. > > Would you also consider 0 not to be a number? > > Sheesh! > [snip] You need to be careful about the terminology. Is linefeed a character? You might call it a "control character", but it's not really a _character_, it's control/format _code_. Is an acute accent a character? No, it's a diacritic mark that's added to a character. When you're working with Unicode strings, you're not working with strings of characters as such, but with strings of 'codepoints', some of which are characters, others combining marks, yet others format codes, and so on. 
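The different kinds of code point mentioned above can be inspected from Python itself. A minimal sketch, using only the standard unicodedata module (the helper name `describe` is made up for illustration):

```python
import unicodedata


def describe(ch):
    """Return the code point label and its Unicode general category."""
    return "U+%04X" % ord(ch), unicodedata.category(ch)


for ch in ("a", "\n", "\u0301", "\u200b"):
    print(describe(ch))

# ('U+0061', 'Ll')  lowercase letter
# ('U+000A', 'Cc')  control code
# ('U+0301', 'Mn')  nonspacing (combining) mark
# ('U+200B', 'Cf')  format code
```

All four count as one item each for len() and indexing, whatever one chooses to call them.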
From rantingrickjohnson at gmail.com Sat Jul 15 22:01:32 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 19:01:32 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596ab2c1$0$1589$c3e8da3$5496439d@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> <5968f65b$0$1585$c3e8da3$5496439d@news.astraweb.com> <596ab2c1$0$1589$c3e8da3$5496439d@news.astraweb.com> Message-ID: <6786c451-73ca-43f7-94d6-abc338a2ca3c@googlegroups.com> On Saturday, July 15, 2017 at 7:55:46 PM UTC-5, Steve D'Aprano wrote: > On Sun, 16 Jul 2017 12:31 am, Rick Johnson wrote: > > > I never hear Chinese or eastern Europeans > > bellyaching > > Do you speak much to Chinese and Eastern Europeans who > don't speak or write English? How would you know what they > say? > > "All toup?es are bad. I've never seen a good one that looked real." > > http://rationalwiki.org/wiki/Toupee_fallacy A good retort! But not airtight, i'm afraid. Here, allow me to explain... The implication of the Toupee Fallacy is that one cannot ever discover a "good toupee", since "good toupees" would be indistinguishable from _real_ hair. Which is true, however, the Toupee Fallacy also applies inversely... What i mean is that your implicit implication that i am unable to discover "good toupees", and therefore unable to quantify them, also applies to your inability to prove that "Good Toupees" even exist. Sure, we can _assume_ that "Good Toupees" exist, but such a conjecture would never be _scientific_. Therefore, the Toupee Fallacy is invalid as a weapon of debate because it relies on the unproved premise that "Good Toupees" even exist. Isn't that ironic? Dontcha think? 
[1] Save that the experimenter yanked on the hair of every person she encountered, which of course is not polite, so we will safely assume that such techniques, while arguably 100% scientific, were not used during the "Toupee Fallacy Study". Which incidentally is why the media never dubbed it the "Toupee Terror Attacks". But i digress... From ben+python at benfinney.id.au Sat Jul 15 22:33:10 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 16 Jul 2017 12:33:10 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> Message-ID: <854lud861l.fsf@benfinney.id.au> MRAB writes: > You need to be careful about the terminology. Definitely agreed. > > Is linefeed a character? You might call it a "control character", but > it's not really a _character_, it's control/format _code_. And yet the ASCII and Unicode standard says code point 0x0A (U+000A LINE FEED) is a character, by definition. Rather than saying ?no, it's not a character?, I think a more accurate statement would be: a linefeed *is* a character in ASCII, but that doesn't mean every other standard must agree. Indeed it may be better to say: a line feed is a character and is also a control code. > Is an acute accent a character? Yes, according to Unicode. ??? (U+0301 ACUTE ACCENT) is a character. > No, it's a diacritic mark that's added to a character. Lose the ?no?, and I agree. The acute accent is a character and *also* is a diacritic mark that is added to a character. Unicode categorises U+0301 is a character in the categories ?symbol? and ?modifier?. Note that those are not exclusive. It's entirely reasonable for a concept to fit in multiple categories simultaneously. 
What is being revealed in this discussion is the folly of insisting on exclusive categories for everything, and that terms must have exactly one meaning. You are correct that we need to be clear which definition is being used. But we cannot thereby say that other, different, definitions are *necessarily* wrong. That is an extra claim that would need to be demonstrated, and the mere fact of the difference is not sufficient. -- \ ?It's dangerous to be right when the government is wrong.? | `\ ?Francois Marie Arouet Voltaire | _o__) | Ben Finney From rantingrickjohnson at gmail.com Sun Jul 16 00:25:20 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 21:25:20 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> Message-ID: <84c8171e-9736-400a-b411-e0dadc61e76a@googlegroups.com> On Saturday, July 15, 2017 at 8:54:40 PM UTC-5, MRAB wrote: > You need to be careful about the terminology. You are correct. I admit I was a little loose with my terms there. > Is linefeed a character? Since LineFeed is the same as NewLine, then yes, IMO, linefeed is a character. > You might call [linefeed] a "control character", but it's > not really a _character_, it's control/format _code_. True. Allow me try and define some concrete terms that we can use. In the old days, long before i was born, and even long before i downloaded my first compiler (ah the memories!), the concept of strings was so much simpler. Yep, back in those days all you had was, basically, two discreate sub components of a string: the "actual chars" and the "virtual chars". 
(Disambiguation) The "actual chars"[1] are any chars that a programmer could insert by pressing a single key on the keyboard, such as: "1", "2", "3", "a", "b", "c" , "!", "@", "#" -- etc.. The "virtual chars" -- or the "control codes" as you put it (the ones that start with a "\") -- are the chars that represent "structural elements" of the string (f.i. \n, \t, etc..). But in reality, the implementation of strings has complicated the idea of "virtual chars as solely structural elements" of the display, by including such absurdities as: (1) Sounds ("\a") (2) Virtual interactions such as: BackSpace("\b"), CarriageReturn ("\r") and FormFeed ("\f") intermixed with control codes that constitute _actual_ structural elements such as: (1) LineFeed or NewLine ("\n") (2) HorizontalTab ("\t") (3) VerticalTab ("\v") And a few other non-structural codes that allow embedding delimiters or hex or octals. And furthermore, two distinct "realms", if i may, in which a string can exist: the "virtual character realm" and the "display realm". (Disambiguation) The "virtual character realm" is sort of like an operating room where a doctor (aka: programmer) performs operations on the patient (aka: string), or if you like, a castle where a mad scientist builds a Unicode monster from a hodgepodge of body parts he stole from local grave yards and is later lynched by a mob of angry peasants for his perceived sins against nature. But i digress... Whereas the "display realm" is sort of like an awards ceremony for celebrities, except here, strings take the place of strung-out celebs and characters are dressed in the over-hyped rags (aka: font) of an overpaid fashion designer. But the two "realms" and two "character types" are but only a small sample of the syntactical complexity of Python strings. For we haven't even discussed the many types of string literals that Python defines.
Some include: (1) "Normal Strings" (2) r"Raw Strings" (3) b"Byte Strings" (4) u"Unicode Strings" (5) ru"Raw Unicode" (6) ur'Unicode "that is _raw_"' (7) f"Format literals" ... Whew! IMO, I think the reason why the implementation of strings has been such a tough nut to crack (Python3000 notwithstanding), is due very much to what i call a "syntactical circus". > Is an acute accent a character? No, it's a diacritic mark > that's added to a character. And i agree. Chris was arguing that zero width spaces should not be counted as characters when the `len()` function is applied to the string, with which i disagree on the basis of consistency. My first reaction is: "Why would you inject a char into a string -- even a zero-width char! -- and then expect that the char should not affect the length of the string as returned by `len`?" Being that strings (on the highest level) are merely linear arrays of chars, such an assumption defies all logic. Furthermore, the length of a string (in chars) and the "perceived" length of a string (when rendered on a screen, or printed on paper), are in no way relevant to one another. When we, as programmers, are manipulating strings (slicing, munging, concatenating, etc..) our only concern should be that _every_ char is accessible, indexable, quantifiable and will maintain its order. And whether or not a char will be visible, when rendered on a screen or paper, is irrelevant to these "programmer centric" operations. Rendering is the domain of graphic designers, not software developers. > When you're working with Unicode strings, you're not > working with strings of characters as such, but with > strings of 'codepoints', some of which are characters, > others combining marks, yet others format codes, and so on. Which is unfortunate for the programmer. Who would like to get things done without a viscous implementation mucking up the gears.
[1] Of course, even in the realms of ASCII, there are chars that cannot be inserted by the programmer _simply_ by pressing a single key on the keyboard. But most of these chars were useless anyways. So we will ignore this small detail for now. One point to mention is that Unicode greatly increased the number of useless chars. From rustompmody at gmail.com Sun Jul 16 00:33:28 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Sat, 15 Jul 2017 21:33:28 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: Message-ID: On Sunday, July 16, 2017 at 4:09:16 AM UTC+5:30, Mikhail V wrote: > On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: > > Random access to code points is as uninteresting as random access to > > UTF-8 bytes. > > I might want random access to the "Grapheme clusters, a.k.a.real > > characters". > > What _real_ characters are you referring to? > If your data has "?" (U00E1), then it is one real character, > if you have "a" (U0061) and "?" (U02CA) then it is _two_ > real characters. So in both cases you have access to code points = > real characters. Right now in an adjacent mailing list (debian) I see someone signed off with a gr?? I guess the third character is a u with some ?dirt? Whats the fourth? > > For metaphysical discussion - in _my_ definition there s/metaphysical/linguistic > is no such "real" character as "?", since it is the "a" glyph with some dirt, > so according to my definition, it should be two separate characters, > both semantically and technically seen. > > And, in my definition, the whole Unicode is a huge junkyard, to start with. > > But opinions may vary, and in case you prefer or forced to write "?", > then it can be impractical to store it as two characters, regardless of > encoding. Heck even in the English that I learnt in school we had ?gis, hom?opath etc And just now looking up: https://en.wikipedia.org/wiki/List_of_words_that_may_be_spelled_with_a_ligature I see economics is ?conomics!! 
Seriously the word "ligature" like the word "grapheme" is misleading Its not a graphical or typographic notion its an atom of the language's lexicon No Hindi speaker seeing ? + ? = ?? calls the last anything but a letter And the vowel sign ? is never first class a vowel From rantingrickjohnson at gmail.com Sun Jul 16 00:52:41 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 21:52:41 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> <854lud861l.fsf@benfinney.id.au> Message-ID: <4d03b1b1-0790-4e9d-958b-9fd787e2c9d3@googlegroups.com> On Saturday, July 15, 2017 at 9:33:49 PM UTC-5, Ben Finney wrote: > MRAB writes: [...] > > Is linefeed a character? You might call it a "control > > character", but it's not really a _character_, it's > > control/format _code_. > > And yet the ASCII and Unicode standard says code point 0x0A > (U+000A LINE FEED) is a character, by definition. Rather > than saying ?no, it's not a character?, I think a more > accurate statement would be: a linefeed *is* a character in > ASCII, but that doesn't mean every other standard must > agree. Indeed it may be better to say: a line feed is a > character and is also a control code. > > > Is an acute accent a character? > > Yes, according to Unicode. ??? (U+0301 ACUTE ACCENT) is a > character. > > > No, it's a diacritic mark that's added to a character. > > Lose the ?no?, and I agree. So you would be happy with a string containing a single character that was _decorated_ with a single accent mark (say, for instance U+00E3 (Latin Small Letter A with tilde), to return a length value of 2? Really? > It's entirely reasonable for a concept to fit in multiple > categories simultaneously. Reasonable? Perhaps... 
Practical? No way! From michele.simionato at gmail.com Sun Jul 16 00:58:15 2017 From: michele.simionato at gmail.com (Michele Simionato) Date: Sat, 15 Jul 2017 21:58:15 -0700 (PDT) Subject: Decorating coroutines Message-ID: <94739bc9-0922-4398-b99a-cb2cdea55daa@googlegroups.com> I have just released version 4.1.1 of the decorator module. The new feature is that it is possible to decorate coroutines. Here is an example of how to define a decorator `log_start_stop` that can be used to trace coroutines: $ cat x.py import time import logging from asyncio import get_event_loop, sleep, wait from decorator import decorator @decorator async def log_start_stop(coro, *args, **kwargs): logging.info('Starting %s%s', coro.__name__, args) t0 = time.time() await coro(*args, **kwargs) dt = time.time() - t0 logging.info('Ending %s%s after %d seconds', coro.__name__, args, dt) @log_start_stop async def task(n): # a do nothing task for i in range(n): await sleep(1) if __name__ == '__main__': logging.basicConfig(level=logging.INFO) tasks = [task(3), task(2), task(1)] get_event_loop().run_until_complete(wait(tasks)) This will print something like this: ~$ python3 x.py INFO:root:Starting task(1,) INFO:root:Starting task(3,) INFO:root:Starting task(2,) INFO:root:Ending task(1,) after 1 seconds INFO:root:Ending task(2,) after 2 seconds INFO:root:Ending task(3,) after 3 seconds The trouble is that at work I am forced to maintain compatibility with Python 2.7, so I do not have significant code using coroutines. If there are people out there which use a lot of coroutines and would like to decorate them, I invite you to try out the decorator module and give me some feedback if you find errors or strange behaviors. I am not aware of any issues, but one is never sure with new features. Thanks for your help, Michele Simionato From lepto.python at gmail.com Sun Jul 16 01:11:18 2017 From: lepto.python at gmail.com (oyster) Date: Sun, 16 Jul 2017 13:11:18 +0800 Subject: is @ operator popular now? 
In-Reply-To: References: Message-ID: sorry, I mean "PEP 465 - A dedicated infix operator for matrix multiplication" on https://docs.python.org/3/whatsnew/3.5.html#whatsnew-pep-465 2017-07-15 20:05 GMT+08:00 Matt Wheeler : > On Sat, 15 Jul 2017, 12:35 oyster, wrote: >> >> as the title says. has @ been used in projects? > > > Strictly speaking, @ is not an operator. > It delimits a decorator statement (in python statements and operations are > not the same thing). > However, to answer the question you actually asked, yes, all the time. > > For specific examples, see: > pytest's fixtures > contextlib.contextmanager (makes creating context managers much simpler in > most cases) > @property @classmethod etc. etc. (I sometimes see these used a bit too > freely, when a plain attribute or a function at the module level would be > more appropriate) > > -- > > -- > Matt Wheeler > http://funkyh.at From rantingrickjohnson at gmail.com Sun Jul 16 01:12:36 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 15 Jul 2017 22:12:36 -0700 (PDT) Subject: "Edit with IDLE" doesn't work any more ? In-Reply-To: References: Message-ID: <5ca06531-aa00-498e-902c-832088090e97@googlegroups.com> On Friday, April 28, 2017 at 8:23:43 AM UTC-5, Peter Otten wrote: > Stefan Ram wrote: > > > Peter Otten <__peter__ at web.de> writes: > >>one of the modules in Python's standard library IDLE will try to run with > >>your module rather than the one it actually needs. Common candidates are > >>code.py or string.py, but there are many more. > > > > I know this from Java: > > > > When you write a program > > > > ... main( final String[] args ) ... > > > > and then create a file "String.class" in the program's > > directory, the program usually will not work anymore. > > > > However, in Java one can use an absolute path as in, > > > > ... main( final java.lang.String[] args ) ... > > > > , in which case the program will still work in the > > presence of such a "String.class" file.
> > > > I wonder whether Python also might have such a kind > > of robust "absolute addressing" of a module. > > While I would welcome such a "reverse netloc" scheme or at least a "std" > toplevel package that guarantees imports from the standard library I fear > the pain is not yet big enough ;) The pain will only get more intense with time. This is an issue that Python3 should have solved when it broke so much backwards compatibility. Better to break it all at once; than again, and again, and again. From steve at pearwood.info Sun Jul 16 01:37:15 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 16 Jul 2017 05:37:15 GMT Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> Message-ID: <596afb8b$0$11093$c3e8da3@news.astraweb.com> On Sun, 16 Jul 2017 11:32:16 +1000, Chris Angelico wrote: > On Sun, Jul 16, 2017 at 11:20 AM, Rick Johnson > wrote: >> On Saturday, July 15, 2017 at 7:29:14 PM UTC-5, Chris Angelico wrote: >>> [...] Also, that doesn't deal with U+200B or U+180E, which have >>> well-defined widths *smaller* than typical Latin letters. (200B is a >>> zero-width space. Is it a character?) >> >> Of *COURSE* it's a character. >> >> Would you also consider 0 not to be a number? >> >> Sheesh! > > Exactly. That's my point. Even in a monospaced font, U+200B is a > character, yet it is by rule a zero-width character. So even in a > monospaced font, some characters must vary in width. In a *well-designed* *bug-free* monospaced font, all code points should be either zero-width or one column wide. Or two columns, if the font supports East Asian fullwidth characters. In practice, no single font is going to cover the entire range of Unicode. 
So one might hope for a *well-designed* *bug-free* FAMILY of monospaced fonts which, between them, cover the entire range, and agree on the width of a column. But even in this best of all possible situations, you can't make everyone happy, because there exist *thin spaces* which should render as a fraction of the width of a regular space. But a monospaced font can't do that: it either makes the thin space zero-width, or a full column. Monospace is by its very nature a compromise on the "natural" width of the characters. A sometimes *useful* compromise, but it cannot solve all problems. -- Steve From steve at pearwood.info Sun Jul 16 01:44:38 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 16 Jul 2017 05:44:38 GMT Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> <854lud861l.fsf@benfinney.id.au> Message-ID: <596afd45$0$11093$c3e8da3@news.astraweb.com> On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote: > And yet the ASCII and Unicode standard says code point 0x0A (U+000A LINE > FEED) is a character, by definition. [...] > > Is an acute accent a character? > > Yes, according to Unicode. ??? (U+0301 ACUTE ACCENT) is a character. Do you have references for those claims? Because I'm pretty sure that Unicode is very, very careful to never use the word "character" in a formal or normative manner, only as an informal term for "the kinds of things that regular folk consider letters or characters or similar". And I don't think regular folks would know what a line feed was if it jumped out of their computer and bit them :-) They would know what an accent is, and I doubt they would consider an accent not on a base letter to be a character. (I know I don't.) 
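One way to check what the Unicode Character Database itself records for these code points is Python's standard `unicodedata` module. The sketch below is only an illustration (the helper name `describe` is made up here); it reports the formal name, general category and combining class, and says nothing about whether the standard calls any of them a "character".

```python
import unicodedata


def describe(cp):
    """Return (name, general category, combining class) for one code point."""
    ch = chr(cp)
    # Control characters such as U+000A have no formal name, so pass a default
    # instead of letting unicodedata.name() raise ValueError.
    name = unicodedata.name(ch, "<unnamed>")
    return name, unicodedata.category(ch), unicodedata.combining(ch)


if __name__ == "__main__":
    # Line feed, combining acute, precomposed a-acute, zero-width space.
    for cp in (0x000A, 0x0301, 0x00E1, 0x200B):
        print("U+%04X" % cp, describe(cp))
```

In the database, U+0301 carries the name COMBINING ACUTE ACCENT and category Mn (nonspacing mark), while U+000A has no name at all and sits in category Cc (control).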
-- Steve From rosuav at gmail.com Sun Jul 16 01:45:12 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 15:45:12 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: Message-ID: On Sun, Jul 16, 2017 at 2:33 PM, Rustom Mody wrote: > On Sunday, July 16, 2017 at 4:09:16 AM UTC+5:30, Mikhail V wrote: >> On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: >> > Random access to code points is as uninteresting as random access to >> > UTF-8 bytes. >> > I might want random access to the "Grapheme clusters, a.k.a.real >> > characters". >> >> What _real_ characters are you referring to? >> If your data has "á" (U00E1), then it is one real character, >> if you have "a" (U0061) and "ˊ" (U02CA) then it is _two_ >> real characters. So in both cases you have access to code points = >> real characters. > > Right now in an adjacent mailing list (debian) I see someone signed off with a > > grüß > > I guess the third character is a u with some "dirt" > What's the fourth? It's a "sharp S". Tell me, is "å" an a with some 'dirt', or is it a separate character? Is "i" an ı with some dirt, or a separate letter? Oh wait, you probably think that "i" is a letter, and "ı" is the same letter but with some dirt missing. What about "p"? Is that just "d" written the wrong way up? At what point does something merit being called a different letter? ALL of these are unique characters. If you look up the alphabetization rules for Norwegian, Turkish, and English, you'll see that "å" is not "a", that "ı" is not "i", and that "p" is not "d".
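How the code point level treats such letters can be inspected with `unicodedata.normalize`. This is a small sketch with a made-up helper, `code_points`; it shows that a precomposed letter such as U+00E5 (a with ring above) has a canonical decomposition into a base letter plus a combining mark, while the Turkish dotless i (U+0131) has none. It does not settle which of them counts as a separate letter in a given alphabet; that is a question for the language's collation rules.

```python
import unicodedata


def code_points(s):
    """List the code points of s in U+XXXX notation."""
    return ["U+%04X" % ord(c) for c in s]


a_ring = "\u00e5"     # LATIN SMALL LETTER A WITH RING ABOVE
dotless_i = "\u0131"  # LATIN SMALL LETTER DOTLESS I

# NFD splits a precomposed letter into base letter + combining mark(s).
decomposed = unicodedata.normalize("NFD", a_ring)

if __name__ == "__main__":
    print(code_points(a_ring), "->", code_points(decomposed))
    # NFC puts the pieces back together again.
    print(unicodedata.normalize("NFC", decomposed) == a_ring)
    # Dotless i has no decomposition, so NFD leaves it alone.
    print(code_points(unicodedata.normalize("NFD", dotless_i)))
```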
ChrisA From rosuav at gmail.com Sun Jul 16 01:53:39 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 15:53:39 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <84c8171e-9736-400a-b411-e0dadc61e76a@googlegroups.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> <84c8171e-9736-400a-b411-e0dadc61e76a@googlegroups.com> Message-ID: On Sun, Jul 16, 2017 at 2:25 PM, Rick Johnson wrote: > But the two "realms" and two "character types" are but only a > small sample of the syntactical complexity of Python > strings. For we haven't even discussed the many types of > string literals that Python defines. Some include: > > (1) "Normal Strings" > (2) r"Raw Strings > (3) b"Byte Strings" > (4) u"Unicode Strings" > (5) ru"Raw Unicode" > (6) ur'Unicode "that is _raw_"' > (7) f"Format literals" > ... > > Whew! There are only two types of *string objects* in Python: Unicode strings and byte strings. All the above are just ways of encoding those in your source code. That's all. (And f-strings aren't really strings, but expressions.) There is only one type of *integer object* in Python, yet there are many forms of literal: * decimal - 1234 * octal - 0o2322 * hexadecimal - 0x4d2 * binary - 0b10011010010 * the above, with separation - 1_234, 0b100_1101_0010, etc None of this has anything to do with the current discussion. *ANYTHING*. Please do not introduce red herrings. > Chris was arguing that zero width spaces should not be > counted as characters when the `len()` function is applied > to the string, for which i disagree on the basis of > consistency. My first reaction is: "Why would you inject a > char into a string -- even a zero-width char! -- and then > expect that the char should not affect the length of the > string as returned by `len`?" 
Did you read my emails? I was never arguing that. > Being that strings (on the highest level) are merely linear > arrays of chars, such an assumption defies all logic. > Furthermore, the length of a string (in chars) and the > "perceived" length of a string (when rendered on a screen, > or printed on paper), are in no way relevant to one another. "chars" meaning what? We still don't have any definition of "character" here. In Python, strings are arrays of code points. > [1] Of course, even in the realms of ASCII, there are chars > that cannot be inserted by the programmer _simply_ by > pressing a single key on the keyboard. But most of these > chars were useless anyways. So we will ignore this small > detail for now. One point to mention is that Unicode > greatly increased the number of useless chars. Define "useless". ChrisA From rosuav at gmail.com Sun Jul 16 02:01:59 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 16 Jul 2017 16:01:59 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596afb8b$0$11093$c3e8da3@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> Message-ID: On Sun, Jul 16, 2017 at 3:37 PM, Steven D'Aprano wrote: > On Sun, 16 Jul 2017 11:32:16 +1000, Chris Angelico wrote: > >> Exactly. That's my point. Even in a monospaced font, U+200B is a >> character, yet it is by rule a zero-width character. So even in a >> monospaced font, some characters must vary in width. > > In a *well-designed* *bug-free* monospaced font, all code points should > be either zero-width or one column wide. Or two columns, if the font > supports East Asian fullwidth characters. > > In practice, no single font is going to cover the entire range of > Unicode. 
So one might hope for a *well-designed* *bug-free* FAMILY of > monospaced fonts which, between them, cover the entire range, and agree > on the width of a column. Hmm, I'm not sure about that. A font can be monospaced for the most part, yet respect multiple different "width groups" (eg East Asian characters all get one width, while Latin-family characters all get a different width). However, even in the idealized form you describe, you still have to cope with zero-width characters (do they get zero or do they get one column?), and characters that join together (Arabic and Korean (Hangul)). I think the Liberation Sans Mono font (family??) does a pretty good job of making most text columnate well (for instance, the narrow spaces (thin, half, third, etc) all expand to a full space), while not getting too het up about everything being exactly the same number of pixels. If monospacing is, as you say, a compromise, at least Lib Sans Mono has picked a good compromise. ChrisA From rustompmody at gmail.com Sun Jul 16 03:07:07 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Sun, 16 Jul 2017 00:07:07 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <5968b3ea$0$1611$c3e8da3$5496439d@news.astraweb.com> <87fudzuoui.fsf@elektro.pacujo.net> Message-ID: <3dbd34f7-63e7-41f7-8500-454cc20cd405@googlegroups.com> The first book I studied as a CS-student was Structured Computer Organization by Tanenbaum Apart from the detailed description of various machines like PDP-11, IBM-360 etc it suggested the understanding of the computer at 4 levels: - Microprogramming level - "Conventional" machine level (nowadays called ISA) - OS level -- where system calls become new "instructions" - HLL level of languages (like PL-1 !) 
[The next edition would add the digital abstraction level below the microprogramming level]

For me, as for many in my generation, this book and this leveled view was an important component in my understanding of CS.

A few years later I studied a course on something called "networks and networking". Again it talked of some 7 (OSI) layers. But it didn't make much sense to someone whose only idea of a network was the wire that connected the terminal to the (pretending) mainframe.

In a subsequent edition of Networking, I found that Tanenbaum had castigated the 7 OSI layers as useless and unnecessary, with the 3 TCP layers being more realistic. In still further(?) editions, he would introduce 5 layers as a hybrid between the international but failed OSI standard and the ubiquitous but incomplete TCP standard.

Why am I saying all this? A layered understanding is the bedrock of our field. Except that sometimes it works. And sometimes it doesn't.

The 3 layers here are
- UTF-8 layer
- Unicode codepoint layer
- Linguistically useful (grapheme) layer

Marko's statements like "UTF-8 is random access" are so obviously wrong that my guess is that he is not meaning it literally but elliptically, as saying: "This excessive layering is not working".

OTOH statements like "level 2 is 90% good enough for level 3" are in the same ludicrous class as "The world is as wide as the Atlantic ocean".

As pointed out above, agglutinating letters is the norm, not the exception, in the world's languages, up to and including (Latin in) English.

From tjreedy at udel.edu Sun Jul 16 03:18:54 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 16 Jul 2017 03:18:54 -0400 Subject: is @ operator popular now? In-Reply-To: References: Message-ID: On 7/15/2017 7:35 AM, oyster wrote: > as the title says. has @ been used in projects? @ was added as an operator for the benefit of numpy, which is a huge project. I am pretty sure that it is used there, but you can ask on some numpy list.
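For readers who have not met it: since Python 3.5, `a @ b` dispatches to the `__matmul__` special method (PEP 465), so any class can support it, not only numpy arrays. Here is a minimal sketch with a hypothetical `Matrix` class built on plain lists; it is meant to show the protocol, not to stand in for numpy.

```python
class Matrix:
    """Tiny illustrative matrix type; hypothetical, not part of any library."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # Called for `self @ other`; other must also be a Matrix here.
        if not isinstance(other, Matrix):
            return NotImplemented
        cols = list(zip(*other.rows))  # columns of the right-hand operand
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )

    def __repr__(self):
        return "Matrix(%r)" % (self.rows,)


if __name__ == "__main__":
    a = Matrix([[1, 2], [3, 4]])
    swap = Matrix([[0, 1], [1, 0]])
    print(a @ swap)  # multiplying by `swap` on the right exchanges the columns
```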
-- Terry Jan Reedy From marko at pacujo.net Sun Jul 16 03:55:17 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Sun, 16 Jul 2017 10:55:17 +0300 Subject: Grapheme clusters, a.k.a.real characters References: Message-ID: <87vamshl3u.fsf@elektro.pacujo.net> Mikhail V : > On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: >> Random access to code points is as uninteresting as random access to >> UTF-8 bytes. I might want random access to the "Grapheme clusters, >> a.k.a.real characters". > > What _real_ characters are you referring to? > If your data has "á" (U00E1), then it is one real character, > if you have "a" (U0061) and "ˊ" (U02CA) then it is _two_ > real characters. So in both cases you have access to code points = > real characters. It's true that confusion is caused by the ambiguity of the term "character." > For metaphysical discussion - in _my_ definition there is no such > "real" character as "á", since it is the "a" glyph with some dirt, so > according to my definition, it should be two separate characters, both > semantically and technically seen. Here's the problem: when the human user types in "á" (with one, two or three keyclicks), they don't know how the computer represents it internally. The Unicode standard allows for two *equivalent* code point sequences (<https://en.wikipedia.org/wiki/Unicode_equivalence>). When the computer outputs the sequence, the visible result is the single letter "á". The human user doesn't know, or care, about the internal representation. The user's expectation is that the visible letter "á" should behave like any other single letter. For example, a text editor should move the cursor past it with a single click of a left or right arrow key. Also, if I perform a regular-expression search in the editor and look for Alv[aá]rez I should get a match with either Alvarez or Alvárez. > And, in my definition, the whole Unicode is a huge junkyard, to start > with. I don't think anybody denies that.
However, it's the best thing available and?more importantly?a universally accepted standard. > But opinions may vary, and in case you prefer or forced to write "?", > then it can be impractical to store it as two characters, regardless > of encoding. Now I'm not following you. Marko From jbezos.dummy at gmail.com Sun Jul 16 05:26:32 2017 From: jbezos.dummy at gmail.com (Javier Bezos) Date: Sun, 16 Jul 2017 11:26:32 +0200 Subject: Connecting Google News Message-ID: Google News used to fail with the high level functions provided by httplib and the like. However, I found this piece of code somewhere: def gopen(): http = httplib.HTTPSConnection('news.google.com') http.request("GET","/news?ned=es_MX" , headers = {"User-Agent":"Mozilla/5.0 (X11; U; Linux i686; es-MX) AppleWebKit/532.8 (KHTML, like Gecko) Chrome/4.0.277.0 Safari/532.8", "Host":'news.google.com', "Accept": "*/*"}) return http.getresponse() A few days ago, Google News has been revamped and it doesn't work any more (2.6/Win7, 2.7/OSX and, with minimal changes, 3.6/Win7), because the page contents is empty. The code itself doesn't raise any errors. Which is the proper way to do it now? I must stick to the standard libraries. The returned headers are: ---------------------- [('Content-Type', 'application/binary'), ('Cache-Control', 'no-cache, no-store, max-age=0, must-revalidate'), ('Pragma', 'no-cache'), ('Expires', 'Mon, 01 Jan 1990 00:00:00 GMT'), ('Date', 'Thu, 13 Jul 2017 16:37:48 GMT'), ('Location', 'https://news.google.com/news/?ned=es_mx&hl=es'), ('Strict-Transport-Security', 'max-age=10886400'), ('P3P', 'CP="This is not a P3P policy! 
See ' 'https://support.google.com/accounts/answer/151657?hl=en for more info."'), ('Server', 'ESF'), ('Content-Length', '0'), ('X-XSS-Protection', '1; mode=block'), ('X-Frame-Options', 'SAMEORIGIN'), ('X-Content-Type-Options', 'nosniff'), ('Set-Cookie', 'NID=107=qwH7N2hB12zVGfFzrAC2CZZNhrnNAVLEmTvDvuSzzw6mSlta9D2RDZVP9t5gEcq_WJjZQjDSWklJ7LElSnAZnHsiF4CXOwvGDs2tjrXfP41LE-6LafdA86GO3sWYnfWs;Domain=.google.com;Path=/;Expires=Fri, ' '12-Jan-2018 16:37:48 GMT;HttpOnly'), ('Alt-Svc', 'quic=":443"; ma=2592000; v="39,38,37,36,35"')] ----------------------- `read()` is empty string ('' or b''). `status` is 302. `reason` is `Found`. Javier From __peter__ at web.de Sun Jul 16 06:14:15 2017 From: __peter__ at web.de (Peter Otten) Date: Sun, 16 Jul 2017 12:14:15 +0200 Subject: Connecting Google News References: Message-ID: Javier Bezos wrote: > Google News used to fail with the high level functions provided by httplib > and the like. However, I found this piece of code somewhere: > > def gopen(): > http = httplib.HTTPSConnection('news.google.com') > http.request("GET","/news?ned=es_MX" , When you change that to http.request("GET","/news/headlines?ned=es_mx&hl=es" , you get a non-empty return. Most of the actual content seems to be buried in javascript though. > headers = > {"User-Agent":"Mozilla/5.0 (X11; U; Linux i686; es-MX) > AppleWebKit/532.8 (KHTML, like Gecko) Chrome/4.0.277.0 Safari/532.8", > "Host":'news.google.com', > "Accept": "*/*"}) > return http.getresponse() > > A few days ago, Google News has been revamped and it doesn't work any more > (2.6/Win7, 2.7/OSX and, with minimal changes, 3.6/Win7), because the page > contents is empty. The code itself doesn't raise any errors. Which is the > proper way to do it now? I must stick to the standard libraries. 
From python-oren at ben-kiki.org Sun Jul 16 06:17:50 2017 From: python-oren at ben-kiki.org (Oren Ben-Kiki) Date: Sun, 16 Jul 2017 13:17:50 +0300 Subject: Difference in behavior of GenericMeta between 3.6.0 and 3.6.1 Message-ID: TL;DR: We need improved documentation of the way meta-classes behave for generic classes, and possibly reconsider the way "__setattr__" and "__getattribute__" behave for such classes. I am using meta-programming pretty heavily in one of my projects. It took me a while to figure out the dance between meta-classes and generic classes in Python 3.6.0. I couldn't find good documentation for any of this (if anyone has a good link, please share...), but with a liberal use of "print" I managed to reverse engineer how this works. The behavior isn't intuitive but I can understand the motivation (basically, "type annotations shall not change the behavior of the program"). For the uninitiated: * It turns out that there are two kinds of instances of generic classes: the "unspecialized" class (basically ignoring type parameters), and "specialized" classes (created when you write "Foo[Bar]", which know the type parameters, "Bar" in this case). * This means the meta-class "__new__" method is called sometimes to create the unspecialized class, and sometimes to create a specialized one - in the latter case, it is called with different arguments... * No object is actually an instance of the specialized class; that is, the "__class__" of an instance of "Foo[Bar]" is actually the unspecialized "Foo" (which means you can't get the type parameters by looking at an instance of a generic class). So far, so good, sort of. I implemented my meta-classes to detect whether they are creating a "specialized" or "unspecialized" class and behave accordingly. However, these meta-classes stopped working when switching to Python 3.6.1. 
The reason is that in Python 3.6.1, a "__setattr__" implementation was added to "GenericMeta", which redirects the setting of an attribute of a specialized class instance to set the attribute of the unspecialized class instance instead. This causes code such as the following (inside the meta-class) to behave in a mighty confusing way: if is-not-specialized: cls._my_attribute = False else: # Is specialized: cls._my_attribute = True assert cls._my_attribute # Fails! As you can imagine, this caused us some wailing and gnashing of teeth, until we figured out (1) that this was the problem and (2) why it was happening. Looking into the source code in "typing.py", I see that I am not the only one who had this problem. Specifically, the implementers of the "abc" module had the exact same problem. Their solution was simple: the "GenericMeta.__setattr__" code explicitly tests whether the attribute name starts with "_abc_", in which case it maintains the old behavior. Obviously, I should not patch the standard library typing.py to preserve "_my_attribute". My current workaround is to derive from GenericMeta, define my own "__setattr__", which preserves the old behavior for "_my_attribute", and use that instead of the standard GenericMeta everywhere. My code now works in both 3.6.0 and 3.6.1. However, I think the following points are worth fixing and/or discussion: * This is a breaking change, but it isn't listed in https://www.python.org/downloads/release/python-361/ - it should probably be listed there. * In general it would be good to have some documentation on the way that meta-classes and generic classes interact with each other, as part of the standard library documentation (apologies if it is there and I missed it... link?) * I'm not convinced the new behavior is a better default. I don't recall seeing a discussion about making this change, possibly I missed it (link?) * There is a legitimate need for the old behavior (normal per-instance attributes). 
For example, it is needed by the "abc" module (as well as my project). So, some mechanism should be recommended (in the documentation) for people who need the old behavior. * Separating between "really per instance" attributes and "forwarded to the unspecialized instance" attributes based on their prefix seems to violate "explicit is better than implicit". For example, it would have been explicit to say "cls.__unspecialized__.attribute" (other explicit mechanisms are possible). * Perhaps the whole notion of specialized vs. unspecialized class instances needs to be made more explicit in the GenericMeta API... * Finally and IMVHO most importantly, it is *very* confusing to override "__setattr__" and not override "__getattribute__" to match. This gives rise to code like "cls._foo = True; assert cls._foo" failing. This feels wrong.... And presumably fixing the implementation so that "__getattribute__" forwards the same set of attributes to the "unspecialized" instance wouldn't break any code... Other than code that already broken due to the new functionality, that is. From kwpolska at gmail.com Sun Jul 16 06:18:03 2017 From: kwpolska at gmail.com (Chris Warrick) Date: Sun, 16 Jul 2017 12:18:03 +0200 Subject: Connecting Google News In-Reply-To: References: Message-ID: On 16 July 2017 at 11:26, Javier Bezos wrote: > Google News used to fail with the high level functions provided by httplib > and the like. However, I found this piece of code somewhere: > > def gopen(): > http = httplib.HTTPSConnection('news.google.com') > http.request("GET","/news?ned=es_MX" , > headers = > {"User-Agent":"Mozilla/5.0 (X11; U; Linux i686; es-MX) > AppleWebKit/532.8 (KHTML, like Gecko) Chrome/4.0.277.0 Safari/532.8", > "Host":'news.google.com', > "Accept": "*/*"}) > return http.getresponse() > > A few days ago, Google News has been revamped and it doesn't work any more > (2.6/Win7, 2.7/OSX and, with minimal changes, 3.6/Win7), because the page > contents is empty. 
The code itself doesn't raise any errors. Which is the > proper way to do it now? I must stick to the standard libraries. Why? The Python standard library doesn't have anything good for HTTP. * httplib is fairly low-level, and it does not support something as basic as redirects; * urllib.request (urllib2 in Python 2) is slightly better; * but even the official docs for both redirect to requests: http://docs.python-requests.org/en/master/ for a high level interface. (Also, please upgrade your Windows box to run Python 2.7.) > The returned headers are: > > ---------------------- > [('Content-Type', 'application/binary'), > ('Cache-Control', 'no-cache, no-store, max-age=0, must-revalidate'), > ('Pragma', 'no-cache'), > ('Expires', 'Mon, 01 Jan 1990 00:00:00 GMT'), > ('Date', 'Thu, 13 Jul 2017 16:37:48 GMT'), > ('Location', 'https://news.google.com/news/?ned=es_mx&hl=es'), > ('Strict-Transport-Security', 'max-age=10886400'), > ('P3P', > 'CP="This is not a P3P policy! See ' > 'https://support.google.com/accounts/answer/151657?hl=en for more > info."'), > ('Server', 'ESF'), > ('Content-Length', '0'), > ('X-XSS-Protection', '1; mode=block'), > ('X-Frame-Options', 'SAMEORIGIN'), > ('X-Content-Type-Options', 'nosniff'), > ('Set-Cookie', > 'NID=107=qwH7N2hB12zVGfFzrAC2CZZNhrnNAVLEmTvDvuSzzw6mSlta9D2RDZVP9t5gEcq_WJjZQjDSWklJ7LElSnAZnHsiF4CXOwvGDs2tjrXfP41LE-6LafdA86GO3sWYnfWs;Domain=.google.com;Path=/;Expires=Fri, > ' > '12-Jan-2018 16:37:48 GMT;HttpOnly'), > ('Alt-Svc', 'quic=":443"; ma=2592000; v="39,38,37,36,35"')] > ----------------------- > > `read()` is empty string ('' or b''). `status` is 302. `reason` is `Found`. https://en.wikipedia.org/wiki/HTTP_302 See that Location header? The web server wants to redirect you somewhere. Your low-level HTTP library does not handle redirects automatically, so you'd need to take care of that yourself.
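If you really are limited to the standard library, following the redirect by hand could look roughly like the sketch below. It is written for Python 3's `http.client` (the Python 2 module is `httplib`); the function names `redirect_target` and `fetch` are invented for this example, and it assumes every hop is HTTPS, which I have not checked for Google News.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(current_url, status, headers):
    """Return the absolute URL to follow, or None if this is not a redirect.

    `headers` is a list of (name, value) pairs, as returned by getheaders().
    """
    if status not in REDIRECT_CODES:
        return None
    location = dict((k.lower(), v) for k, v in headers).get("location")
    if not location:
        return None
    # Location may be relative, so resolve it against the current URL.
    return urljoin(current_url, location)


def fetch(url, headers=None, max_redirects=5):
    """GET an https URL, following up to max_redirects redirects manually."""
    for _ in range(max_redirects + 1):
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn = http.client.HTTPSConnection(parts.netloc)  # assumption: https only
        try:
            conn.request("GET", path, headers=headers or {})
            resp = conn.getresponse()
            body = resp.read()
            target = redirect_target(url, resp.status, resp.getheaders())
        finally:
            conn.close()
        if target is None:
            return resp.status, body
        url = target
    raise RuntimeError("too many redirects")
```

Something like `status, body = fetch("https://news.google.com/news?ned=es_MX", {"User-Agent": "..."})` would then follow the 302 instead of stopping at the empty body. For anything beyond a quick script, `urllib.request.urlopen` already follows redirects for you.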
-- Chris Warrick PGP: 5EAAEA16 From jbezos.dummy at gmail.com Sun Jul 16 06:55:45 2017 From: jbezos.dummy at gmail.com (Javier Bezos) Date: Sun, 16 Jul 2017 12:55:45 +0200 Subject: Connecting Google News In-Reply-To: References: Message-ID: Chris, > (Also, please upgrade your Windows box to run Python 2.7.) It's not /my/ Windows box. I'm allowed to run my script, that's all. My Windows box is actually that with 3.6. >> http = httplib.HTTPSConnection('news.google.com') >> http.request("GET","/news?ned=es_MX" , >> ('Location', 'https://news.google.com/news/?ned=es_mx&hl=es'), ... > See that Location header? The web server wants to redirect you > somewhere. Your low-level HTTP library does not handle redirects > automatically, so you?d need to take care of that yourself. I didn't notice the bar just before ?ned ! I don't know how many times I've compared the URLs without realizing it was added. Silly me! Thank you Javier From jbezos.dummy at gmail.com Sun Jul 16 06:59:51 2017 From: jbezos.dummy at gmail.com (Javier Bezos) Date: Sun, 16 Jul 2017 12:59:51 +0200 Subject: Connecting Google News In-Reply-To: References: Message-ID: Peter, > http.request("GET","/news/headlines?ned=es_mx&hl=es" , Thank you. It works, too. Javier From skip.montanaro at gmail.com Sun Jul 16 09:03:10 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Sun, 16 Jul 2017 08:03:10 -0500 Subject: Connecting Google News In-Reply-To: References: Message-ID: Peter> Most of the actual content seems to be buried in javascript though. Peeking at it, almost all of the useful content appears to be data. It doesn't seem like snipping it out and interpreting it as JSON would be terribly difficult. Perhaps no JS engine required. 
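A rough sketch of that idea, using only the `json` module: scan the script text for the first place where a JSON value parses cleanly. The helper name `first_json_value` and the sample string are made up for illustration, and I am assuming the embedded data really is valid JSON, which I have not verified against the actual page.

```python
import json


def first_json_value(text, start_char="["):
    """Decode the first JSON value in text that begins with start_char.

    Returns None if no position yields valid JSON.
    """
    decoder = json.JSONDecoder()
    pos = text.find(start_char)
    while pos != -1:
        try:
            # raw_decode parses one value starting at pos and ignores
            # whatever JavaScript follows it.
            value, _end = decoder.raw_decode(text, pos)
            return value
        except ValueError:
            pos = text.find(start_char, pos + 1)
    return None


if __name__ == "__main__":
    # Invented stand-in for a script body; not real Google News markup.
    sample = 'var data = {"headlines": ["uno", "dos"], "count": 2}; render(data);'
    print(first_json_value(sample, "{"))
```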
Skip From __peter__ at web.de Sun Jul 16 09:29:22 2017 From: __peter__ at web.de (Peter Otten) Date: Sun, 16 Jul 2017 15:29:22 +0200 Subject: Difference in behavior of GenericMeta between 3.6.0 and 3.6.1 References: Message-ID: Oren Ben-Kiki wrote: > TL;DR: We need improved documentation of the way meta-classes behave for > generic classes, and possibly reconsider the way "__setattr__" and > "__getattribute__" behave for such classes. The typing module is marked as "provisional", so you probably have to live with the incompatibilities. As to your other suggestions/questions, I'm not sure where the actual discussion is taking place -- roughly since the migration to github python- dev and bugs.python.org are no longer very useful for outsiders to learn what's going on. A random walk over the github site found https://github.com/python/typing/issues/392 Maybe you can make sense of that? Personally, I'm not familiar with the evolving type system and still wondering whether I should neglect or reject... From python-oren at ben-kiki.org Sun Jul 16 09:43:45 2017 From: python-oren at ben-kiki.org (Oren Ben-Kiki) Date: Sun, 16 Jul 2017 16:43:45 +0300 Subject: Difference in behavior of GenericMeta between 3.6.0 and 3.6.1 In-Reply-To: References: Message-ID: Yes, it sort-of makes sense... I'll basically re-post my question there. Thanks for the link! Oren. On Sun, Jul 16, 2017 at 4:29 PM, Peter Otten <__peter__ at web.de> wrote: > Oren Ben-Kiki wrote: > > > TL;DR: We need improved documentation of the way meta-classes behave for > > generic classes, and possibly reconsider the way "__setattr__" and > > "__getattribute__" behave for such classes. > > The typing module is marked as "provisional", so you probably have to live > with the incompatibilities. 
> > As to your other suggestions/questions, I'm not sure where the actual > discussion is taking place -- roughly since the migration to github python- > dev and bugs.python.org are no longer very useful for outsiders to learn > what's going on. > > A random walk over the github site found > > https://github.com/python/typing/issues/392 > > Maybe you can make sense of that? > > Personally, I'm not familiar with the evolving type system and still > wondering whether I should neglect or reject... > > -- > https://mail.python.org/mailman/listinfo/python-list > From rantingrickjohnson at gmail.com Sun Jul 16 10:40:14 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sun, 16 Jul 2017 07:40:14 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87vamshl3u.fsf@elektro.pacujo.net> References: <87vamshl3u.fsf@elektro.pacujo.net> Message-ID: <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> On Sunday, July 16, 2017 at 2:55:57 AM UTC-5, Marko Rauhamaa wrote: > Mikhail V : > > On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote: > > > > > > Random access to code points is as uninteresting as > > > random access to UTF-8 bytes. I might want random access > > > to the "Grapheme clusters, a.k.a.real characters". > > > > What _real_ characters are you referring to? If your data > > has "?" (U00E1), then it is one real character, if you > > have "a" (U0061) and "?" (U02CA) then it is _two_ real > > characters. So in both cases you have access to code > > points = real characters. > > It's true that confusion is caused by the ambiguity of the > term "character." > > > For metaphysical discussion - in _my_ definition there is > > no such "real" character as "?", since it is the "a" glyph > > with some dirt, so according to my definition, it should > > be two separate characters, both semantically and > > technically seen. > > Here's the problem: when the human user types in "?" 
(with > one, two or three keyclicks), they don't know how the > computer represents it internally. The Unicode standard > allows for two *equivalent* code point sequences ( https://en.wikipedia.org/wiki/Unicode_equivalence>). When > the computer outputs the sequence, the visible result is > the single letter "?". The human user doesn't know?or > care?about the internal representation. *EXACTLY*. But your statement is far too general. Not only need not the _human_user_ be concerned with these low level aspects of strings, but the _programmer_ need not be concerned either. The programmer should only see strings from a practical standpoint: "Can i index the chars within them?" "Can i determine the length of them?" "Can i slice and dice and combine them?" "Can i trust that the character positions will maintain order?" "Can i, and my target users, display them in a human readable form using various rendering specifications defined by graphic designers (aka: font-o-philes)?" If the answer to all of these questions is *YES*, then you know all you need to know about strings. Now get to work!!! > The user's expectation is that the visible letter "?" > should behave like any other single letter. For example, a > text editor should move the cursor past it with a single > click of a left or right arrow key. Also, if I perform a > regular-expression search in the editor and look for > > Alv[a?]rez > > I should get a match with either Alvarez or Alv?rez. While what you say is relevant to _text_editors_ and sub string searching tools, you have wandered beyond the topic we are discussing here, which is practical interfacing between a programmer and his/her strings. How a text editor handles strings is irrelevant to a programmer. Unless of course we are writing a custome text editor software ourselves. In which case we can be the BDFL for a day, or two. *wink* > > And, in my definition, the whole Unicode is a huge > > junkyard, to start with. > > I don't think anybody denies that. 
> However, it's the best thing available and--more importantly--a
> universally accepted standard.
>
> > But opinions may vary, and in case you prefer or forced to
> > write "á", then it can be impractical to store it as two
> > characters, regardless of encoding.
>
> Now I'm not following you.

Mikhail is referring to the claims made earlier in this
thread that accents are themselves distinct characters.
Which i think is utter hooey. For instance, some folks here
would wish for len("á") to return 2. Does that seem
reasonable?

From rustompmody at gmail.com  Sun Jul 16 11:40:34 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Sun, 16 Jul 2017 08:40:34 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
Message-ID: <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>

On Sunday, July 16, 2017 at 8:10:41 PM UTC+5:30, Rick Johnson wrote:
> On Sunday, July 16, 2017 at 2:55:57 AM UTC-5, Marko Rauhamaa wrote:
> > Mikhail V :
> > > On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote:
> > > >
> > > > Random access to code points is as uninteresting as
> > > > random access to UTF-8 bytes. I might want random access
> > > > to the "Grapheme clusters, a.k.a.real characters".
> > >
> > > What _real_ characters are you referring to? If your data
> > > has "á" (U00E1), then it is one real character, if you
> > > have "a" (U0061) and "ˊ" (U02CA) then it is _two_ real
> > > characters. So in both cases you have access to code
> > > points = real characters.
> >
> > It's true that confusion is caused by the ambiguity of the
> > term "character."
> >
> > > For metaphysical discussion - in _my_ definition there is
> > > no such "real" character as "á", since it is the "a" glyph
> > > with some dirt, so according to my definition, it should
> > > be two separate characters, both semantically and
> > > technically seen.
> >
> > Here's the problem: when the human user types in "á" (with
> > one, two or three keyclicks), they don't know how the
> > computer represents it internally. The Unicode standard
> > allows for two *equivalent* code point sequences (<URL:
> > https://en.wikipedia.org/wiki/Unicode_equivalence>). When
> > the computer outputs the sequence, the visible result is
> > the single letter "á". The human user doesn't know--or
> > care--about the internal representation.
>
> *EXACTLY*. But your statement is far too general. Not only
> need not the _human_user_ be concerned with these low level
> aspects of strings, but the _programmer_ need not be concerned
> either. The programmer should only see strings from a
> practical standpoint:
>
> "Can i index the chars within them?"
>
> "Can i determine the length of them?"
>
> "Can i slice and dice and combine them?"
>
> "Can i trust that the character positions will maintain
> order?"
>
> "Can i, and my target users, display them in a human
> readable form using various rendering specifications defined
> by graphic designers (aka: font-o-philes)?"
>
> If the answer to all of these questions is *YES*, then you
> know all you need to know about strings. Now get to work!!!
>
> > The user's expectation is that the visible letter "á"
> > should behave like any other single letter. For example, a
> > text editor should move the cursor past it with a single
> > click of a left or right arrow key. Also, if I perform a
> > regular-expression search in the editor and look for
> >
> > Alv[aá]rez
> >
> > I should get a match with either Alvarez or Alvárez.
>
> While what you say is relevant to _text_editors_ and sub
> string searching tools, you have wandered beyond the topic
> we are discussing here, which is practical interfacing
> between a programmer and his/her strings. How a text editor
> handles strings is irrelevant to a programmer. Unless of
> course we are writing a custom text editor software
> ourselves. In which case we can be the BDFL for a day, or
> two. *wink*
>
>
> > > And, in my definition, the whole Unicode is a huge
> > > junkyard, to start with.
> >
> > I don't think anybody denies that. However, it's the best
> > thing available and--more importantly--a universally accepted
> > standard.
> >
> > > But opinions may vary, and in case you prefer or forced to
> > > write "á", then it can be impractical to store it as two
> > > characters, regardless of encoding.
> >
> > Now I'm not following you.
>
> Mikhail is referring to the claims made earlier in this
> thread that accents are themselves distinct characters.
> Which i think is utter hooey. For instance, some folks here
> would wish for len("á") to return 2. Does that seem
> reasonable?

$ python
Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> len("á")
1
>>> len("aˊ")
2

Shall we stipulate it to be 1.5? [? Maybe 1? ?]
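
The two lengths in the session above follow from the two equivalent
code point sequences mentioned earlier in the thread. A minimal sketch
using the standard unicodedata module may make that concrete. It
assumes the separate accent is U+0301 COMBINING ACUTE ACCENT (other
accent characters, such as U+02CA, do not compose), and the helper name
count_base_characters is hypothetical: it only approximates
user-perceived characters and is not an implementation of the Unicode
grapheme cluster rules.

```python
import unicodedata

# Precomposed form: one code point, U+00E1 LATIN SMALL LETTER A WITH ACUTE.
composed = "\u00e1"
# Decomposed form (assumption): "a" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "a\u0301"


def count_base_characters(text):
    """Hypothetical helper: count code points that are not combining marks.

    The text is normalized to NFC first, then every code point whose
    canonical combining class is 0 is counted. This is a rough stand-in
    for grapheme clusters and ignores emoji sequences, Hangul jamo, etc.
    """
    nfc = unicodedata.normalize("NFC", text)
    return sum(1 for ch in nfc if unicodedata.combining(ch) == 0)


print(len(composed))            # number of code points, precomposed form
print(len(decomposed))          # number of code points, decomposed form
print(composed == decomposed)   # plain comparison sees different sequences
print(unicodedata.normalize("NFC", decomposed) == composed)
print(count_base_characters("Alva\u0301rez"))
```

Normalizing both sides to the same form (NFC or NFD) before comparing
or searching is what lets "Alvarez"-style matching treat the two
spellings of the accented letter alike; len() itself keeps counting
code points either way.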
From rantingrickjohnson at gmail.com  Sun Jul 16 12:46:07 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Sun, 16 Jul 2017 09:46:07 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
 <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
Message-ID:

On Sunday, July 16, 2017 at 10:41:02 AM UTC-5, Rustom Mody wrote:
> On Sunday, July 16, 2017 at 8:10:41 PM UTC+5:30, Rick Johnson wrote:
> > On Sunday, July 16, 2017 at 2:55:57 AM UTC-5, Marko Rauhamaa wrote:
> > > Mikhail V :
> > > > On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote:

[...]

> > Mikhail is referring to the claims made earlier in this
> > thread that accents are themselves distinct characters.
> > Which i think is utter hooey. For instance, some folks
> > here would wish for len("á") to return 2. Does that seem
> > reasonable?
>
> $ python
> Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00)
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> len("á")
> 1
> >>> len("aˊ")
> 2
>
> Shall we stipulate it to be 1.5? [? Maybe 1? ?]

Well, heck. If we are to wade into the fraction weeds as it
relates to "character decorations" (aka: accents), we should
at least be realistic about it. For instance, the bounding
box of that *AHEM* "speck of dirt" (aka: accent) above the
"a" is hardly half the size of the bounding box that
contains the "a" itself. If i were to guess, i would say
something around 0.1-ish of a "real character". So if we are
to accept your implementation, `len("aˊ")` would return ~1.1.

From ben+python at benfinney.id.au  Sun Jul 16 14:12:50 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 17 Jul 2017 04:12:50 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com>
 <85bmonahhv.fsf@benfinney.id.au>
 <87a847jzsp.fsf@elektro.pacujo.net>
 <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com>
 <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com>
 <854lud861l.fsf@benfinney.id.au>
 <596afd45$0$11093$c3e8da3@news.astraweb.com>
Message-ID: <85vams6yjh.fsf@benfinney.id.au>

Steven D'Aprano writes:

> On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote:
>
> > And yet the ASCII and Unicode standard says code point 0x0A (U+000A
> > LINE FEED) is a character, by definition.
> [...]
> > > Is an acute accent a character?
> >
> > Yes, according to Unicode. “́” (U+0301 ACUTE ACCENT) is a character.
>
> Do you have references for those claims?

The Unicode Standard frequently uses “character” as the unit of semantic
value that Unicode deals in. See the “Contents” table for many
references.

In §2.2 under the sub-heading “Characters, Not Glyphs” it defines the
term, and thereafter uses “character” in a way that includes all such
units, even formatting codes.

See §2.11 “Combining Characters” for a definition that includes accent
characters like U+0301:

    Combining Characters. Characters intended to be positioned relative
    to an associated base character are depicted in the character code
    charts above, below, or through a dotted circle.

The standard even uses the term “format characters” to refer to code
points with a functional purpose and no glyph representation, such as
U+000A LINE FEED.

> Because I'm pretty sure that Unicode is very, very careful to never
> use the word "character" in a formal or normative manner, only as an
> informal term for "the kinds of things that regular folk consider
> letters or characters or similar".

I don't know whether you consider the Core Specification document to be
speaking in “formal or normative manner”. Either way that doesn't affect
my point that Unicode does define “character” and it includes all code
points in that definition.

If you're going to disqualify anything that isn't “formal and normative
manner” from what we're allowed to infer as the Unicode Standard telling
us is a character, then you're going to have to either disregard most of
the Core Specification document, or allow it as formal and/or normative.

> And I don't think regular folks would know what a line feed was if it
> jumped out of their computer and bit them :-)

Are we talking about definitions, or are we talking about what regular
folks would know?

Regular folks know that “fish” has meaning, but I wouldn't want to try
matching that regular-folk knowledge with a definition of what a “fish”
is and is not.

Quite frequently, a definition useful for a formal standard is *not*
coterminous with what regular folk will think is in or out of that
definition.

-- 
 \     “I have said to you to speak the truth is a painful thing. To |
  `\        be forced to tell lies is much worse.” --Oscar Wilde, _De |
_o__)                                                Profundis_, 1897 |
Ben Finney

From ben+python at benfinney.id.au  Sun Jul 16 14:23:45 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 17 Jul 2017 04:23:45 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com>
 <85bmonahhv.fsf@benfinney.id.au>
 <87a847jzsp.fsf@elektro.pacujo.net>
 <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com>
 <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com>
 <854lud861l.fsf@benfinney.id.au>
 <596afd45$0$11093$c3e8da3@news.astraweb.com>
 <85vams6yjh.fsf@benfinney.id.au>
Message-ID: <85pod06y1a.fsf@benfinney.id.au>

Ben Finney writes:

> Steven D'Aprano writes:
>
> > Do you have references for those claims?
>
> The Unicode Standard
> frequently uses “character”
> as the unit of semantic value that Unicode
> deals in. See the “Contents” table for many references.

I omitted to say (though it becomes clearer later in my message) that
these references are all in the Core Specification document of the
Unicode Standard, version 10.0.0.

-- 
 \      “There's a certain part of the contented majority who love |
  `\        anybody who is worth a billion dollars.” --John Kenneth |
_o__)                                         Galbraith, 1992-05-23 |
Ben Finney

From pavol.lisy at gmail.com  Sun Jul 16 16:40:04 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sun, 16 Jul 2017 22:40:04 +0200
Subject: Write this accumuator in a functional style
In-Reply-To: <5968f6b1$0$1585$c3e8da3$5496439d@news.astraweb.com>
References: <59646c01$0$11093$c3e8da3@news.astraweb.com>
 <87d196dlt5.fsf@nightsong.com>
 <87zic8pxbe.fsf@nightsong.com>
 <87pod4fz15.fsf@nightsong.com>
 <8760ewws3t.fsf@elektro.pacujo.net>
 <69047ab6-056f-44d9-a536-1a4ccc58d2d2@googlegroups.com>
 <5967705d$0$1606$c3e8da3$5496439d@news.astraweb.com>
 <5968f6b1$0$1585$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On 7/14/17, Steve D'Aprano wrote:
> On Fri, 14 Jul 2017 09:06 am, Ned Batchelder wrote:
>
>> Steve's summary is qualitatively right, but a little off on the
>> quantitative details. Lists don't resize to 2*N, they resize to ~1.125*N:
>>
>>     new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
>>
>> (https://github.com/python/cpython/blob/master/Objects/listobject.c#L49-L58)
>
> Ah, thanks for the correction. I was going off vague memories of long-ago
> discussion (perhaps even as long ago as Python 1.5!) when Tim Peters (I
> think it was) described how list overallocation worked.

You could remember it from sets:

    return set_table_resize(so, so->used>50000 ? so->used*2 : so->used*4);

(https://github.com/python/cpython/blob/master/Objects/setobject.c#L239)

From mikhailwas at gmail.com  Sun Jul 16 19:25:48 2017
From: mikhailwas at gmail.com (Mikhail V)
Date: Mon, 17 Jul 2017 01:25:48 +0200
Subject: Grapheme clusters, a.k.a.real characters
Message-ID:

>> On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote:
>>> Random access to code points is as uninteresting as random access to
>>> UTF-8 bytes. I might want random access to the "Grapheme clusters,
>>> a.k.a.real characters".
>>
>> What _real_ characters are you referring to?
>> If your data has "á" (U00E1), then it is one real character,
>> if you have "a" (U0061) and "ˊ" (U02CA) then it is _two_
>> real characters. So in both cases you have access to code points =
>> real characters.

>It's true that confusion is caused by the ambiguity of the term
>"character."

Yes, but you have said "I might want random access to the "Grapheme clusters,
a.k.a. real characters" and I had impression that you have some concrete
concept of grapheme clusters and some (generally useful) example of
implementation.
Without concrete examples it is just juggling with the terms.

>> But opinions may vary, and in case you prefer or forced to write "á",
>> then it can be impractical to store it as two characters, regardless
>> of encoding.

> Now I'm not following you.

For example, I want to type in cyrillic " ???? " (with an acute accent
to denote the stress on the last vowel, say for a pronunciation
tutorial). Most frequent solution to it would be just typing á instead
of a. And it is indeed most practical: if I use modifier acute accent
character instead, then it will be hard to select/paste such text and
it will not render accurately.

Obvious consequences we have: á is not from the cyrillic code range,
e.g. it will break hyphenation rules, and it will look consistent only
if the cyrillic font's "a" has exactly the same look as the latin "a".
Not to mention that it is not always possible to find the glyph with the
'right kind of dirt around'.

For such cases, technically better solution would be using separate
accent character to denote a stroke. In case of font issues it would
at least render as, say an apostrophe. Still in practice, just typing
"á" works better because editors and even some professional DTP software
cannot handle context-based glyph rendering well.

In other words, I think the internal representation should use
separate modifier character, despite it seems impractical from many
points of view. And it _is_ impractical in case one has such things
as "á" as frequent character in normal writing (the latter should not
be the case for adequate modern writing system though).


Mikhail

From steve+python at pearwood.info  Sun Jul 16 20:59:02 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Mon, 17 Jul 2017 10:59:02 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
 <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
Message-ID: <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>

On Mon, 17 Jul 2017 01:40 am, Rustom Mody wrote:

> On Sunday, July 16, 2017 at 8:10:41 PM UTC+5:30, Rick Johnson wrote:
[...]
> $ python
> Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00)
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> len("á")
> 1
> >>> len("aˊ")
> 2
>
> Shall we stipulate it to be 1.5? [? Maybe 1? ?]

Please don't feed the trolls. If you have to respond to Ranting Rick, at least
write something sensible that people following this thread might learn from,
instead of encouraging his nonsense.

I don't believe for a second you seriously would like len(some_string) to
return '1?', but just in case anyone is taking that proposal seriously, that
would break backwards compatibility.
len() must return an int, not a float, a complex number, or a string.

If you want to know the length of a string *in bytes*, you have to encode it to
bytes first, using some specific encoding, then call len() on those bytes.

If you want to know the length of a string *in code points*, then just call
len() on the string.

If you want to know the height or width of a string in pixels in some specific
font, see your GUI toolkit.

If you want to know the length of a string in "characters" (graphemes), well,
Python doesn't have a built-in function to do that, or a standard library
solution. Yet.


-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From mikhailwas at gmail.com  Sun Jul 16 22:28:00 2017
From: mikhailwas at gmail.com (Mikhail V)
Date: Mon, 17 Jul 2017 04:28:00 +0200
Subject: Grapheme clusters, a.k.a.real characters
Message-ID:

ChrisA wrote:

>On Sun, Jul 16, 2017 at 2:33 PM, Rustom Mody wrote:
>> Right now in an adjacent mailing list (debian) I see someone signed off with a
>>
>> grüß
>>
>> I guess the third character is a u with some ‘dirt’
>> Whats the fourth?

>It's a "sharp S".

or "Eszett", is a merge of two symbols that were used in old german texts:
"f"-like glyph and "s" glyph, i.e. sort of ligature.
Or simply, ß is a symbol that is quite similar to "B".
I would just write : gruss since it is simpler to type
and has cleaner look. "ß" is sort of deprecated, often substituted with "ss".
If I am not mistaken, this substitution is officially allowed in
many regions (what a liberality!).

>>Heck even in the English that I learnt in school we had
>>ægis, homœopath etc

Similar to the above, historical symbols. These are (should be)
deprecated due to legibility issues, roughly speaking.
OTOH good for freaking-out. Like: I was in Ægypt.
and a reader so: aaaeeeegypt

ChrisA wrote:
>Tell me, is "å" an a with some 'dirt', or is it a separate character?

From the way you are asking, it seems that you are planning some tricky
business again... Hope not to argue on terminology again, å simply
makes the text flow inconsistent, such things are parasitic for
readability regardless if someone proclaims it a separate character or
not. In a reader-oriented medium should be used only as a last resort.

Looks like "a" with a circle above, so yes, an "a" with a good deal of dirt.

>Is "i" an ı with some dirt, or a separate letter? Oh wait, you
>probably think that "i" is a letter, and "ı" is the same letter but
>with some dirt missing.

"i" is a letter, you can't just remove the dot. So there can be just
dirt and there is 'dirt' which is in fact the natural part of the
letter. Like a serif for example.
but I am not expecting your acceptance of these statements,
I am just telling what follows from my long experience with the topic.
Though you can try to replace "i" with "ı" globally in a text and there
are chances you will notice something. Then you can try also with ?.

>What about "p"? Is that just "d" written the
>wrong way up?

Sort of. The early designers did not find a better
solution than taking the rotated version of one glyph.
Are you curious about all other letters? Then probably
you should start trying to design a legible typeface.
But ideally you should try to design a typeface from scratch,
say some 20 glyphs, not just a Latin-based variation, but
truly from scratch. Then some question should become more transparent,
words are too weak in transmitting these kind of things.

>At what point does something merit being called a
>different letter?

For truly different, when the structural difference is significant,
i.e. much more significant than the difference between "ı" and "i".
Yes in Turkish both are used. And what can I say,
it's misfortunate for the users: suboptimal for legibility + non ascii typing.
But could be much worse, look at Vietnamese writings.


Mikhail

From rantingrickjohnson at gmail.com  Sun Jul 16 22:29:05 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Sun, 16 Jul 2017 19:29:05 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
 <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
 <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Sunday, July 16, 2017 at 8:28:57 PM UTC-5, Steve D'Aprano wrote:
> On Mon, 17 Jul 2017 01:40 am, Rustom Mody wrote:
>
> > On Sunday, July 16, 2017 at 8:10:41 PM UTC+5:30, Rick Johnson wrote:
> [...]
> > $ python
> > Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00)
> > [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> len("á")
> > 1
> > >>> len("aˊ")
> > 2
> >
> > Shall we stipulate it to be 1.5? [? Maybe 1? ?]
>
> If you have to respond to Ranting Rick, at least write
> something sensible that people following this thread might
> learn from, instead of encouraging his nonsense.

Oh Steven. You couldn't win with that ridiculous Toupee
Fallacy (which i plucked from your highly specular crown,
BTW [1]) so now you resort to your old trusty and rusty
tactic of the ad hominem. @_@ Why are we not surprised?

BTW, i noticed your first name is missing the trailing "n"
character. What gives? I can only assume that in a
submissive gesture towards your buddy Chris, you replaced
the "n" with a zero-width space . No? Hmm. Or perhaps you
forgot your password, again, and had no other choice but to
create a new account? Well don't be sad. :-'( In fact, cheer
up. :-) Things could be worse, ya know.
O;-)

PS: Now sod off to that gated community otherwise known as
Python-ideas, where you can hide behind your moderator's
coat-tails, and spend your time bikeshedding with the other
snobbish ilk who infest that group. These are free and open
forums here, and your fascist manipulations (no matter how
PC they may be) are not welcome here. Who the *HELL* do you
think you are, lecturing other people about who they may or
may _not_ communicate with?

[1] Yeah, i saw that "bad toupee" a mile away!

From rosuav at gmail.com  Sun Jul 16 23:10:44 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 17 Jul 2017 13:10:44 +1000
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To:
References:
Message-ID:

On Mon, Jul 17, 2017 at 12:28 PM, Mikhail V wrote:
> ChrisA wrote:
>>Tell me, is "å" an a with some 'dirt', or is it a separate character?
>
> From the way you are asking, it seems that you are planning some tricky
> business again... Hope not to argue on terminology again, å simply
> makes the text flow inconsistent, such things are parasitic for
> readability regardless if someone proclaims it a separate character or
> not. In a reader-oriented medium should be used only as a last resort.
>
> Looks like "a" with a circle above, so yes, an "a" with a good deal of dirt.

Norwegian people might take issue with that. It's not "a with circle
above", it's the distinct letter å (pronounced as per the sound the
letter represents, approximately "aw").

>>Is "i" an ı with some dirt, or a separate letter? Oh wait, you
>>probably think that "i" is a letter, and "ı" is the same letter but
>>with some dirt missing.
>
> "i" is a letter, you can't just remove the dot. So there can be just
> dirt and there is
> 'dirt' which is in fact the natural part of the letter. Like a serif
> for example.
> but I am not expecting your acceptance of these statements,
> I am just telling what follows from my long experience with the topic.
> Though you can try to replace "i" with "ı"
> globally in a text and there
> are chances you will notice something. Then you can try also with ?.

Yep! Nobody would take any notice of the fact that you just put dots
on all those letters. It's not like it's going to make any difference
to anything. We're not dealing with matters of life and death here.

Oh wait.

https://www.theinquirer.net/inquirer/news/1017243/cellphone-localisation-glitch

I'll leave you with that thought.

ChrisA

From rustompmody at gmail.com  Mon Jul 17 00:10:37 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Sun, 16 Jul 2017 21:10:37 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
 <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
 <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>

On Monday, July 17, 2017 at 6:58:57 AM UTC+5:30, Steve D'Aprano wrote:
> On Mon, 17 Jul 2017 01:40 am, Rustom Mody wrote:
>
> > On Sunday, July 16, 2017 at 8:10:41 PM UTC+5:30, Rick Johnson wrote:
> [...]
> > $ python
> > Python 3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00)
> > [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> len("á")
> > 1
> > >>> len("aˊ")
> > 2
> >
> > Shall we stipulate it to be 1.5? [? Maybe 1? ?]
>
> Please don't feed the trolls.

It's usually called 'joke' Steven! Did the word fall out of your dictionary in
the last upgrade?
Rick was no more trolling than Marko or you or Chris or Mikhail or anyone else

If anyone's trolling it's me? len("aˊ") == 1? is so obviously nonsense on so many
levels I did not think
"And now ladies (are there any?) and gentlemen I am going to tell a joke!"
would be necessary

On a more serious note every other post on this (as on many discussing unicode
more broadly) is so ridiculously Euro (or Anglo) centric I would not know where
to begin.
Witness your own?

> If you have to respond to Ranting Rick, at least
> write something sensible that people following this thread might learn from,
> instead of encouraging his nonsense.
>
> I don't believe for a second you seriously would like len(some_string) to
> return '1?', but just in case anyone is taking that proposal seriously, that
> would break backwards compatibility. len() must return an int, not a float, a
> complex number, or a string.
>
> If you want to know the length of a string *in bytes*, you have to encode it to
> bytes first, using some specific encoding, then call len() on those bytes.
>
> If you want to know the length of a string *in code points*, then just call
> len() on the string.
>
> If you want to know the height or width of a string in pixels in some specific
> font, see your GUI toolkit.
>
> If you want to know the length of a string in "characters" (graphemes), well,
> Python doesn't have a built-in function to do that, or a standard library
> solution. Yet.

You've given 4 ifs. An L-language speaker would assume that the atomic units of
language-L would be supported.
Your 4th if suggests that's ok. Is it?

Hint1: Ask your grandmother whether unicode's notion of character makes sense.
Ask 10 gmas from 10 language-L's
Hint2: When in doubt gma usually is right

PS Claims such as Euro (or some other) centricism usually imply a
corresponding call for "rights" "equality" etc
No such politically correct call is being made or implied (by me)
There never was equality in the world; there never will be

From rosuav at gmail.com  Mon Jul 17 00:38:30 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Mon, 17 Jul 2017 14:38:30 +1000
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
 <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
 <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>
 <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
Message-ID:

On Mon, Jul 17, 2017 at 2:10 PM, Rustom Mody wrote:
> Hint1: Ask your grandmother whether unicode's notion of character makes sense.
> Ask 10 gmas from 10 language-L's
> Hint2: When in doubt gma usually is right

Often, but definitely not always. For instance, your grandmother
probably wouldn't think of "newline" as a character. Quite possibly
wouldn't count space, either. On the other hand, I'm pretty sure my
grandmothers would have counted Sherlock Holmes as a character.

ChrisA

From dieter at handshake.de  Mon Jul 17 00:40:50 2017
From: dieter at handshake.de (dieter)
Date: Mon, 17 Jul 2017 06:40:50 +0200
Subject: Connecting Google News
References:
Message-ID: <87zic3wu99.fsf@handshake.de>

Javier Bezos writes:

> Google News used to fail with the high level functions provided by
> httplib and the like. However, I found this piece of code somewhere:
> ...
> A few days ago, Google News has been revamped and it doesn't work any
> more (2.6/Win7, 2.7/OSX and, with minimal changes, 3.6/Win7), because
> the page contents is empty. The code itself doesn't raise any
> errors. Which is the proper way to do it now?
> I must stick to the
> standard libraries.
>
> The returned headers are:
>
> ----------------------
> [('Content-Type', 'application/binary'),
> ...
> ('Location', 'https://news.google.com/news/?ned=es_mx&hl=es'),
> ...
>
> `status` is 302.

`status == 302` means a redirect; "Location" gives the new url
(to be redirected to).

From nad at python.org  Mon Jul 17 01:50:33 2017
From: nad at python.org (Ned Deily)
Date: Mon, 17 Jul 2017 01:50:33 -0400
Subject: [RELEASE] Python 3.6.2 is now available
Message-ID:

On behalf of the Python development community and the Python 3.6
release team, I am happy to announce the availability of Python 3.6.2,
the second maintenance release of Python 3.6. 3.6.0 was released on
2016-12-22 to great interest and we are now providing the second set
of bugfixes and documentation updates for it; the first maintenance
release, 3.6.1, was released on 2017-03-31.

Detailed information about the changes made in 3.6.2 can be found in
the change log here:

    https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-2

Please see "What's New In Python 3.6" for more information about the
new features in Python 3.6:

    https://docs.python.org/3.6/whatsnew/3.6.html

You can download Python 3.6.2 here:

    https://www.python.org/downloads/release/python-362/

The next maintenance release of Python 3.6 is expected to follow in
about 3 months, around the end of 2017-09. More information about the
3.6 release schedule can be found here:

    https://www.python.org/dev/peps/pep-0494/

Enjoy!

P.S.
If you need to download the documentation set for 3.6.2
immediately, you can always find the release version here:

    https://docs.python.org/release/3.6.2/download.html

The most current updated versions will appear here:

    https://docs.python.org/3.6/

--
  Ned Deily
  nad at python.org -- []

From rustompmody at gmail.com  Mon Jul 17 03:07:20 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Mon, 17 Jul 2017 00:07:20 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
References: <87vamshl3u.fsf@elektro.pacujo.net>
 <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com>
 <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com>
 <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com>
 <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
Message-ID: <6d51a1a6-10c5-4c3a-9d4c-5011b4f8f649@googlegroups.com>

On Monday, July 17, 2017 at 9:41:51 AM UTC+5:30, Rustom Mody wrote:

> On a more serious note every other post on this (as on many discussing unicode
> more broadly) is so ridiculously Euro (or Anglo) centric I would not know where
> to begin.
> Witness your own?

> Hint1: Ask your grandmother whether unicode's notion of character makes sense.
> Ask 10 gmas from 10 language-L's
> Hint2: When in doubt gma usually is right

To be fair I notice now your subject line "aka real characters"
Which suggests that you understand that the gma view may have more validity than
the ½-assed ones (currently) supported by python/unicode

From marko at pacujo.net  Mon Jul 17 03:09:02 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Mon, 17 Jul 2017 10:09:02 +0300
Subject: Grapheme clusters, a.k.a.real characters
References:
Message-ID: <87eftffskx.fsf@elektro.pacujo.net>

Mikhail V :

>>> On Sat, 15 Jul 2017 05:50 pm, Marko Rauhamaa wrote:
>>It's true that confusion is caused by the ambiguity of the term
>>"character."
>
> Yes, but you have said "I might want random access to the "Grapheme clusters,
> a.k.a.
real characters" and I had impression that you have some concrete > concept of grapheme clusters and some (generally useful) example of > implementation. > Without concrete examples it is just juggling with the terms. What did you think of my concrete examples, then? (Say, finding "Alv?rez" with the regular expression "Alv[a?]rez".) > For example, I want to type in cyrillic " ???? " (with an acute accent > to denote the stress on the last vowel, say for a pronunciation > tutorial). Most frequent solution to it would be just typing ? instead > of a. And it is indeed most pratical: if I use modifier acute accent > character instead, then it will be hard to select/paste such text and > it will not render accurately. Thing is, neither you (the user) nor you (the Python programmer) gets to decide how "?" is represented in Unicode. That decision may be made by other programmers (the terminal emulator, the file system or the text editor). Still, everything should be transparent to both you (the user) and you (the Python programmer). Marko
From evanbenadler at gmail.com Mon Jul 17 10:38:53 2017 From: evanbenadler at gmail.com (Evan Adler) Date: Mon, 17 Jul 2017 10:38:53 -0400 Subject: Is this PEP viable? Message-ID: I would like to submit the following proposal. In the logging module, I would like handlers (like file handlers and stream handlers) to have a field for exc_info printing. This way, a call to logger.exception() will write the stack trace to the handlers with this flag set, and only print the message and other info to handlers without the flag set. This allows a single logger to write to a less detailed console output, a less detailed run log, and a more detailed error log. From breamoreboy at gmail.com Mon Jul 17 10:58:22 2017 From: breamoreboy at gmail.com (breamoreboy at gmail.com) Date: Mon, 17 Jul 2017 07:58:22 -0700 (PDT) Subject: Is this PEP viable? In-Reply-To: References: Message-ID: <80bc8b7c-534a-4465-b022-2558a66e418c@googlegroups.com> On Monday, July 17, 2017 at 3:41:12 PM UTC+1, Evan Adler wrote: > I would like to submit the following proposal.
In the logging module, I > would like handlers (like file handlers and stream handlers) to have a > field for exc_info printing. This way, a call to logger.exception() will > write the stack trace to the handlers with this flag set, and only print > the message and other info to handlers without the flag set. This allows a > single logger to write to a less detailed console output, a less detailed > run log, and a more detailed error log. What PEP? If you had it as an attachment it won't get through. Kindest regards. Mark Lawrence. From steve+python at pearwood.info Mon Jul 17 11:36:03 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Tue, 18 Jul 2017 01:36:03 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> Message-ID: <596cd965$0$1584$c3e8da3$5496439d@news.astraweb.com> On Mon, 17 Jul 2017 02:10 pm, Rustom Mody wrote: >> Please don't feed the trolls. > > Its usually called 'joke' Steven! Did the word fall out of your dictionary > in the last upgrade? > Rick was no more trolling than Marko Funny you say that. I often think Marko is trolling, but if he is, he does a good job of leaving me in just enough doubt that I'm willing to continue the discussion. As for Rick, I can't tell if he's merely trolling to get a reaction, or he really does believe the crap he spouts off in most of his posts. I'm not sure which would be worse. > or you or Chris or Mikhail or anyone else > If anyone's trolling its me? len("a?") == 1? is so obviously nonsense on so > many levels I did not think > "And now ladies (are there any?) and gentlemen I am going to tell a joke!" > would be necessary And it wouldn't have been necessary, if we didn't have Ranting Rick here to take your proposal seriously. 
> On a more serious note every other post on this (as on many discussing unicode > more broadly) is so ridiculously Euro (or Anglo) centric I would not know > where to begin. I'm always willing to learn. How am I Euro, or Anglo, centric? > Witness your own? [...] > You've given 4 ifs. Actually I gave five "ifs", plus one other conditional phrase which could have been re-worded as an "if". > An L-language may would assume that the atomic units of language-L would > be supported. Your 4th if suggests thats ok. Is it? Please pardon me for being Anglo-centric, but what's an L-language? People make lots of bad assumptions. For example, they assume that computer arithmetic must follow the same mathematical rules of associativity, commutativity and distributivity that they learned about the Real number system in high school. That assumption is wrong. People assume that the atomic units of language are a simple thing to define, and having defined them, support them in programming languages. That assumption is also wrong. People assume all sorts of falsehoods about programming, and language. So to answer your question, no, it is not okay to assume that the "atomic units of language" (whatever they are) are supported. I don't think that it is even a given that "atomic units of language" exist. To quote a Hindi speaker earlier in this thread, ?? is a letter, and yet it can be decomposed into ?? = ? + ?, so it isn't "atomic". If letters aren't atomic, then what are? So if the "atomic units of language" (letters?) have "subatomic parts", where does that leave us programmers? Shouldn't we be able to manipulate text at the subatomic level? > Hint1: Ask your grandmother whether unicode's notion of character makes sense. What on earth makes you think that my grandmother is a valid judge of whether Unicode makes sense or not? 
She made some mighty fine chicken soup, and her coffee scroll cake was to
die for, but I wouldn't want to ask her to fix my car, perform brain
surgery, solve a differential equation, or judge the merits of a technical
standard like Unicode.

Her English wasn't that great, her Russian was more of a country-bumpkin
dialect than Standard Russian, and it was mixed in with a lot of Estonian
and Polish as well, and she had *absolutely zero* knowledge of different
language systems like Chinese ideographs, Arabic, Hindi, etc. Nor did she
know anything about the legacy encodings of the 1980s and 90s.

How could she possibly be expected to judge Unicode? She never even
handled a computer in her life, let alone programmed one. How could she
judge the complex balancing act between competing requirements that go
into Unicode?

It's really sad to see somebody who I thought was educated espousing the
view that knowledge and education aren't needed to judge complex technical
questions, only common sense[1]. Experts? Who needs 'em?

> Ask 10 gmas from 10 language-L's
> Hint2: When in doubt gma usually is right

Would you let your grandmother perform brain surgery on someone you cared
for? Well, maybe, if she actually was a brain surgeon. But if not?

[1] http://i.imgur.com/jgmwz1q.jpg

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
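
The decomposition point above (one visible letter, more than one code point) can be sketched with the standard-library unicodedata module. This is only an illustration under stated assumptions: the sample letter "é" and the helper name `code_point_names` are chosen here and do not come from the thread.

```python
import unicodedata


def code_point_names(text):
    """Return the Unicode name of every code point in text (hypothetical helper)."""
    return [unicodedata.name(ch) for ch in text]


# "é" written as a single precomposed code point (U+00E9).
composed = "\u00e9"
# The same letter after canonical decomposition: "e" + combining acute accent.
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed), code_point_names(composed))
print(len(decomposed), code_point_names(decomposed))

# The two strings render alike but compare unequal until normalized.
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

len() counts code points, not what a reader would call letters, so the two spellings of the same letter report different lengths; normalizing both sides to one form before comparing is the usual way around that.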
From rosuav at gmail.com Mon Jul 17 12:04:04 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 18 Jul 2017 02:04:04 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596cd965$0$1584$c3e8da3$5496439d@news.astraweb.com> References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <596cd965$0$1584$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Tue, Jul 18, 2017 at 1:36 AM, Steve D'Aprano wrote: > On Mon, 17 Jul 2017 02:10 pm, Rustom Mody wrote: >> Hint1: Ask your grandmother whether unicode's notion of character makes sense. > > What on earth makes you think that my grandmother is a valid judge of whether > Unicode makes sense or not? > > She made some mighty fine chicken soup, and her coffee scroll cake was to die > for, but I wouldn't want to ask her to fix my car, perform brain surgery, solve > a differential equation, or judge the merits of a technical standard like > Unicode. > > Her English wasn't that great, her Russian was more of a country-bumpkin dialect > than Standard Russian, and it was mixed in with a lot of Estonian and Polish as > well, and she had *absolutely zero* knowledge of different language systems > like Chinese ideographs, Arabic, Hindi, etc. Nor did she know anything about > the legacy encodings of the 1980s and 90s. > > How could she possibly be expected to judge Unicode? She never even handled a > computer in her life, let alone program one. How could she judge the complex > balancing act between competing requirements that go into Unicode? I think the point here is not about judging Unicode, but defining a character. 
If I were to ask either of my (late) grandmothers what a character is, aside from being told that I am myself quite a character, I'd probably get a reasonably sane response for text in English, Italian, or Dutch. With the possible exception that "ij" might be considered a single letter in Dutch. Except when it isn't. But neither of them is qualified to say whether ? and ? are the same letter or not, as both of them would think they were badly written upper-case N. Nor would I ask either of them whether ? is one character or two. The "ask your grandmother" technique is great for questions of UI within her area of skill, but that's about it. ChrisA From __peter__ at web.de Mon Jul 17 12:24:03 2017 From: __peter__ at web.de (Peter Otten) Date: Mon, 17 Jul 2017 18:24:03 +0200 Subject: Is this PEP viable? References: Message-ID: Evan Adler wrote: > I would like to submit the following proposal. In the logging module, I > would like handlers (like file handlers and stream handlers) to have a > field for exc_info printing. This way, a call to logger.exception() will > write the stack trace to the handlers with this flag set, and only print > the message and other info to handlers without the flag set. This allows a > single logger to write to a less detailed console output, a less detailed > run log, and a more detailed error log. If I understand you correctly this would go into the Formatter rather than the Handler. E. 
g.:

$ cat log_exception_format.py
import logging
import sys

class MyFormatter(logging.Formatter):
    def __init__(self, fmt=None, datefmt=None, style='%', verbose=0):
        super().__init__(fmt, datefmt, style)
        self.verbose = verbose

    def formatException(self, ei):
        if self.verbose < 1:
            return ""
        elif self.verbose < 2:
            return "{0[0].__name__}: {0[1]}".format(ei)
        else:
            return super().formatException(ei)


formatter = MyFormatter(logging.BASIC_FORMAT, verbose=sys.argv.count("-v"))
handler = logging.StreamHandler()
handler.setFormatter(formatter)
g = logging.getLogger()
g.addHandler(handler)


def f(n):
    if n > 0:
        return f(n-1)
    else:
        1/0
try:
    f(3)
except:
    g.exception("foo")
$ python3 log_exception_format.py
ERROR:root:foo
$ python3 log_exception_format.py -v
ERROR:root:foo
ZeroDivisionError: division by zero
$ python3 log_exception_format.py -v -v
ERROR:root:foo
Traceback (most recent call last):
  File "log_exception_format.py", line 31, in <module>
    f(3)
  File "log_exception_format.py", line 27, in f
    return f(n-1)
  File "log_exception_format.py", line 27, in f
    return f(n-1)
  File "log_exception_format.py", line 27, in f
    return f(n-1)
  File "log_exception_format.py", line 29, in f
    1/0
ZeroDivisionError: division by zero
$

(Note that this is just a sketch; for the above to work reliably the
format() method has to be changed to avoid caching the result of the
formatException() call)

From rhodri at kynesim.co.uk Mon Jul 17 12:43:26 2017
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Mon, 17 Jul 2017 17:43:26 +0100
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
Message-ID:

On 17/07/17 05:10, Rustom Mody wrote:
> Hint1: Ask your grandmother whether unicode's notion
of character makes sense. > Ask 10 gmas from 10 language-L's > Hint2: When in doubt gma usually is right "For every complex problem there is an answer that is clear, simple and wrong." (H.L. Mencken). Unfortunately grandmothers outside their areas of expertise are particularly prone to finding those answers. -- Rhodri James *-* Kynesim Ltd From steve+python at pearwood.info Mon Jul 17 12:57:50 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Tue, 18 Jul 2017 02:57:50 +1000 Subject: Users of namedtuple: do you use the _source attribute? Message-ID: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> collections.namedtuple generates a new class using exec, and records the source code for the class as a _source attribute. Although it has a leading underscore, it is actually a public attribute. The leading underscore distinguishes it from a named field potentially called "source", e.g. namedtuple("klass", ['source', 'destination']). There is some discussion on Python-Dev about: - changing the way the namedtuple class is generated which may change the _source attribute - or even dropping it altogether in order to speed up namedtuple and reduce Python's startup time. Is there anyone here who uses the namedtuple _source attribute? My own tests suggest that changing from the current implementation to one similar to this recipe here: https://code.activestate.com/recipes/578918-yet-another-namedtuple/ which only uses exec to generate the __new__ method, not the entire class, has the potential to speed up namedtuple by a factor of four. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From D.Strohl at F5.com Mon Jul 17 13:56:37 2017 From: D.Strohl at F5.com (Dan Strohl) Date: Mon, 17 Jul 2017 17:56:37 +0000 Subject: Users of namedtuple: do you use the _source attribute? 
In-Reply-To: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> Message-ID: <39d41b30aa364e27a4e9396011a8d573@F5.com> I have never used it personally. It always looked interesting, but I never ran into a need to generate the source for it. -----Original Message----- From: Python-list [mailto:python-list-bounces+d.strohl=f5.com at python.org] On Behalf Of Steve D'Aprano Sent: Monday, July 17, 2017 9:58 AM To: python-list at python.org Subject: Users of namedtuple: do you use the _source attribute? collections.namedtuple generates a new class using exec, and records the source code for the class as a _source attribute. Although it has a leading underscore, it is actually a public attribute. The leading underscore distinguishes it from a named field potentially called "source", e.g. namedtuple("klass", ['source', 'destination']). There is some discussion on Python-Dev about: - changing the way the namedtuple class is generated which may change the _source attribute - or even dropping it altogether in order to speed up namedtuple and reduce Python's startup time. Is there anyone here who uses the namedtuple _source attribute? My own tests suggest that changing from the current implementation to one similar to this recipe here: https://code.activestate.com/recipes/578918-yet-another-namedtuple/ which only uses exec to generate the __new__ method, not the entire class, has the potential to speed up namedtuple by a factor of four. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
-- https://mail.python.org/mailman/listinfo/python-list From jladasky at itu.edu Mon Jul 17 15:33:58 2017 From: jladasky at itu.edu (jladasky at itu.edu) Date: Mon, 17 Jul 2017 12:33:58 -0700 (PDT) Subject: [RELEASE] Python 3.6.2 is now available In-Reply-To: References: <4dde4f6e-f651-4fab-9726-b73cdbaa38bd@googlegroups.com> Message-ID: On Monday, July 17, 2017 at 3:02:01 AM UTC-7, bream... at gmail.com wrote: > On Monday, July 17, 2017 at 10:41:02 AM UTC+1, wxjm... at gmail.com wrote: > > Poor Python. > > Once it was working. > > Dear RUE, > > A bad workman always blames his tools. > > Mark Lawrence. +1. From rgaddi at highlandtechnology.invalid Mon Jul 17 15:44:09 2017 From: rgaddi at highlandtechnology.invalid (Rob Gaddi) Date: Mon, 17 Jul 2017 12:44:09 -0700 Subject: Users of namedtuple: do you use the _source attribute? In-Reply-To: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 07/17/2017 09:57 AM, Steve D'Aprano wrote: > collections.namedtuple generates a new class using exec, and records the source > code for the class as a _source attribute. > > Although it has a leading underscore, it is actually a public attribute. The > leading underscore distinguishes it from a named field potentially > called "source", e.g. namedtuple("klass", ['source', 'destination']). > > > There is some discussion on Python-Dev about: > > - changing the way the namedtuple class is generated which may > change the _source attribute > > - or even dropping it altogether > > in order to speed up namedtuple and reduce Python's startup time. > > > Is there anyone here who uses the namedtuple _source attribute? 
> > My own tests suggest that changing from the current implementation to one > similar to this recipe here: > > https://code.activestate.com/recipes/578918-yet-another-namedtuple/ > > which only uses exec to generate the __new__ method, not the entire class, has > the potential to speed up namedtuple by a factor of four. > I use namedtuple a lot, and never even HEARD of _source. That said, it sure feels (as someone who hasn't tried it) like there's a straightforward namedtuple implementation that calls type() directly rather than having to exec. I know that exec-gunshyness is overblown, but is there a simple answer as to why it's necessary here? -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix. From aaron.m.weisberg at gmail.com Mon Jul 17 16:10:19 2017 From: aaron.m.weisberg at gmail.com (aaron.m.weisberg at gmail.com) Date: Mon, 17 Jul 2017 13:10:19 -0700 (PDT) Subject: Combining every pair of list items and creating a new list. Message-ID: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com> Hi, I'm having difficulty thinking about how to do this as a Python beginner. But I have a list that is represented as: [1,2,3,4,5,6,7,8] and I would like the following results: [1,2] [3,4] [5,6] [7,8] Any ideas? Thanks From ben+python at benfinney.id.au Mon Jul 17 16:12:48 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 18 Jul 2017 06:12:48 +1000 Subject: Users of namedtuple: do you use the _source attribute? References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> Message-ID: <85d18y7rgf.fsf@benfinney.id.au> Steve D'Aprano writes: > collections.namedtuple generates a new class using exec, and records > the source code for the class as a _source attribute. The documentation tells me that ?_source? is ?New in version 3.3.? I wasn't aware that the ?namedtuple? 
interface had changed since it was introduced, so: > Is there anyone here who uses the namedtuple _source attribute? Not me. -- \ ?Anyone who believes exponential growth can go on forever in a | `\ finite world is either a madman or an economist.? ?Kenneth | _o__) Boulding, 1973 | Ben Finney From akprasad at gmail.com Mon Jul 17 16:22:17 2017 From: akprasad at gmail.com (akprasad at gmail.com) Date: Mon, 17 Jul 2017 13:22:17 -0700 (PDT) Subject: brew pip: "ImportError: No module named packaging.version" Message-ID: <0f4b4825-da9a-40a5-aa20-193336b787d3@googlegroups.com> Hiya. I'm running El Capitan and have a Homebrew install of python (as well as one in /usr/bin/python, which I can't recall how I installed). I had some trouble pip installing Keras: $ sudo pip install keras ? DEPRECATION: Uninstalling a distutils installed project (numpy) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project. The recommended solution is to make sure I'm using brew python: $ brew link --overwrite python. Linking /usr/local/Cellar/python/2.7.13... 39 symlinks created But now pip (/usr/local/Cellar/python/2.7.13/bin/pip) does not work: $ pip -V Traceback (most recent call last): File "/usr/local/bin/pip", line 6, in from pkg_resources import load_entry_point File "/Users/aditpras/Library/Python/2.7/lib/python/site-packages/pkg_resources/__init__.py", line 70, in import packaging.version ImportError: No module named packaging.version (Not only that, but the usual instructions for reinstalling pip, such as "proxy python -m pip install -U pip" and "python get-pip.py" tell me "Requirement already up-to-date" without fixing anything.) I eventually realized that easy_install'ing pip gives me a working pip again. The one in Cellar still does not work. To install Keras, I apparently have to do: $ sudo pip install keras --ignore-installed numpy Any tips? 
(I realize I can just use pyenv or virtualenv). From 36rahu at gmail.com Mon Jul 17 16:27:16 2017 From: 36rahu at gmail.com (Rahul K P) Date: Tue, 18 Jul 2017 01:57:16 +0530 Subject: Combining every pair of list items and creating a new list. In-Reply-To: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com> References: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com> Message-ID: You can use a simple logic and list comprehension. so it will be like this lst = [1, 2, 3, 4, 5, 6, 7, 8] print [lst[i:i+2] for i in range(0,len(lst),2)] Here 2 is the pairing number, You can set is as your need. On Tue, Jul 18, 2017 at 1:40 AM, wrote: > Hi, > > I'm having difficulty thinking about how to do this as a Python beginner. > > But I have a list that is represented as: > > [1,2,3,4,5,6,7,8] > > and I would like the following results: > > [1,2] [3,4] [5,6] [7,8] > > Any ideas? > > Thanks > -- > https://mail.python.org/mailman/listinfo/python-list > -- Regards *Rahul K P* Python Developer Mumbai +919895980223 From python at mrabarnett.plus.com Mon Jul 17 16:42:18 2017 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 17 Jul 2017 21:42:18 +0100 Subject: Combining every pair of list items and creating a new list. In-Reply-To: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com> References: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com> Message-ID: <66da91ff-309b-7c15-bc19-4ad35a1edd58@mrabarnett.plus.com> On 2017-07-17 21:10, aaron.m.weisberg at gmail.com wrote: > Hi, > > I'm having difficulty thinking about how to do this as a Python beginner. > > But I have a list that is represented as: > > [1,2,3,4,5,6,7,8] > > and I would like the following results: > > [1,2] [3,4] [5,6] [7,8] > > Any ideas? > > Thanks > Those are slices of the original list. You can do it using a 'for' loop over a 'range' with a step/stride of 2 and getting slices of the original list. 
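
The approach described above, a range with a step of 2 plus slices of the original list, might look like this in Python 3. This is a minimal sketch: the helper name `chunk_pairs` and its `size` parameter are hypothetical, introduced only for illustration.

```python
def chunk_pairs(items, size=2):
    """Split items into consecutive slices of length size (hypothetical helper)."""
    chunks = []
    # Step through the start index of each slice: 0, 2, 4, ...
    for start in range(0, len(items), size):
        chunks.append(items[start:start + size])
    return chunks


lst = [1, 2, 3, 4, 5, 6, 7, 8]
print(chunk_pairs(lst))  # [[1, 2], [3, 4], [5, 6], [7, 8]]
```

Because slicing never raises on a short tail, a list with an odd number of items simply yields a final chunk with one element.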
From mikhailwas at gmail.com Mon Jul 17 18:01:29 2017
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 18 Jul 2017 00:01:29 +0200
Subject: Grapheme clusters, a.k.a.real characters
Message-ID:

ChrisA wrote:

>Yep! Nobody would take any notice of the fact that you just put dots
>on all those letters. It's not like it's going to make any difference
>to anything. We're not dealing with matters of life and death here.
>Oh wait.
>https://www.theinquirer.net/inquirer/news/1017243/cellphone-localisation-glitch
>I'll leave you with that thought.

For Turkish and Slavic languages there is actually a demand for at least
one Yeru letter to distinguish the common i and Yeru. In Cyrillic it is
"ы". It should be romanized as "y". And the Yot /j/ should be romanized
as "j".

I.e. for Turkish:
yazım - should be : jazym
For Russian:
ярлык - should be : jarlyk

Simple, ASCII input, no ambiguity. How many exercises in futility could be
avoided...

And just in case it's still not clear: this is not solved by adding dirt
around the letter: if there is enough significance of the phoneme
distinction then one should add a distinct letter for a syntax in
question. And not like: well it is not so significant then we'll add a bit
of dirt, it is more significant - we add some more dirt. It is not how the
textual representation is made efficient.
Mikhail From greg.ewing at canterbury.ac.nz Mon Jul 17 18:32:25 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Tue, 18 Jul 2017 10:32:25 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596cd965$0$1584$c3e8da3$5496439d@news.astraweb.com> References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <596cd965$0$1584$c3e8da3$5496439d@news.astraweb.com> Message-ID: Steve D'Aprano wrote: > I don't think that it is even a given that "atomic units of language" exist. To > quote a Hindi speaker earlier in this thread, ?? is a letter, and yet it can be > decomposed into ?? = ? + ?, so it isn't "atomic". If letters aren't atomic, > then what are? They're like subatomic particles! All equally fundamental, but they can turn into each other. -- Greg From greg.ewing at canterbury.ac.nz Mon Jul 17 18:41:29 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Tue, 18 Jul 2017 10:41:29 +1200 Subject: Users of namedtuple: do you use the _source attribute? In-Reply-To: References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <85d18y7rgf.fsf@benfinney.id.au> Message-ID: > Steve D'Aprano writes: >>Is there anyone here who uses the namedtuple _source attribute? I didn't know it existed either, and if I did I would have assumed it was an implementation detail and would never have written code that relied on it. I certainly won't miss it if it disapppears. -- Greg From ethan at stoneleaf.us Mon Jul 17 19:37:57 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Jul 2017 16:37:57 -0700 Subject: Users of namedtuple: do you use the _source attribute? 
In-Reply-To: References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> Message-ID: <596D4A55.7030806@stoneleaf.us> On 07/17/2017 12:44 PM, Rob Gaddi wrote: > On 07/17/2017 09:57 AM, Steve D'Aprano wrote: >> collections.namedtuple generates a new class using exec, and records the source >> code for the class as a _source attribute. >> >> Although it has a leading underscore, it is actually a public attribute. The >> leading underscore distinguishes it from a named field potentially >> called "source", e.g. namedtuple("klass", ['source', 'destination']). >> >> [...] >> >> Is there anyone here who uses the namedtuple _source attribute? > I use namedtuple a lot, and never even HEARD of _source. > > That said, it sure feels (as someone who hasn't tried it) like there's a straightforward namedtuple implementation that > calls type() directly rather than having to exec. I know that exec-gunshyness is overblown, but is there a simple > answer as to why it's necessary here? I can't answer that question, but I can say my aenum library [1][2] uses the same metaclass technique as the new Enum type, and also supports doc strings and default arguments in the class-based format. -- ~Ethan~ [1] https://pypi.python.org/pypi/aenum (works back to at least 2.7) [2] Disclosure: I am the author of the Python stdlib Enum, the enum34 backport, and the Advanced Enumeration (aenum) library. From steve+python at pearwood.info Mon Jul 17 21:24:00 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Tue, 18 Jul 2017 11:24:00 +1000 Subject: Users of namedtuple: do you use the _source attribute? References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> Message-ID: <596d6331$0$1607$c3e8da3$5496439d@news.astraweb.com> On Tue, 18 Jul 2017 05:44 am, Rob Gaddi wrote: > That said, it sure feels (as someone who hasn't tried it) like there's a > straightforward namedtuple implementation that calls type() directly > rather than having to exec. 
I know that exec-gunshyness is overblown,
> but is there a simple answer as to why it's necessary here?

It's very hard to write __new__ with a sensible parameter list without
exec.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From jan at hyper-world.de Mon Jul 17 21:37:23 2017
From: jan at hyper-world.de (Jan Gosmann)
Date: Mon, 17 Jul 2017 21:37:23 -0400
Subject: cPickle fails on manually compiled and executed Python function
Message-ID:

Hi,

today I came across some weird behaviour (a bug?) in Python 2.7.13 (on
Linux) with the cPickle module. The pickle module works and so does the
pickle module in Python 3.

I have a file fn.py with a minimal function definition:

```
def fn():
    pass
```

The actual code that I run is in a separate file (test.py):

```
import cPickle
import pickle

def load_pyfile(filename):
    source = ''
    with open(filename, 'r') as f:
        source += f.read()
    code = compile(source, filename, 'exec')
    loaded = {'__file__': filename}
    exec(code, loaded)
    return loaded

fn = load_pyfile('fn.py')['fn']

print(pickle.dumps(fn))
print('----')
print(cPickle.dumps(fn))
```

The first print works fine, but the one with cPickle leads to an
exception. Here is the output:

```
c__main__
fn
p0
.
----
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    print(cPickle.dumps(fn))
TypeError: expected string or Unicode object, NoneType found
```

I don't understand why the cPickle module behaves differently in this
case. Is this expected? And if so, how do I fix it? Or can this be
considered a bug? (In that case I could open an issue in the bug tracker.)

Cheers, Jan

From rantingrickjohnson at gmail.com Mon Jul 17 22:04:22 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Mon, 17 Jul 2017 19:04:22 -0700 (PDT)
Subject: Combining every pair of list items and creating a new list.
In-Reply-To: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com>
References: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com>
Message-ID: <3d0a5e0d-17f9-4896-a274-2ff598405785@googlegroups.com>

On Monday, July 17, 2017 at 3:10:51 PM UTC-5, aaron.m.... at gmail.com wrote:
> Hi,
>
> I'm having difficulty thinking about how to do this as a Python beginner.
>
> But I have a list that is represented as:
>
> [1,2,3,4,5,6,7,8]
>
> and I would like the following results:
>
> [1,2] [3,4] [5,6] [7,8]
>
> Any ideas?

Solving problems in the "programming realm" is not much unlike
solving problems in real life. First, imagine you had 8 apples
lying on the floor. Now imagine the steps required to collect
the apples into "sacks of two".

(1) First, grab one large sack and enough small sacks to hold
    all the apples.

        8 / 2.0 = (4.0 small sacks)

(2) Then, position yourself at one end of the "row of apples".

(3) Now grab two apples and put them both into a small sack.
    Then, put the small sack into the large sack.

(4) Repeat step 3 until there are no more apples on the floor.

There you go. All you have to do now is translate that into
Python code. Shouldn't be too difficult. And your professor
will be so proud ;-)

From rantingrickjohnson at gmail.com Mon Jul 17 22:27:59 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Mon, 17 Jul 2017 19:27:59 -0700 (PDT)
Subject: Users of namedtuple: do you use the _source attribute?
In-Reply-To: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com>
References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com>

On Monday, July 17, 2017 at 12:20:04 PM UTC-5, Steve D'Aprano wrote:
> collections.namedtuple generates a new class using exec,
> and records the source code for the class as a _source
> attribute. Although it has a leading underscore, it is
> actually a public attribute.
> The leading underscore distinguishes it from a named field
> potentially called "source", e.g. namedtuple("klass",
> ['source', 'destination']).

Although I understand the reasoning behind using the leading
underscore, the Python devs should have realized that anyone
who follows Pythonic convention [1] will ignore a symbol
that starts with an underscore. So if the intention is that
`_source` should be a part of the public API, then
obviously, defining it in "standardized private form" is
very unwise.

But to answer your question, no, none of my code relies on
the `_source` attribute. So I really don't care what happens
to it.

[1] Which I would hope is a rather large group, and not just
another "Rick singleton".

From michele.simionato at gmail.com Mon Jul 17 23:56:41 2017
From: michele.simionato at gmail.com (Michele Simionato)
Date: Mon, 17 Jul 2017 20:56:41 -0700 (PDT)
Subject: Users of namedtuple: do you use the _source attribute?
In-Reply-To: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com>
References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <22e726dc-6700-4899-9b4b-707a7ed7b770@googlegroups.com>

On Monday, 17 July 2017 at 19:20:04 UTC+2, Steve D'Aprano wrote:
> collections.namedtuple generates a new class using exec, and records the source
> code for the class as a _source attribute.
>
> Although it has a leading underscore, it is actually a public attribute. The
> leading underscore distinguishes it from a named field potentially
> called "source", e.g. namedtuple("klass", ['source', 'destination']).
>
>
> There is some discussion on Python-Dev about:
>
> - changing the way the namedtuple class is generated which may
>   change the _source attribute
>
> - or even dropping it altogether
>
> in order to speed up namedtuple and reduce Python's startup time.
>
>
> Is there anyone here who uses the namedtuple _source attribute?
>
> My own tests suggest that changing from the current implementation to one
> similar to this recipe here:
>
> https://code.activestate.com/recipes/578918-yet-another-namedtuple/
>
> which only uses exec to generate the __new__ method, not the entire class, has
> the potential to speed up namedtuple by a factor of four.
>
>
>
> --
> Steve
> “Cheer up,” they said, “things could be worse.” So I cheered up, and sure
> enough, things got worse.

It is an attribute that looks handy for understanding how a namedtuple
works, and can be used in debugging, but in practice I have never used it
in production code. I use a lot of namedtuples, and if we can get a factor
4 speedup I am pretty happy to lose _source.

From cs at zip.com.au Tue Jul 18 00:55:12 2017
From: cs at zip.com.au (Cameron Simpson)
Date: Tue, 18 Jul 2017 14:55:12 +1000
Subject: Users of namedtuple: do you use the _source attribute?
In-Reply-To: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com>
References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <20170718045512.GA67334@cskk.homeip.net>

On 18Jul2017 02:57, Steve D'Aprano wrote:
>collections.namedtuple generates a new class using exec, and records the source
>code for the class as a _source attribute.
>
>Although it has a leading underscore, it is actually a public attribute. The
>leading underscore distinguishes it from a named field potentially
>called "source", e.g. namedtuple("klass", ['source', 'destination']).
>
>There is some discussion on Python-Dev about:
>- changing the way the namedtuple class is generated which may
>  change the _source attribute
>- or even dropping it altogether
>in order to speed up namedtuple and reduce Python's startup time.
>
>Is there anyone here who uses the namedtuple _source attribute?

Speaking for myself: no I do not.
Cheers,
Cameron Simpson

From dieter at handshake.de Tue Jul 18 01:07:24 2017
From: dieter at handshake.de (dieter)
Date: Tue, 18 Jul 2017 07:07:24 +0200
Subject: cPickle fails on manually compiled and executed Python function
References:
Message-ID: <87fudutjsj.fsf@handshake.de>

"Jan Gosmann" writes:
> today I came across some weird behaviour (a bug?) in Python 2.7.13 (on
> Linux) with the cPickle module. The pickle module works and so does
> the pickle module in Python 3.
>
> I have a file fn.py with a minimal function definition:
>
> ```
> def fn():
>     pass
> ```
>
> The actual code that I run is in a separate file (test.py):
>
> ```
> import cPickle
> import pickle
>
> def load_pyfile(filename):
>     source = ''
>     with open(filename, 'r') as f:
>         source += f.read()
>     code = compile(source, filename, 'exec')
>     loaded = {'__file__': filename}
>     exec(code, loaded)
>     return loaded
>
> fn = load_pyfile('fn.py')['fn']
>
> print(pickle.dumps(fn))
> print('----')
> print(cPickle.dumps(fn))
> ```
>
> The first print works fine, but the one with cPickle leads to an
> exception. Here is the output:
>
> ```
> c__main__
> fn
> p0
> .
> ----
> Traceback (most recent call last):
>   File "test.py", line 17, in <module>
>     print(cPickle.dumps(fn))
> TypeError: expected string or Unicode object, NoneType found
> ```

"pickle" (and "cPickle") serialize functions as so-called
"global"s, i.e. as a module reference together with a name.
This means they cannot handle functions computed in a module
(as in your case).

I am quite convinced that "pickle" will not be able to deserialize (i.e. load)
your function (even though it appears to perform the serialization,
i.e. the dump).

You are using "pickle/cPickle" for a case that was not anticipated (computed
functions). This means that this case is not as well tested as other cases.
As a consequence, you can see different behavior between "pickle" and "cPickle".
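
The point that functions are pickled as a module reference plus a name can be
seen in a small Python 3 sketch. It does not reproduce the Python 2 cPickle
failure from this thread; the module name "demo_mod" and the orphan namespace
are made up for the illustration, and the failure case assumes current CPython
behaviour of raising pickle.PicklingError.

```python
import pickle
import sys
import types

# Build a throwaway module so the function has an importable home.
# "demo_mod" is a hypothetical name used only for this illustration.
mod = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = mod
exec("def fn():\n    return 42\n", mod.__dict__)

# Protocol 0 is text based, so the module and function names are visible
# in the payload; the function body itself is not stored.
payload = pickle.dumps(mod.fn, protocol=0)
print(payload)

# Loading looks the name up again and hands back the very same object.
restored = pickle.loads(payload)
print(restored is mod.fn)

# A function that claims to live in a module which cannot be imported
# cannot be pickled by reference.
orphan_ns = {"__name__": "no_such_module_for_demo"}
exec("def fn():\n    pass\n", orphan_ns)
try:
    pickle.dumps(orphan_ns["fn"])
    pickled_orphan = True
except pickle.PicklingError:
    pickled_orphan = False
print(pickled_orphan)
```

In the code quoted above, the namespace passed to exec has no '__name__'
entry at all, which may be why the two Python 2 picklers disagree; that is
a guess, not something confirmed in this thread.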
From tjreedy at udel.edu Tue Jul 18 01:58:55 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 18 Jul 2017 01:58:55 -0400 Subject: Users of namedtuple: do you use the _source attribute? In-Reply-To: <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> Message-ID: On 7/17/2017 10:27 PM, Rick Johnson wrote: > On Monday, July 17, 2017 at 12:20:04 PM UTC-5, Steve D'Aprano wrote: >> collections.namedtuple generates a new class using exec, >> and records the source code for the class as a _source >> attribute. Although it has a leading underscore, it is >> actually a public attribute. The leading underscore >> distinguishes it from a named field potentially called >> "source", e.g. namedtuple("klass", ['source', >> 'destination']). > > Although i understand the reasoning behind using the leading > underscore, the Python devs should have realized that anyone > who follows Pythonic convention [1] will ignore a symbol > that starts with an underscore . So if the intention is that > `_source` should be a part of the public API, then > obviously, defining it in "standardized private form" is > very unwise. > > But to answer your question, no, none of my code relies on > the `_source` attribute. So i really don't care what happens > to it. > > [1] Which i would hope is a rather large group, and not just > another "Rick singleton". Yes, No. The developers of the class agree that a trailing underscore convention would have been better. 'source_' etc. 
--
Terry Jan Reedy

From marko at pacujo.net Tue Jul 18 02:05:37 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Tue, 18 Jul 2017 09:05:37 +0300
Subject: Grapheme clusters, a.k.a.real characters
References:
Message-ID: <87zic2e0um.fsf@elektro.pacujo.net>

Mikhail V :
> And just in case still its not clear: this is not solved by adding
> dirt around the letter: if there is enough significance of the phoneme
> distinction then one should add a distinct letter for a syntax in
> question.

The letters of Finnish are:

    abdefghijklmnoprstuvyäö

in that order. No exasperated wish of yours is going to change that.


Marko

From h.goebel at crazy-compilers.com Tue Jul 18 06:01:51 2017
From: h.goebel at crazy-compilers.com (Hartmut Goebel)
Date: Tue, 18 Jul 2017 12:01:51 +0200
Subject: Funding continuous maintenance for PyInstaller?
Message-ID: <596DDC8F.7080703@crazy-compilers.com>

Hi,

I'm seeking advice on how to fund continuous maintenance for an open source
project.

*Do you have any idea how to fund continuous maintenance for PyInstaller?
Do you have any idea whom or where to ask?
Do you know somebody to help setting up a commercial support/maintenance
model? *

I'm the (remaining) maintainer of PyInstaller (www.pyinstaller.org).
Currently I'm maintaining PyInstaller in my spare time. But it is
getting to be too much work to do for free: the open issue tickets
and pull requests are piling up. Since PyInstaller is quite mature,
problems are hard to track down and to solve. Thus solving one ticket
often takes half a day or even more.

I have already been in touch with the PSF and the Python Software Verband
(much like the PSF, just for Germany), but they have no experience with
this. I also read about the PSF grants program, but it doesn't fit
continuous maintenance. I also had a look at bountysource, but the numbers
offered there may be teasers for students, not for professionals, so I did
not follow this road.
I plan to add a "donate" page to the web-site, but I doubt this will
bring in noteworthy amounts.

So I was thinking about some commercial support/maintenance model, but I
have no experience with this. As I'm a freelance consultant already (but
in the information security business), this could be feasible to
implement, if I knew how to address the commercial users.

Thanks for any tip!

*About PyInstaller*

PyInstaller is the successor of "McMillan Installer", a tool like
freeze, py2exe, py2app or bbfreeze, but PyInstaller supports Windows,
MacOS and Unix (GNU/Linux, Solaris, HP-UX, etc.).

PyInstaller is widely used, as you can see when looking at the issues
and on the mailing list. E.g. kivy uses/recommends PyInstaller for
building Python-Apps for mobile platforms. PyInstaller is also used for
commercial applications (as some hints on the mailing list suggest).
But most commercial users are unknown.

*About me*

I'm based in Germany and have been developing open source and free
software since about 1990 and using Python since about 1998. Besides
this I developed software like pdfposter, python-ghostscript,
python-managesieve, etc. My day-job is freelance consultant focused on
information security.

--
Regards
Hartmut Goebel

| Hartmut Goebel          | h.goebel at crazy-compilers.com               |
| www.crazy-compilers.com | compilers which you thought are impossible |

From obulesu.t at gmail.com Tue Jul 18 06:27:54 2017
From: obulesu.t at gmail.com (T Obulesu)
Date: Tue, 18 Jul 2017 03:27:54 -0700 (PDT)
Subject: How Can I edit and update my .config (for my python application) file using WebSockets exactly like how we edit and update router .config file?
Message-ID: <26c9fa12-708c-4281-922c-b1399b5140e5@googlegroups.com>

I have my Python application running on a Raspberry Pi and it needs to be
configured every time. Hence I want to access this .config file online and
configure it exactly like how we can configure our router, but I want to
use only web sockets.
From rhodri at kynesim.co.uk Tue Jul 18 06:57:32 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Tue, 18 Jul 2017 11:57:32 +0100 Subject: Funding continuous maintenance for PyInstaller? In-Reply-To: <596DDC8F.7080703@crazy-compilers.com> References: <596DDC8F.7080703@crazy-compilers.com> Message-ID: <988b3f96-cadd-ab40-c907-3f9e8fd75f88@kynesim.co.uk> On 18/07/17 11:01, Hartmut Goebel wrote: > Hi, > > I'm seeking advice how to fund continuous maintenance for an open source > project. > > *Do you have any idea how to fund continuous maintenance for PyInstaller? > Do you have any idea whom or where to ask? > Do you know somebody to help setting up a commercial support/maintenance > model? * Try the Linux Foundation, https://www.linuxfoundation.org/ Maintenance of Open Source projects is one of the things they are interested in helping with. Disclaimer: I am currently under contract to them to provide support for a mature open source project, so I'm a tad biased. -- Rhodri James *-* Kynesim Ltd From steve+python at pearwood.info Tue Jul 18 08:51:36 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Tue, 18 Jul 2017 22:51:36 +1000 Subject: Users of namedtuple: do you use the _source attribute? References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> Message-ID: <596e045b$0$1595$c3e8da3$5496439d@news.astraweb.com> On Tue, 18 Jul 2017 03:58 pm, Terry Reedy wrote: >> On Monday, July 17, 2017 at 12:20:04 PM UTC-5, Steve D'Aprano wrote: >>> collections.namedtuple generates a new class using exec, >>> and records the source code for the class as a _source >>> attribute. Although it has a leading underscore, it is >>> actually a public attribute. The leading underscore >>> distinguishes it from a named field potentially called >>> "source", e.g. namedtuple("klass", ['source', >>> 'destination']). [...] > Yes, No. 
> The developers of the class agree that a trailing underscore
> convention would have been better. 'source_' etc.


I actually disagree with Raymond, and I think his first instinct was the
correct one. "source_" is already a public name, which means that users
could want to create fields with that name for some reason, just as they
could create "source_code" or "source_be_with_you" or any other name
containing underscores.

There is no restriction on names ending in an underscore, and we have a
convention to use such names when they would otherwise clash with a
keyword, e.g. "class_".

So I don't think that namedtuple should reserve names ending with
underscore for its own use. I think that Raymond's first decision was
correct, and documenting _source as public is the least-worst option.


[1] Maybe if we borrowed the keys to Guido's Time Machine and went back to
Python 0.9 we could argue that there should be. "Dunder names and names
ending in a single underscore are reserved for Python." But that would clash

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From steve+python at pearwood.info Tue Jul 18 09:11:02 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Tue, 18 Jul 2017 23:11:02 +1000
Subject: Grapheme clusters, a.k.a.real characters
References:
Message-ID: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com>

On Tue, 18 Jul 2017 08:01 am, Mikhail V wrote:

> And just in case still its not clear: this is not
> solved by adding dirt around the letter: if there is
> enough significance of the phoneme distinction then
> one should add a distinct letter for a syntax in question.

It isn't "dirt", any more than difference between ш (SHA) and щ (SHCHA)
is "dirt", or between F and E is "dirt".

In Swedish, å, ä, and ö are distinct letters of the alphabet. In Danish and
Norwegian, æ ø and å are distinct letters of the alphabet.
Just as in English W is a distinct letter of the alphabet, different from
either VV or UU.

(I don't think any native English words use a double-V or double-U, but the
possibility exists.)

That's neither better nor worse than the system used by English and French,
where letters with diacritics are not distinct letters, but guides to
pronunciation. Neither system is right or wrong, or better than the other.


--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From saikat.sc at gmail.com Tue Jul 18 09:54:56 2017
From: saikat.sc at gmail.com (Saikat Chakraborty)
Date: Tue, 18 Jul 2017 19:24:56 +0530
Subject: Problem in installing module "pynamical"
Message-ID:

I am using PyCharm Community Edition 2017 with interpreter python 3.6.1.
I want to install the pynamical module, but it is showing an error.

I am posting the error message:

E:\untitled>pip install pynamical

FileNotFoundError: [WinError 2] The system cannot find the file specified

error: command
'c:\\users\\s.chakraborty\\appdata\\local\\programs\\python\\python36\\python.exe'
failed with exit status 1

Please give me a solution.

Thanking you.

--
With Regards
Saikat Chakraborty
(Doctoral Research Scholar)
*Computer Science & Engineering Dept.*
*NIT Rourkela, Rourkela, Orissa, India*

From rosuav at gmail.com Tue Jul 18 09:59:33 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 18 Jul 2017 23:59:33 +1000
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com>
References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano wrote:
> On Tue, 18 Jul 2017 08:01 am, Mikhail V wrote:
>
>> And just in case still its not clear: this is not
>> solved by adding dirt around the letter: if there is
>> enough significance of the phoneme distinction then
>> one should add a distinct letter for a syntax in question.
>
> It isn't "dirt", any more than difference between ш (SHA) and щ (SHCHA)
> is "dirt", or between F and E is "dirt".
>
> In Swedish, å, ä, and ö are distinct letters of the alphabet. In Danish and
> Norwegian, æ ø and å are distinct letters of the alphabet. Just as in English W
> is a distinct letter of the alphabet, different from either VV or UU.
>
> (I don't think any native English words use a double-V or double-U, but the
> possibility exists.)

vacuum.

ChrisA

From random832 at fastmail.com Tue Jul 18 10:09:59 2017
From: random832 at fastmail.com (Random832)
Date: Tue, 18 Jul 2017 10:09:59 -0400
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To:
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net>
Message-ID: <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com>

On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
> What do you mean about regular expressions? You can use REs with
> normalized strings. And if you have any valid definition of "real
> character", you can use it equally on an NFC-normalized or
> NFD-normalized string than any other. They're just strings, you know.

I don't understand how normalization is supposed to help with this. It's
not like there aren't valid combinations that do not have a
corresponding single NFC codepoint (to say nothing of the situation with
e.g. Indic languages).

In principle probably a viable solution for regex would be to add
character classes for base and combining characters, and then
"[[:base:]][[:combining:]]*" can be used as a building block if
necessary.
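
The "base followed by any number of combining characters" building block can
be approximated without regex support. The following is only a rough sketch:
split_clusters is a hypothetical helper, it relies on the canonical combining
class from the standard unicodedata module, and it does not handle ZWJ emoji
sequences or the Indic cases mentioned above.

```python
import unicodedata

def split_clusters(text):
    """Group each base code point with the combining marks that follow it."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# The escapes spell a five-letter French word with two accented letters.
decomposed = unicodedata.normalize("NFD", "\u00e9l\u00e8ve")
print(len(decomposed), len(split_clusters(decomposed)))
```

After NFD normalization the code point count and the cluster count differ,
which is the gap between len() and "real characters" that this thread is
about.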
From rustompmody at gmail.com Tue Jul 18 10:10:44 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Tue, 18 Jul 2017 07:10:44 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To:
References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com>
Message-ID: <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com>

On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
> On 17/07/17 05:10, Rustom Mody wrote:
> > Hint1: Ask your grandmother whether unicode's notion of character makes sense.
> > Ask 10 gmas from 10 language-L's
> > Hint2: When in doubt gma usually is right
>
> "For every complex problem there is an answer that is clear, simple and
> wrong." (H.L. Mencken).

Great men galore with great quotes galore?
Here are 3, take your pick:

Einstein:
If you can't explain something to a six-year-old, you really don't understand it yourself.
[Commonly attributed to Einstein
More likely Feynman, Rutherford, de Broglie or some other notable physicist
https://skeptics.stackexchange.com/questions/8742/did-einstein-say-if-you-cant-explain-it-simply-you-dont-understand-it-well-en
]

Dijkstra:
Programming languages belong to the problem set, not (as some imagine) to the solution set
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD473.html

Joseph Weizenbaum, AI pioneer, author of Eliza:
Computer technology, like all sciences, are self-validating systems. They define problems and their solutions within a circumscribed context and leave out much of the real-world data. “Science can only proceed by simplifying reality.” [Weizenbaum then recounts] a joke about a drunkard to clarify this statement: One dark evening a policeman comes across a man on his hands and knees searching beneath a lamppost.
He asks the man what he's doing and the man replies that he lost his keys over there, pointing off into the darkness. “So why are you looking for them under the streetlight?” inquired the policeman. The man replies, “Because the light is so much better over here.”
http://www.digitalathena.com/the-wisdom-of-joseph-weizenbaum.html

> Unfortunately grandmothers outside their areas of expertise are particularly prone to finding those answers.

Gma for the purposes of this discussion can be defined:

- A (not necessarily) elderly person who
- Is fairly intelligent
- Not necessarily highly educated
- Generally interested in life and people
- [But not usually] in technical arcana

An alternative "definition to gma" (if big names are a requirement) could be
Joseph Weizenbaum quoted above, who in Computer Power and Human Reason
vociferously spoke against the propensity to define human value in terms of
"computerizability"

From random832 at fastmail.com Tue Jul 18 10:11:39 2017
From: random832 at fastmail.com (Random832)
Date: Tue, 18 Jul 2017 10:11:39 -0400
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <87zic7v3gl.fsf@elektro.pacujo.net>
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net>
Message-ID: <1500387099.4041269.1044669328.2C241A9E@webmail.messagingengine.com>

On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote:
> Consider, for example, a Python source code
> editor where you want to limit the length of the line based on the
> number of characters more typically than based on the number of pixels.

Even there you need to go based on the width in character cells. Most
characters for East Asian languages occupy two character cells.

It would be nice if there was an easy way to get str.format to use this
width instead of the length in code points for the purpose of padding.
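
Padding by character cells rather than code points can be sketched with the
standard unicodedata module. This is a rough approximation, not a str.format
feature: cell_width and pad_cells are hypothetical helper names, ambiguous
width characters are treated as one cell, and emoji sequences are not handled.

```python
import unicodedata

def cell_width(text):
    """Estimate how many terminal cells the text occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch) != 0:
            continue  # combining marks share the cell of their base character
        # "W" (wide) and "F" (fullwidth) code points take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

def pad_cells(text, cells):
    """Left-justify text in a field that is `cells` character cells wide."""
    return text + " " * max(0, cells - cell_width(text))

# Two CJK ideographs, written as escapes, padded next to an ASCII string.
for label in ("abc", "\u65e5\u672c"):
    print(pad_cells(label, 8) + "|")
```

With a well-behaved monospaced font the "|" markers should line up, whereas
str.ljust(8) would pad the second label two cells too far.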
From wegge at wegge.dk Tue Jul 18 10:23:24 2017
From: wegge at wegge.dk (Anders Wegge Keller)
Date: Tue, 18 Jul 2017 16:23:24 +0200
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To:
References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <20170718162324.100e2c8f@wegge.dk>

On Tue, 18 Jul 2017 23:59:33 +1000, Chris Angelico wrote:
> On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano

>> (I don't think any native English words use a double-V or double-U, but
>> the possibility exists.)

> vacuum.

 That's Latin.

--
//Wegge

From nomail at com.invalid Tue Jul 18 10:23:38 2017
From: nomail at com.invalid (ast)
Date: Tue, 18 Jul 2017 16:23:38 +0200
Subject: Combining every pair of list items and creating a new list.
In-Reply-To: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com>
References: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com>
Message-ID: <596e19f0$0$4826$426a74cc@news.free.fr>

wrote in message news:621ca9d5-79b1-44c9-b534-3ad1b0cf44a4 at googlegroups.com...
> Hi,
>
> I'm having difficulty thinking about how to do this as a Python beginner.
>
> But I have a list that is represented as:
>
> [1,2,3,4,5,6,7,8]
>
> and I would like the following results:
>
> [1,2] [3,4] [5,6] [7,8]
>
> Any ideas?
>
> Thanks

>>> L = [1, 2, 3, 4, 5, 6, 7, 8]
>>> list(zip(L[0::2], L[1::2]))
[(1, 2), (3, 4), (5, 6), (7, 8)]

>>> list(map(list, zip(L[0::2], L[1::2])))
[[1, 2], [3, 4], [5, 6], [7, 8]]

From random832 at fastmail.com Tue Jul 18 10:29:30 2017
From: random832 at fastmail.com (Random832)
Date: Tue, 18 Jul 2017 10:29:30 -0400
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <596afb8b$0$11093$c3e8da3@news.astraweb.com>
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com>
Message-ID: <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com>

On Sun, Jul 16, 2017, at 01:37, Steven D'Aprano wrote:
> In a *well-designed* *bug-free* monospaced font, all code points should
> be either zero-width or one column wide. Or two columns, if the font
> supports East Asian fullwidth characters.

What about Emoji?

U+1F469 WOMAN is two columns wide on its own.
U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
U+200D ZERO WIDTH JOINER is zero columns wide on its own.

The sequence U+1F469 U+200D U+1F4BB is the single emoji "Woman
Technologist", which is two columns wide.

Even without ZWJ this comes up - the regional indicator characters are
meant to appear in pairs - signifying a flag, which is two columns wide
- but when they appear in isolation they usually appear as an equally
wide "letter in a box" picture. The skin tone indicators aren't applied
with ZWJ, and are meant to combine with the preceding character when it
is an emoji depicting a person, but show up as a square swatch of that
color in isolation. And AIUI they don't have a combining class in the
unicode data.
Or, consider presentation variation selectors U+26A1 HIGH VOLTAGE SIGN U+FE0E VARIATION SELECTOR-15 (text presentation in this context) U+FE0F VARIATION SELECTOR-16 (emoji presentation in this context) Some code points are meant to be shown as a text character in some contexts and an emoji in others. The default presentation (when not followed by a variation selector) depends on the application. Otherwise, the Emoji is two columns wide and the text presentation version is usually one column wide. The variation selectors themselves are zero columns wide when applied to any character for which it is not meant to be applied. (From a font perspective these can be regarded as ligatures, but the font itself is not responsible for the behavior of a character-cell terminal emulator) From random832 at fastmail.com Tue Jul 18 10:38:48 2017 From: random832 at fastmail.com (Random832) Date: Tue, 18 Jul 2017 10:38:48 -0400 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <20170718162324.100e2c8f@wegge.dk> References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> Message-ID: <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> On Tue, Jul 18, 2017, at 10:23, Anders Wegge Keller wrote: > P? Tue, 18 Jul 2017 23:59:33 +1000 > Chris Angelico skrev: > > On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano > >> (I don't think any native English words use a double-V or double-U, but > >> the possibility exists.) > > > vacuum. > > That's latin. Define "native" then. My interpretation of "native English words" is "anything you wouldn't have to put in italics to use in a sentence". Which would also include "continuum". As for double-v, a quick search through /usr/share/dict/words reveals "civvies", "divvy", "revved/revving", "savvy" and "skivvy", and various conjugations thereof. 
All following, more or less, the rule of using a double consonant after
a short vowel in contexts where a single consonant would suggest the
preceding vowel was long.

From grant.b.edwards at gmail.com Tue Jul 18 10:41:12 2017
From: grant.b.edwards at gmail.com (Grant Edwards)
Date: Tue, 18 Jul 2017 14:41:12 +0000 (UTC)
Subject: Grapheme clusters, a.k.a.real characters
References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On 2017-07-18, Steve D'Aprano wrote:

> (I don't think any native English words use a double-V or double-U, but the
> possibility exists.)

double-v: flivver, navvy, bivvy, bevvy, trivvet, divvy, skivvy, skivvies, etc.
          and various gerund and past tense verbs:
          revved, revving, chivved chivving

double-u: vacuum, continuum, squush, fortuuned

> That's neither better nor worse than the system used by English and French,
> where letters with diacritics are not distinct letters, but guides to
> pronunciation. Neither system is right or wrong, or better than the other.

You'll get kicked off Usenet for having an attitude like that!

--
Grant Edwards               grant.b.edwards        Yow! I put aside my copy
                                  at               of "BOWLING WORLD" and
                              gmail.com            think about GUN CONTROL
                                                   legislation...

From jan at hyper-world.de Tue Jul 18 11:28:18 2017
From: jan at hyper-world.de (Jan Gosmann)
Date: Tue, 18 Jul 2017 11:28:18 -0400
Subject: cPickle fails on manually compiled and executed Python function
In-Reply-To: <87fudutjsj.fsf@handshake.de>
References: <87fudutjsj.fsf@handshake.de>
Message-ID: <1100852f-9362-0f14-9c6f-7cb7d252bf20@hyper-world.de>

On 07/18/2017 01:07 AM, dieter wrote:
> "Jan Gosmann" writes:
>
>> [...]
>> fn = load_pyfile('fn.py')['fn']
>> [...]
> "pickle" (and "cpickle") are serializing functions as so called
> "global"s, i.e. as a module reference together with a name.
> This means, they cannot handle functions computed in a module
> (as in your case).

Note that I'm assigning the computed function to a global/module level
variable.
As far as I understand the documentation, that should be all that
matters because only the function name will be serialized.

> I am quite convinced that "pickle" will not be able to deserialize (i.e. load)
> your function (even though it appears to perform the serialization
> (i.e. dump).

Actually the deserialization works fine with either module. That is,
both pickle.loads(pickle.dumps(fn)) and cPickle.loads(pickle.dumps(fn))
give me back the function.

By now I realized that a pretty simple workaround works. Instead of
doing `fn = load_pyfile('fn.py')['fn']` the following function
definition works with both pickle modules:

    _fn = load_pyfile('fn.py')['fn']
    def fn(*args, **kwargs):
        return _fn(*args, **kwargs)

From rhodri at kynesim.co.uk Tue Jul 18 11:37:37 2017
From: rhodri at kynesim.co.uk (Rhodri James)
Date: Tue, 18 Jul 2017 16:37:37 +0100
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com>
References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com>
Message-ID: <75019890-592b-1609-a11b-3ce4b6fab52a@kynesim.co.uk>

On 18/07/17 15:10, Rustom Mody wrote:
> On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
>> On 17/07/17 05:10, Rustom Mody wrote:
>>> Hint1: Ask your grandmother whether unicode's notion of character makes sense.
>>> Ask 10 gmas from 10 language-L's
>>> Hint2: When in doubt gma usually is right
>>
>> "For every complex problem there is an answer that is clear, simple and
>> wrong." (H.L. Mencken).
>
> Great men galore with great quotes galore?

[snip]

>> Unfortunately grandmothers outside their areas of expertise are particularly prone to finding those answers.
> > Gma for the purposes of this discussion can be defined: > > - A (not necessarily) elderly person who > - Is fairly intelligent > - Not necessarily highly educated > - Generally interested in life and people > - [But not usually] in technical arcana That last one is the killer. Using clear and simple terminology is usually adequate when you aren't discussing technical arcana. Unfortunately we are discussing technical arcana, and that's when you trip over the fact that your clear, simple terminology is wrong. It's an instance of Weizenbaum's joke that you quoted, just replacing streetlights with grandmas. (For the record, one of my grandmothers would have been baffled by this conversation, and the other one would have had definite opinions on whether accents were distinct characters or not, followed by a digression into whether "?" and "?" should be suppressed vigorously :-) -- Rhodri James *-* Kynesim Ltd From rhodri at kynesim.co.uk Tue Jul 18 11:40:24 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Tue, 18 Jul 2017 16:40:24 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> Message-ID: On 18/07/17 16:27, Dennis Lee Bieber wrote: > On Tue, 18 Jul 2017 10:38:48 -0400, Random832 > declaimed the following: > >> Define "native" then. My interpretation of "native English words" is >> "anything you wouldn't have to put in italics to use in a sentence". >> Which would also include "continuum". >> > > Probably would have to go to words predating the Roman occupation > (which probably means a dialect closer to Welsh or other Gaelic). 
> Everything later is an import (anglo-saxon being germanic tribes invading > south, Vikings in the central area, as I recall southern Irish displacing > Picts in Scotland, and then the Norman French (themselves starting from > Vikings ["nor(se)man"]). Sorry, but even the Gaels/Gauls were invaders :-) -- Rhodri James *-* Kynesim Ltd From wegge at wegge.dk Tue Jul 18 11:45:56 2017 From: wegge at wegge.dk (Anders Wegge Keller) Date: Tue, 18 Jul 2017 17:45:56 +0200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> Message-ID: <20170718174556.455c41ff@wegge.dk> P? Tue, 18 Jul 2017 11:27:03 -0400 Dennis Lee Bieber skrev: > Probably would have to go to words predating the Roman occupation > (which probably means a dialect closer to Welsh or other Gaelic). > Everything later is an import (anglo-saxon being germanic tribes invading > south, Vikings in the central area, as I recall southern Irish displacing > Picts in Scotland, and then the Norman French (themselves starting from > Vikings ["nor(se)man"]). English is known to be lurking in back alleys, waiting for unsuspecting languages, that can be beat up for loose vocabulary. So defining anything "pure" about it, is going to be practically impossible. -- //Wegge From walters.justin01 at gmail.com Tue Jul 18 11:52:41 2017 From: walters.justin01 at gmail.com (justin walters) Date: Tue, 18 Jul 2017 08:52:41 -0700 Subject: Funding continuous maintenance for PyInstaller? In-Reply-To: <596DDC8F.7080703@crazy-compilers.com> References: <596DDC8F.7080703@crazy-compilers.com> Message-ID: On Tue, Jul 18, 2017 at 3:01 AM, Hartmut Goebel < h.goebel at crazy-compilers.com> wrote: > Hi, > > I'm seeking advice how to fund continuous maintenance for an open source > project. > > *Do you have any idea how to fund continuous maintenance for PyInstaller? 
> Do you have any idea whom or where to ask? > Do you know somebody to help setting up a commercial support/maintenance > model? * > > I'm the (remaining) maintainer of PyInstaller (www.pyinstaller,org). > Currently I'm maintaining PyInstaller in my spare-time. But it's is > getting to much work for working on it for free: the open issue tickets > and pull-requests are piling up. Since PyInstaller is quite mature, > problems are hard to track down and to solve. Thus solving one ticket > often takes half a day or even more. > > I'm already got in tough with the the PSF and the Python Software > Verband (much like PSF, just for Germany), but they have not experience > with this. I also read the PSF grants program, but this doesn't fit for > continous maintenance. I also had a look at bountysource, but the > numbers offered there may be teasers for students, not for professionals > - so I did not follow this road. I plan to add a "donate" page to the > web-site, but I doubt this will bring in noteworthy amounts. > > So I was thinking about some commercial support/maintenance model, but I > have no experience with this. As I'm a freelance consultant already (but > in the information security business), this could be feasible to > implement, if I'd know how to address the commercial users. > > Thanks for any tip! > > *About PyInstaller* > > PyInstaller is the successor of "McMillan Installer", a tool like, > freeze, py2exe, py2app or bbfreeze - but PyInstaller supports Windows, > MacOS and Unix (GN/Linux, Solaris, HP-UX, etc.). PyInstaller is widely > used sa you can see when looking at the issues and on the mailinglist. > E.g. kivy uses/recommends PyInstaller for building Python-Apps for > mobile platforms. > > PyInstaller is also used for commercial applications (as some hints on > the mailinglist or > But most commercial users are unknown. 
> > *About me* > > I'm based an Germany and developing open source and free software since > about 1990 and using Python since about 1998. Beside of this I developed > software like pdfposter, python-ghostscript, python-managesieve, etc. My > day-job is freelance consultant focused on information security. > > -- > Regards > Hartmut Goebel > > | Hartmut Goebel | h.goebel at crazy-compilers.com | > | www.crazy-compilers.com | compilers which you thought are impossible | > > -- > https://mail.python.org/mailman/listinfo/python-list > You could try reaching out to Michael a Talk Python to Me: https://talkpython.fm/ He may be able to give you a mention on the show or even have you on as a guest. He may also be able to point you in the direction of some sponsors. From marko at pacujo.net Tue Jul 18 12:03:42 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Tue, 18 Jul 2017 19:03:42 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> Message-ID: <8760epvijl.fsf@elektro.pacujo.net> Random832 : > As for double-v, a quick search through /usr/share/dict/words reveals > "civvies", "divvy", "revved/revving", "savvy" and "skivvy", and > various conjugations thereof. All following, more or less, the rule of > using a double consonant after a short vowel in contexts where a > single consonant would suggest the preceding vowel was long. The single/double consonant rule is indeed an ancient Germanic spelling principle. 
English makes several twists to the it: * "v" is never doubled ("shovel") * a final "v" receives a superfluous "e" ("love") * the final consonant of a single-syllable word is doubled only if the consonant is "k", "l" or "s" ("kick", "kill", "kiss") * "k" becomes "ck" when doubled ("lacking") * a final consonant is never doubled in a multisyllable word ("havoc", "shovel") * a final "k" of a multisyllable word becomes "c" ("magic") Marko From rosuav at gmail.com Tue Jul 18 12:44:52 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 02:44:52 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> Message-ID: On Wed, Jul 19, 2017 at 12:09 AM, Random832 wrote: > On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote: >> What do you mean about regular expressions? You can use REs with >> normalized strings. And if you have any valid definition of "real >> character", you can use it equally on an NFC-normalized or >> NFD-normalized string than any other. They're just strings, you know. > > I don't understand how normalization is supposed to help with this. It's > not like there aren't valid combinations that do not have a > corresponding single NFC codepoint (to say nothing of the situation with > e.g. Indic languages). > > In principle probably a viable solution for regex would be to add > character classes for base and combining characters, and then > "[[:base:]][[:combining:]]*" can be used as a building block if > necessary. Once you NFC or NFD normalize both strings, identical strings will generally have identical codepoints. 
(There are some exceptions, and for certain types of matching, you might want to use NFKC/NFKD instead.) You should then be able to use normal regular expressions to match correctly. I don't know of any situations where you want to match "any base character" or "any combining character"; what you're more likely to want is "match the letter ?", and you don't care whether it's represented as U+0061 U+0301 or as U+00E1. That's where Unicode normalization comes in. ChrisA From rosuav at gmail.com Tue Jul 18 12:48:52 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 02:48:52 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> Message-ID: On Wed, Jul 19, 2017 at 1:40 AM, Rhodri James wrote: > On 18/07/17 16:27, Dennis Lee Bieber wrote: >> >> On Tue, 18 Jul 2017 10:38:48 -0400, Random832 >> declaimed the following: >> >>> Define "native" then. My interpretation of "native English words" is >>> "anything you wouldn't have to put in italics to use in a sentence". >>> Which would also include "continuum". >>> >> >> Probably would have to go to words predating the Roman occupation >> (which probably means a dialect closer to Welsh or other Gaelic). >> Everything later is an import (anglo-saxon being germanic tribes invading >> south, Vikings in the central area, as I recall southern Irish displacing >> Picts in Scotland, and then the Norman French (themselves starting from >> Vikings ["nor(se)man"]). > > > Sorry, but even the Gaels/Gauls were invaders :-) If we go back far enough, I'm pretty sure the only true Englishman is a sentient cup of tea. 
ChrisA From ganesh1pal at gmail.com Tue Jul 18 12:52:28 2017 From: ganesh1pal at gmail.com (Ganesh Pal) Date: Tue, 18 Jul 2017 22:22:28 +0530 Subject: Better Regex and exception handling for this small code In-Reply-To: <20170711233640.GA22219@cskk.homeip.net> References: <20170711233640.GA22219@cskk.homeip.net> Message-ID: Thanks Cameron Simpson for you suggestion and reply quite helpful :) On Wed, Jul 12, 2017 at 5:06 AM, Cameron Simpson wrote: > On 11Jul2017 22:01, Ganesh Pal wrote: > >> I am trying to open a file and check if there is a pattern has changed >> after the task got completed? >> >> file data: >> ........................................................ >> >> #tail -f /file.txt >> .......................................... >> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = b6b20a65, >> journal_crc = d2097b00 >> Note: Task completed successfully. >> Note: CRC:algo = 2, split_crc = 1, unused = 0, initiator_crc = d976d35e, >> journal_crc = a176af10 >> >> >> I have the below piece of code but would like to make this better more >> pythonic , I found regex pattern and exception handling poor here , any >> quick suggestion in your spare time is welcome. >> >> >> #open the existing file if the flag is set and check if there is a match >> >> log_file='/file.txt' >> flag_is_on=1 >> > > Use "True" instead of "1". A flag is a Boolean thing, and should use a > Boolean value. This lets you literally speak "true" and 'false" rather than > imoplicitly saying that "0 means false and nonzero means true". > > data = None >> > > There is no need to initialise data here because you immediately overwrite > it below. > > with open(log_file, 'r') as f: >> data = f.readlines() >> >> if flag_is_on: >> > > Oh yes. Just name this variable "flag". "_is_on" is kind of implicit. > > logdata = '\n'.join(data) >> > > Do other parts of your programme deal with the file data as lines? 
If not, > there is little point to reading the file and breaking it up into lines > above, then joining them together against here. Just go: > > with open(log_file) as f: > log_data = f.read() > > reg = "initiator_crc =(?P[\s\S]*?), journal_crc" >> > > Normally we write regular expressions as "raw" python strings, thus: > > reg = r'initiator_crc =(?P[\s\S]*?), journal_crc' > > because backslashes etc are punctuation inside normal strings. Within a > "raw" string started with r' nothing is special until the closing ' > character. This makes writing regular expressions more reliable. > > Also, why the character range "[\s\S]"? That says whitespace or > nonwhitespace i.e. any character. If you want any character, just say ".". > > crc = re.findall(re.compile(reg), logdata) >> > > It is better to compile a regexp just the once, getting a Regexp object, > and then you just use the compiled object. > > if not crc: >> raise Exception("Pattern not found in logfile") >> > > ValueError would be a more appropriate exception here; plain old > "Exception" is pretty vague. > > checksumbefore = crc[0].strip() >> checksumafter = crc[1].strip() >> > > Your regexp cannot start or end with whitespace. Those .strip calls are > not doing anything for you. > > This reads like you expect there to be exactly 2 matches in the file. What > if there are more or fewer? > > logging.info("checksumbefore :%s and checksumafter:%s" >> % (checksumbefore, checksumafter)) >> >> if checksumbefore == checksumafter: >> raise Exception("checksum not macthing") >> > > Don't you mean != here? > > I wouldn't be raising exceptions in this code. Personally I would make > this a function that returns True or False. Exceptions are a poor way of > returning "status" or other values. They're really for "things that should > not have happened", hence their name. > > It looks like you're scanning a log file for multiple lines and wanting to > know if successive ones change. 
Why not write a function like this > (untested): > > RE_CRC_LINE = re.compile(r'initiator_crc =(?P[\s\S]*?), > journal_crc') > > def check_for_crc_changes(logfile): > old_crc_text = '' > with open(logfile) as f: > for line in f: > m = RE_CRC_LINE.match(line) > if not m: > # uninteresting line > continue > crc_text = m.group(0) > if crc_text != old_crc_text: > # found a change > return True > if old_crc_text == '': > # if this is really an error, you might raise this exception > # but maybe no such lines is just normal but boring > raise ValueError("no CRC lines seen in logfile %r" % (logfile,)) > # found no changes > return False > > See that there is very little sanity checking. In an exception supporting > language like Python you can often write code as if it will always succeed > by using things which will raise exceptions if things go wrong. Then > _outside_ the function you can catch any exceptions that occur (such as > being unable to open the log file). > > Cheers, > Cameron Simpson > From ganesh1pal at gmail.com Tue Jul 18 12:56:21 2017 From: ganesh1pal at gmail.com (Ganesh Pal) Date: Tue, 18 Jul 2017 22:26:21 +0530 Subject: Best way to assert unit test cases with many conditions Message-ID: Hi Dear Python Friends, The unittest?s TestCase class provides several assert methods to check for and report failures . I need suggestion what would the best way to assert test cases in the below piece of code. (1) should I add several asserts per test case, or just warn with the error and fail at the end . In the line 33 ? 35 / 37-38 ( sorry this is a dirty pusedo-code) . (2) Is there a way we can warn the test using assert method and not fail? I was trying to see if I could use assertWarns but the help says that ?The test passes if warning is triggered and fails if it isn?t ?. I don?t want to fail on warning but just continue which next checks (3) All more ways to optimize the sample code. 
 1 import unittest
 2 import library
 3
 4
 5 class AutoRepairFilesystem(unittest.TestCase):
 6
 7     blocks = {}
 8     report = ""
 9
10     @classmethod
11     def setUpClass(self):
12         """
13         Set UP
14         """
15         logging.info("SETUP.....Started")
16         try:
17             self.blocks['test01'] = library.inject_corruption1(file1)
18             self.blocks['test100'] = library.inject_corruption100(file100)
19
20         except Exception as e:
21             logging.error("Failure injection failed \n")
22             raise
23
24         if not library.check_Repair():
25             logging.error("Failed running FSCK Tool ")
26             assert False, "Pre-test checks in setUpClass failed skipping test"
27         logging.info("SETUP.....Done")
28
29     def test_corruption1(self):
30         """Run test no 1 """
31         # This was the only earlier condition then!
32         #self.assertTrue(library.log_message_is_reported(self.report,self.blocks['test01']):'''
33         if not library.log_message_is_reported(self.report,
34                                                self.blocks['test01']):
35             print "Warning: Reporting Failed.... \n"
36
37         if not library.is_corruption_fixed():
38             print "Warning: Corruption is not fixed .... \n"
39
40         if not library.is_corruption_reparied():
41             assert False, "Corruption not reported,fixed and auto repaired.\n"
42
43     def test_corruption100(self):
44         """ Run test no 100 """
45         if not library.log_message_is_reported(self.report,
46                                                self.blocks['test100']):
47             print "Warning: Reporting Failed.... \n"
48
49         if not library.is_corruption_fixed():
50             print "Warning: Corruption is not fixed .... \n"
51
52         if not library.is_corruption_reparied():
53             assert False, "Corruption not reported,fixed and auto repaired.\n"
54
55     @classmethod
56     def tearDownClass(self):
57         """ Delete all files """
58         os.system("rm -rf /tmp/files/")
59
60 if __name__ == '__main__':
61     unittest.main()

I am a Linux user with Python 2.7.
Regards, Ganesh From marko at pacujo.net Tue Jul 18 13:01:10 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Tue, 18 Jul 2017 20:01:10 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> Message-ID: <87zic1u1bd.fsf@elektro.pacujo.net> Chris Angelico : > what you're more likely to want is "match the letter ?", and you don't > care whether it's represented as U+0061 U+0301 or as U+00E1. That's > where Unicode normalization comes in. Yes. Also, not every letter can be normalized to a single codepoint so NFC is not a way out. For example, re.match("^[q?]$", "q?") returns None regardless of normalization. Marko From marko at pacujo.net Tue Jul 18 13:02:55 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Tue, 18 Jul 2017 20:02:55 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> Message-ID: <87vampu18g.fsf@elektro.pacujo.net> Marko Rauhamaa : > * the final consonant of a single-syllable word is doubled only if the > consonant is "k", "l" or "s" ("kick", "kill", "kiss") ... 
or "f" ("stiff") or "z" ("buzz") Marko From rhodri at kynesim.co.uk Tue Jul 18 13:08:04 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Tue, 18 Jul 2017 18:08:04 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <8760epvijl.fsf@elektro.pacujo.net> References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> Message-ID: On 18/07/17 17:03, Marko Rauhamaa wrote: > Random832: > >> As for double-v, a quick search through /usr/share/dict/words reveals >> "civvies", "divvy", "revved/revving", "savvy" and "skivvy", and >> various conjugations thereof. All following, more or less, the rule of >> using a double consonant after a short vowel in contexts where a >> single consonant would suggest the preceding vowel was long. > The single/double consonant rule is indeed an ancient Germanic spelling > principle. English makes several twists to the it: It's not so much a rule as a guideline... -- Rhodri James *-* Kynesim Ltd From grant.b.edwards at gmail.com Tue Jul 18 13:10:11 2017 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Tue, 18 Jul 2017 17:10:11 +0000 (UTC) Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> Message-ID: On 2017-07-18, Anders Wegge Keller wrote: > P? Tue, 18 Jul 2017 23:59:33 +1000 > Chris Angelico skrev: >> On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano > > >>> (I don't think any native English words use a double-V or double-U, but >>> the possibility exists.) > >> vacuum. > > That's latin. If you want to play that game, there are no native English words that contain any of the letters A-Z either. It turns out they're all german or frisan or norse or french or whatever... -- Grant Edwards grant.b.edwards Yow! Can you MAIL a BEAN at CAKE? 
gmail.com From rgaddi at highlandtechnology.invalid Tue Jul 18 13:10:58 2017 From: rgaddi at highlandtechnology.invalid (Rob Gaddi) Date: Tue, 18 Jul 2017 10:10:58 -0700 Subject: Best way to assert unit test cases with many conditions In-Reply-To: References: Message-ID: On 07/18/2017 09:56 AM, Ganesh Pal wrote: > (1) should I add several asserts per test case, or just warn with the > error and fail at the end . In the line 33 ? 35 / 37-38 ( sorry this is a > dirty pusedo-code) . Yes. Just assert each thing as it needs asserting. > > (2) Is there a way we can warn the test using assert method and not fail? > I was trying to see if I could use assertWarns but the help says that > ?The test passes if warning is triggered and fails if it isn?t ?. > > I don?t want to fail on warning but just continue which next checks > You can, but you're just going to complicate your life. A "test" is a thing that passes (all) or fails (any). If you need it to keep going after a failure, what you have are two tests. There's nothing wrong with having a whole mess of test functions. If there's a lot of common code there you'd have to replicate, that's what setUp() is for. If there are several different flavors of common code you need, you can create a base TestCase subclass and then derive further subclasses from that. Do the things the way the tools want to do them. Unit testing is enough of a pain without trying to drive nails with the butt of a screwdriver. -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix. 
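A rough sketch of the layout Rob Gaddi describes above (one small test per condition, with the shared setup in a base TestCase subclass) might look like the following. Everything named here is an assumption for illustration: FakeLibrary is a hypothetical stand-in for the poster's `library` module, and the class and method names are invented, not taken from the original code.

```python
import unittest


class FakeLibrary(object):
    """Hypothetical stand-in for the poster's `library` module."""

    def inject_corruption(self, name):
        return {"file": name, "block": 42}

    def log_message_is_reported(self, block):
        return True

    def is_corruption_fixed(self):
        return True

    def is_corruption_repaired(self):
        return True


class CorruptionTestBase(unittest.TestCase):
    """Common setup lives here; subclasses only choose the target file."""

    target = None  # overridden by subclasses

    def setUp(self):
        if self.target is None:
            # The base class itself has nothing to test.
            self.skipTest("base class, nothing to inject")
        self.lib = FakeLibrary()
        self.block = self.lib.inject_corruption(self.target)

    # One test per condition: each one passes or fails on its own,
    # so a failed report check does not hide the repair check.
    def test_reported(self):
        self.assertTrue(self.lib.log_message_is_reported(self.block))

    def test_fixed(self):
        self.assertTrue(self.lib.is_corruption_fixed())

    def test_repaired(self):
        self.assertTrue(self.lib.is_corruption_repaired())


class TestCorruption1(CorruptionTestBase):
    target = "file1"


class TestCorruption100(CorruptionTestBase):
    target = "file100"


def run_suite():
    """Run both concrete classes and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (TestCorruption1, TestCorruption100):
        suite.addTests(loader.loadTestsFromTestCase(case))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this shape the runner reports six independent results (three per class), which is probably closer to "warn and keep going" than anything that can be done inside a single test method.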
From rosuav at gmail.com Tue Jul 18 13:21:51 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 19 Jul 2017 03:21:51 +1000
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <87zic1u1bd.fsf@elektro.pacujo.net>
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <87zic1u1bd.fsf@elektro.pacujo.net>
Message-ID:

On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> what you're more likely to want is "match the letter á", and you don't
>> care whether it's represented as U+0061 U+0301 or as U+00E1. That's
>> where Unicode normalization comes in.
>
> Yes. Also, not every letter can be normalized to a single codepoint so
> NFC is not a way out. For example,
>
>    re.match("^[q?]$", "q?")
>
> returns None regardless of normalization.

In what language or context would you actually want to do this?

ChrisA

From D.Strohl at F5.com Tue Jul 18 13:32:30 2017
From: D.Strohl at F5.com (Dan Strohl)
Date: Tue, 18 Jul 2017 17:32:30 +0000
Subject: Best way to assert unit test cases with many conditions
In-Reply-To:
References:
Message-ID: <2837f5d4cbdf43e7a218fd53f620ab1a@F5.com>

Ganesh;

I'm not 100% sure what you are trying to do.. so let me throw out a few things I do and see if that helps...

If you are trying to run a bunch of similar tests on something, changing only (or mostly) in the parameters passed, you can use self.subTest(). Like this:

    def test_this(self):
        for i in range(10):
            with self.subTest('test number %s' % i):
                self.assertTrue(i <= 5)

With the subTest() method, if anything within that subTest fails, it won't stop the process and will continue with the next step.
If you are trying to run a single test at the end of your run to see if something messed something up (say, corrupted a file or something), you can (at least with the default unittest) name your test something like test_zzz_do_this_at_end, and unless you have overridden how the tests are being handled (or are using a different testing environment), unittest should run it last (of the ones in that TestCase class).

From: https://docs.python.org/2/library/unittest.html#organizing-test-code

"Note that the order in which the various test cases will be run is determined by sorting the test function names with respect to the built-in ordering for strings."

From marko at pacujo.net Tue Jul 18 14:31:21 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Tue, 18 Jul 2017 21:31:21 +0300
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <87zic1u1bd.fsf@elektro.pacujo.net>
Message-ID: <87vampegw6.fsf@elektro.pacujo.net>

Chris Angelico :

> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote:
>> Yes. Also, not every letter can be normalized to a single codepoint so
>> NFC is not a way out. For example,
>>
>>    re.match("^[q?]$", "q?")
>>
>> returns None regardless of normalization.
>
> In what language or context would you actually want to do this?

I could have picked more realistic examples: Classic Greek or Hebrew, for example.

However, someone might actually use even "q?" in a real setting. First of all, it *is* a legal character. Secondly, people sometimes combine characters in an ad-hoc fashion. Thirdly, remember the case of Esperanto, which blessed the world with the letters

    ĉ ĝ ĥ ĵ ŝ ŭ

Esperanto's venerable history finally awarded those characters a code-point status in Unicode.
However, around the year 2000, it was still commonplace to use all sorts of tricks to type them on the Internet: ch gh hh jj sh u ^c ^g ^h ^j ^s ^u cx gx hx jx sx ux For all we know, someone somewhere might be cooking up a language that depends on "q?". Marko From rosuav at gmail.com Tue Jul 18 14:46:21 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 04:46:21 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87vampegw6.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <87zic1u1bd.fsf@elektro.pacujo.net> <87vampegw6.fsf@elektro.pacujo.net> Message-ID: On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote: >>> Yes. Also, not every letter can be normalized to a single codepoint so >>> NFC is not a way out. For example, >>> >>> re.match("^[q?]$", "q?") >>> >>> returns None regardless of normalization. >> >> In what language or context would you actually want to do this? > > I could have picked more realistic examples: Classic Greek or Hebrew, > for example. > > However, someone might actually use even "q?" in a real setting. First of > all, it *is* a legal character. Secondly, people sometimes combine > characters in an ad-hoc fashion. Thirdly, remember the case of > Esperanto, which blessed the world with the letters > > ? ? ? ? ? ? > > Esperanto's venerable history finally awarded those characters a > code-point status in Unicode. 
However, around the year 2000, it was > still commonplace to use all sorts of tricks to type them on the > Internet: > > ch gh hh jj sh u > > ^c ^g ^h ^j ^s ^u > > cx gx hx jx sx ux > > For all we know, someone somewhere might be cooking up a language that > depends on "q?". Sure. And if they do, they'll have to contend with the fact that it's going to be represented as multiple code units. What I *think* you're asking for is for square brackets in a regex to count combining characters with their preceding base character. That would make a lot of sense, and would actually be a reasonable feature to request. (Probably as an option, in case there's a backward compatibility issue.) ChrisA From marko at pacujo.net Tue Jul 18 14:56:06 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Tue, 18 Jul 2017 21:56:06 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <87zic1u1bd.fsf@elektro.pacujo.net> <87vampegw6.fsf@elektro.pacujo.net> Message-ID: <87r2xdefqx.fsf@elektro.pacujo.net> Chris Angelico : > On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa wrote: >> Chris Angelico : >> >>> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote: >>>> Yes. Also, not every letter can be normalized to a single codepoint so >>>> NFC is not a way out. For example, >>>> >>>> re.match("^[q?]$", "q?") >>>> >>>> returns None regardless of normalization. > [...] > > What I *think* you're asking for is for square brackets in a regex to > count combining characters with their preceding base character. Yes. My example tries to match a single character against a single character. > That would make a lot of sense, and would actually be a reasonable > feature to request. (Probably as an option, in case there's a backward > compatibility issue.) 
There's the flag re.IGNORECASE. In the same vein, it might be useful to have re.IGNOREDIACRITICS, which would match

    re.match("^[abc]$", "?", re.IGNOREDIACRITICS)

regardless of normalization.

Marko

From rosuav at gmail.com Tue Jul 18 15:32:09 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 19 Jul 2017 05:32:09 +1000
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <87r2xdefqx.fsf@elektro.pacujo.net>
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <87zic1u1bd.fsf@elektro.pacujo.net> <87vampegw6.fsf@elektro.pacujo.net> <87r2xdefqx.fsf@elektro.pacujo.net>
Message-ID:

On Wed, Jul 19, 2017 at 4:56 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>> What I *think* you're asking for is for square brackets in a regex to
>> count combining characters with their preceding base character.
>
> Yes. My example tries to match a single character against a single
> character.
>
>> That would make a lot of sense, and would actually be a reasonable
>> feature to request. (Probably as an option, in case there's a backward
>> compatibility issue.)
>
> There's the flag re.IGNORECASE. In the same vein, it might be useful to
> have re.IGNOREDIACRITICS, which would match
>
>    re.match("^[abc]$", "?", re.IGNOREDIACRITICS)
>
> regardless of normalization.

That's a different feature, and can be achieved with a different normalization:

    import unicodedata

    def fold(s):
        """Fold a string for 'search compatibility'.

        Returns a modified version of s with no diacriticals.
        """
        s = s.casefold()
        s = unicodedata.normalize("NFKD", s)
        s = ''.join(c for c in s if c < '\u0300' or c > '\u033f')
        return unicodedata.normalize("NFKC", s)

This is something that you might use when searching, as people will expect to be able to type "cafe" to find "café". It is deliberately lossy.
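As a usage sketch of the folding idea, the variant below can be run on its own. One substitution to flag: it drops every character for which unicodedata.combining() is non-zero, instead of testing against the fixed code-point range used in the message, so it is a variation on the posted function rather than a copy of it. The helper same_when_folded is a made-up name for the example.

```python
import unicodedata


def fold(s):
    """Return a lossy, search-friendly form of s (case and marks removed)."""
    s = s.casefold()
    # Decompose so that accents become separate combining characters.
    s = unicodedata.normalize("NFKD", s)
    # Assumption: strip anything with a non-zero combining class.
    s = "".join(c for c in s if not unicodedata.combining(c))
    return unicodedata.normalize("NFKC", s)


def same_when_folded(a, b):
    """True if a and b are equal once case and diacritics are ignored."""
    return fold(a) == fold(b)
```

Both spellings of "café" (precomposed U+00E9, or "e" followed by U+0301) fold to the same plain string, so a search typed as "cafe" would match either one.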
But having the re module group code units into logical characters
according to 'base + combining' is a different feature. It may be
worth adding. I don't think your re.IGNOREDIACRITICS is something that
belongs in the stdlib, as different search contexts require different
folding (Google, for instance, will find "?" when you search for "i" -
but then, Google also finds "python" when you search for "phyton").

ChrisA

From sonnichs at gmail.com Tue Jul 18 15:53:59 2017
From: sonnichs at gmail.com (FS)
Date: Tue, 18 Jul 2017 12:53:59 -0700 (PDT)
Subject: pyserial and end-of-line specification
In-Reply-To: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com>
References: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com>
Message-ID: <4410bbc7-a57a-4b75-9f62-eb15df7e92b5@googlegroups.com>

Thank you for your response Andre. I had tried some code like that in
the document but it did not seem to work. However, after leaving my
terminal for a time the code eventually wrote out the records, so
apparently there is some very deep buffering going on here. A little
more searching on the web revealed the following:

https://stackoverflow.com/questions/10222788/line-buffered-serial-input

It is apparent that pySerial, or at least the documentation, is falling
short of my needs. It is very unclear what module in the layer is
handling the buffering and newlines and so forth. Also unclear is
whether the coupled python and OS is reading FIFO or LIFO--something
important in quasi realtime scientific applications.

This is problematic since the serial port is still so ubiquitous in a
lot of scientific instrumentation. I probably will patch up some
byte-oriented code for this or perhaps write the module in C.
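One way such byte-oriented code might look, as a rough sketch under stated
assumptions: `read_record` is a hypothetical helper, the b"\r" terminator is
a guess (the thread does not say what the instrument sends), and `port` is
any object whose read(1) returns one byte, or an empty bytes object on
timeout or end of data. A pySerial `Serial` object opened with a timeout
behaves that way; `io.BytesIO` stands in for it here so the example runs
without hardware.

```python
import io

def read_record(port, eol=b"\r", max_len=4096):
    """Read bytes from port until eol is seen; return the record without eol."""
    buf = bytearray()
    while len(buf) < max_len:
        b = port.read(1)
        if not b:                 # timeout or end of stream: give up
            break
        buf += b
        if buf.endswith(eol):     # terminator reached: strip it and return
            return bytes(buf[:-len(eol)])
    return bytes(buf)             # partial record (no terminator seen)

fake_port = io.BytesIO(b"12.5\r13.1\r")
print(read_record(fake_port))
print(read_record(fake_port))
```

Reading one byte at a time keeps the record boundaries under the caller's
control, at the cost of more calls into the driver.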
Thanks again
Fritz

From mikhailwas at gmail.com Tue Jul 18 16:05:17 2017
From: mikhailwas at gmail.com (Mikhail V)
Date: Tue, 18 Jul 2017 22:05:17 +0200
Subject: Grapheme clusters, a.k.a.real characters
Message-ID: 

On 2017-07-18, Steve D'Aprano wrote:

> That's neither better nor worse than the system used by English and French,
> where letters with diacritics are not distinct letters, but guides to
> pronunciation.

>_Neither system is right or wrong, or better than the other._

If that is said just "not to hurt anybody" then it's OK.
Though this statement is pretty absurd, not so many
(intelligent) people will buy this today.

Mikhail

From rosuav at gmail.com Tue Jul 18 16:51:42 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 19 Jul 2017 06:51:42 +1000
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V wrote:
> On 2017-07-18, Steve D'Aprano wrote:
>
>> That's neither better nor worse than the system used by English and French,
>> where letters with diacritics are not distinct letters, but guides to
>> pronunciation.
>
>>_Neither system is right or wrong, or better than the other._
>
>
> If that is said just "not to hurt anybody" then it's OK.
> Though this statement is pretty absurd, not so many
> (intelligent) people will buy this today.

Let me give you one concrete example: the letter "ö". In English, it
is (very occasionally) used to indicate diaeresis, where a pair of
letters is not a double letter - for example, "coöperate". (You can
also hyphenate, "co-operate".) In German, it is the letter "o" with a
pronunciation mark (umlaut), and is considered the same letter as "o".
In Swedish, it is a distinct letter, alphabetized last (following z,
å, and ä, in that order). But in all these languages, it's represented
the exact same way.

Steven is pointing out that there's nothing fundamentally wrong about
using "ö"
as a unique letter, nor is there anything fundamentally wrong about
using it as "o" with a pronunciation mark. Which I agree with.

ChrisA

From marko at pacujo.net Tue Jul 18 17:29:55 2017
From: marko at pacujo.net (Marko Rauhamaa)
Date: Wed, 19 Jul 2017 00:29:55 +0300
Subject: Grapheme clusters, a.k.a.real characters
References: 
Message-ID: <87h8y9e8mk.fsf@elektro.pacujo.net>

Chris Angelico :

> Let me give you one concrete example: the letter "ö". In English, it
> is (very occasionally) used to indicate diaeresis, where a pair of
> letters is not a double letter - for example, "coöperate". (You can
> also hyphenate, "co-operate".) In German, it is the letter "o" with a
> pronunciation mark (umlaut), and is considered the same letter as "o".
> In Swedish, it is a distinct letter, alphabetized last (following z,
> å, and ä, in that order). But in all these languages, it's represented
> the exact same way.

The German Wikipedia entry on "Ä" calls "Ä" a letter ("Buchstabe"):

   Der Buchstabe Ä (kleingeschrieben ä) ist ein Buchstabe des
   lateinischen Schriftsystems.

Furthermore, it makes a distinction between "ä" the letter and "ä" the
"a with a diaeresis:"

   In guten Druckschriften unterscheiden sich die Umlautpunkte von den
   zwei Punkten des Tremas: Die Umlautpunkte sind kleiner, stehen näher
   zusammen und liegen etwas tiefer.

   In good fonts umlaut dots are different from the two dots of a
   diaeresis: the umlaut dots are smaller and closer to each other and
   lie a little lower. [translation mine]

(My native Finnish has the "ä" as well; the German tradition of placing
the dots next to the body of the "a" looks a bit unpleasant. On the
other hand, so does the English tradition of hanging the dots high up in
the air.)


Marko

From greg.ewing at canterbury.ac.nz Tue Jul 18 18:39:18 2017
From: greg.ewing at canterbury.ac.nz (Gregory Ewing)
Date: Wed, 19 Jul 2017 10:39:18 +1200
Subject: Users of namedtuple: do you use the _source attribute?
In-Reply-To: <596e045b$0$1595$c3e8da3$5496439d@news.astraweb.com> References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> <596e045b$0$1595$c3e8da3$5496439d@news.astraweb.com> Message-ID: Steve D'Aprano wrote: > "source_" is already a public name, which means that users could want to create > fields with that name for some reason, They could equally well want to define their own private field called "_source". IMO a better thing to do would have been to name it "__source__". Dunder names are officially reserved for use by the language or stdlib. -- Greg From greg.ewing at canterbury.ac.nz Tue Jul 18 19:13:50 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 11:13:50 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> Message-ID: Steve D'Aprano wrote: > (I don't think any native English words use a double-V or double-U, but the > possibility exists.) vacuum savvy (Vacuum is arguably Latin, but we've been using it for long enough that it's at least as English as most of the other words we use.) -- Greg From greg.ewing at canterbury.ac.nz Tue Jul 18 19:21:28 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 11:21:28 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> Message-ID: Random832 wrote: > What about Emoji? > U+1F469 WOMAN is two columns wide on its own. > U+1F4BB PERSONAL COMPUTER is two columns wide on its own. The term "emoji" is becoming rather strained these days. 
The idea of "woman" and "personal computer" being emotions is an interesting one... -- Greg From mikhailwas at gmail.com Tue Jul 18 19:43:26 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Wed, 19 Jul 2017 01:43:26 +0200 Subject: Grapheme clusters, a.k.a.real characters Message-ID: Marko Rauhamaa wrote: >What did you think of my concrete examples, then? (Say, finding >"Alv?rez" with the regular expression "Alv[a?]rez".) I think that should match both "Alvarez" and "Alv?rez" ...? But firstly, I feel like I need to _guess_ what ideas you are presenting. Unless I open up Vim and apply my imagination, it is hard even to get involved in your ideas. I wonder why it is hard to elaborate a pair of examples like e.g. : - now the task A (concrete task defined) is solved with the code C1 - with the new syntax/method, the same task could be solved with the code C2 Just trying to guess related tasks: For the automation of regex search-related tasks I would make a function which generates the RE pattern first, i.e. define tables with "variations" for glyphs, e.g. groups={"a": "a?"} or similar. Then I'll need some micro-syntax for the conversion, e.g. generate_re("Alv{a}rez", groups) Intuitively, I suppose the groupings and even the functions hardly can be standardized in a nice manner, since I'll need to define and redefine them all the time for various cases. But probably there can be some generality, hard to say. What I need often is the "approximate" search function, which returns a match "similar" to the input string. But I think even the regex module cannot fully solve this and I would end up with a function which goes through each string element and calculate various similarity criteria. 
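The generate_re("Alv{a}rez", groups) idea described above could be sketched
roughly as follows. The {name} placeholder syntax and the shape of `groups`
are taken from the message; everything else (using a non-capturing
alternation rather than a character class, so that multi-code-point variants
would also work, and escaping the literal parts) is an assumption made for
this illustration.

```python
import re

def generate_re(template, groups):
    """Expand {name} placeholders in template into alternations from groups."""
    # re.split with one capturing group returns
    # [literal, name, literal, name, ..., literal].
    parts = re.split(r"\{(\w+)\}", template)
    out = []
    for i, part in enumerate(parts):
        if i % 2:   # odd positions hold the placeholder names
            variants = sorted(groups[part], key=len, reverse=True)
            out.append("(?:" + "|".join(re.escape(v) for v in variants) + ")")
        else:       # even positions hold literal text
            out.append(re.escape(part))
    return "".join(out)

# "a" may be written plain or with an acute accent (U+00E1).
groups = {"a": ["a", "\u00e1"]}
pattern = re.compile(generate_re("Alv{a}rez", groups))

print(bool(pattern.fullmatch("Alvarez")))
print(bool(pattern.fullmatch("Alv\u00e1rez")))
```

For the "approximate" search mentioned at the end, the standard library's
difflib.get_close_matches() may be a reasonable starting point before
writing custom similarity code.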
Mikhail From ben+python at benfinney.id.au Tue Jul 18 20:08:17 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 19 Jul 2017 10:08:17 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> Message-ID: <85zic15lvy.fsf@benfinney.id.au> Gregory Ewing writes: > The term "emoji" is becoming rather strained these days. > The idea of "woman" and "personal computer" being emotions > is an interesting one... I think of ?emoji? as ?not actually a character in any system anyone would use for writing anything, but somehow gets to squat in the Unicode space?. -- \ ?The priesthood have, in all ancient nations, nearly | `\ monopolized learning.? ?John Adams, _Letters to John Taylor_, | _o__) 1814 | Ben Finney From mikhailwas at gmail.com Tue Jul 18 20:34:45 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Wed, 19 Jul 2017 02:34:45 +0200 Subject: Grapheme clusters, a.k.a.real characters Message-ID: ChrisA wrote: >On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V wrote: >> On 2017-07-18, Steve D'Aprano wrote: >> >>> That's neither better nor worse than the system used by English and French, >>> where letters with dicritics are not distinct letters, but guides to >>> pronunciation. >> >>>_Neither system is right or wrong, or better than the other._ >> >> >> If that is said just "not to hurt anybody" then its ok. >> Though this statement is pretty absurd, not so many >> (intelligent) people will buy this out today. >Let me give you one concrete example: the letter "?". In English, it >is (very occasionally) used to indicate diaeresis, where a pair of >letters is not a double letter - for example, "co?perate". (You can >also hyphenate, "co-operate".) 
In German, it is the letter "o" with a >pronunciation mark (umlaut), and is considered the same letter as "o". >In Swedish, it is a distinct letter, alphabetized last (following z, >?, and ?, in that order). But in all these languages, it's represented >the exact same way. > >Steven is pointing out that there's nothing fundamentally wrong about >using "?" as a unique letter, nor is there anything fundamentally >wrong about using it as "o" with a pronunciation mark. Which I agree >with. > Ok, in this narrow context I can also agree. But in slightly wider context that phrase may sound almost like: "neither geometrical shape is better than the other as a basis for a wheel. If you have polygonal wheels, they are still called wheels." Mikhail From rosuav at gmail.com Tue Jul 18 21:15:33 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 11:15:33 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: Message-ID: On Wed, Jul 19, 2017 at 10:34 AM, Mikhail V wrote: > Ok, in this narrow context I can also agree. > But in slightly wider context that phrase may sound almost like: > "neither geometrical shape is better than the other as a basis > for a wheel. If you have polygonal wheels, they are still called wheels." I don't think he meant that. (Anyway, what shape IS a .whl file?) 
ChrisA

From steve+python at pearwood.info Tue Jul 18 22:07:56 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Wed, 19 Jul 2017 12:07:56 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com>
 <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net>
 <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net>
 <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net>
 <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com>
Message-ID: <596ebefd$0$1614$c3e8da3$5496439d@news.astraweb.com>

On Wed, 19 Jul 2017 12:09 am, Random832 wrote:

> On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
>> What do you mean about regular expressions? You can use REs with
>> normalized strings. And if you have any valid definition of "real
>> character", you can use it equally on an NFC-normalized or
>> NFD-normalized string than any other. They're just strings, you know.
>
> I don't understand how normalization is supposed to help with this. It's
> not like there aren't valid combinations that do not have a
> corresponding single NFC codepoint (to say nothing of the situation with
> e.g. Indic languages).

Normalisation helps. Suppose you want to search for é, for example: a
naive regular expression engine will only find the exact representation
you or your editor happened to use:

    U+00E9 LATIN SMALL LETTER E WITH ACUTE

or

    U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT

but not both. By normalising, you ensure that both the text you are
searching and the regex you are searching for are in the same state:
either composed to a single code point

    U+00E9

or decomposed to two

    U+0065,0301

but never one in one state and the other in the other.
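As a minimal sketch of that point: the helper below (the name
`search_normalized` is made up for this illustration) puts both the pattern
and the text into the same normalisation form before searching. It is only
safe for simple literal patterns, since normalising a pattern string can
change what a character class in it means.

```python
import re
import unicodedata

def search_normalized(pattern, text, form="NFC"):
    """Search after putting pattern and text into the same normalisation form."""
    pattern = unicodedata.normalize(form, pattern)
    text = unicodedata.normalize(form, text)
    return re.search(pattern, text)

composed = "caf\u00e9"      # e-acute as the single code point U+00E9
decomposed = "cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# Without normalisation the composed pattern misses the decomposed text.
print(re.search("\u00e9", decomposed))
# With normalisation both spellings are found, whichever form is chosen.
print(search_normalized("\u00e9", decomposed) is not None)
print(search_normalized("\u00e9", composed, form="NFD") is not None)
```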
For characters that don't include a canonical composition form, then
there's no problem: you will always be searching for a decomposed
character using a base character followed by combining characters, so
there is no discrepancy and it will just work.

> In principle probably a viable solution for regex would be to add
> character classes for base and combining characters, and then
> "[[:base:]][[:combining:]]*" can be used as a building block if
> necessary.

I don't know what that means. Any code point (except for combining
characters themselves) can be used as the base, and the various kinds of
combining characters have the Unicode category property:

    Mn (Mark, nonspacing)
    Mc (Mark, spacing combining)
    Me (Mark, enclosing)

If we're talking about combining accents and diacritics, the one we want
is Mn. But generally, we're not after "any old diacritic", we're after a
specific one, on a specific base.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From rustompmody at gmail.com Tue Jul 18 22:19:19 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Tue, 18 Jul 2017 19:19:19 -0700 (PDT)
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <87h8y9e8mk.fsf@elektro.pacujo.net>
References: <87h8y9e8mk.fsf@elektro.pacujo.net>
Message-ID: 

On Wednesday, July 19, 2017 at 3:00:21 AM UTC+5:30, Marko Rauhamaa wrote:
> Chris Angelico :
>
> > Let me give you one concrete example: the letter "ö". In English, it
> > is (very occasionally) used to indicate diaeresis, where a pair of
> > letters is not a double letter - for example, "coöperate". (You can
> > also hyphenate, "co-operate".) In German, it is the letter "o" with a
> > pronunciation mark (umlaut), and is considered the same letter as "o".
> > In Swedish, it is a distinct letter, alphabetized last (following z,
> > å, and ä, in that order). But in all these languages, it's represented
> > the exact same way.
>
> The German Wikipedia entry on "Ä"
calls "Ä" a letter ("Buchstabe"):
>
>    Der Buchstabe Ä (kleingeschrieben ä) ist ein Buchstabe des
>    lateinischen Schriftsystems.
>
> Furthermore, it makes a distinction between "ä" the letter and "ä" the
> "a with a diaeresis:"
>
>    In guten Druckschriften unterscheiden sich die Umlautpunkte von den
>    zwei Punkten des Tremas: Die Umlautpunkte sind kleiner, stehen näher
>    zusammen und liegen etwas tiefer.
>
>    In good fonts umlaut dots are different from the two dots of a
>    diaeresis: the umlaut dots are smaller and closer to each other and
>    lie a little lower. [translation mine]
>

Very interesting!
And may I take it that the two different variants (u-umlaut and
u-diaeresis) of ü are not (yet) given a seat in unicode?

Now compare with:
- hyphen-minus 0x2D
− minus sign 0x2212
‐ hyphen 0x2010
– en dash 0x2013
— em dash 0x2014
― horizontal bar 0x2015
… And perhaps another half-dozen

From steve+python at pearwood.info Tue Jul 18 22:46:42 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Wed, 19 Jul 2017 12:46:42 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <596ec812$0$1610$c3e8da3$5496439d@news.astraweb.com>

On Tue, 18 Jul 2017 11:59 pm, Chris Angelico wrote:

>> (I don't think any native English words use a double-V or double-U, but the
>> possibility exists.)
>
> vacuum.

Nice. Also continuum and residuum.

For double V, we have savvy, skivvy, flivver (an old slang term for cars).

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.
From steve+python at pearwood.info Tue Jul 18 22:49:20 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 19 Jul 2017 12:49:20 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> Message-ID: <596ec8b1$0$1596$c3e8da3$5496439d@news.astraweb.com> On Wed, 19 Jul 2017 12:29 am, Random832 wrote: > On Sun, Jul 16, 2017, at 01:37, Steven D'Aprano wrote: >> In a *well-designed* *bug-free* monospaced font, all code points should >> be either zero-width or one column wide. Or two columns, if the font >> supports East Asian fullwidth characters. > > What about Emoji? > U+1F469 WOMAN is two columns wide on its own. > U+1F4BB PERSONAL COMPUTER is two columns wide on its own. > U+200D ZERO WIDTH JOINER is zero columns wide on its own. What about them? In a monospaced font, they should follow the same rules I used above: either 0, 1 or 2 column wide. If any visible code point is a fraction of a column wide, it isn't usable as a monospaced font. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
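The 0, 1 or 2 column rule can be approximated per code point with the
unicodedata module. This is a rough sketch, not a terminal-accurate width
function: `column_width` and `text_width` are names invented here, the
choice of categories treated as zero-width is an assumption, and the result
depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata

def column_width(ch):
    """Rough column count for one code point in a monospaced terminal font."""
    # Combining marks and format characters (such as ZERO WIDTH JOINER)
    # are treated as taking no column of their own.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def text_width(s):
    """Sum of per-code-point widths; ignores how sequences are rendered."""
    return sum(column_width(ch) for ch in s)

print(column_width("a"), column_width("\u0301"), column_width("\u4e2d"))
print(text_width("e\u0301"))
```

Note that a plain sum treats a joined emoji sequence such as WOMAN, ZERO
WIDTH JOINER, PERSONAL COMPUTER as the widths of its parts, whereas a font
that supports the sequence draws a single glyph; that gap is the point
Random832 raises.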
From steve+python at pearwood.info Tue Jul 18 23:04:44 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 19 Jul 2017 13:04:44 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> Message-ID: <596ecc4d$0$1595$c3e8da3$5496439d@news.astraweb.com> On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote: > On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote: >> On 17/07/17 05:10, Rustom Mody wrote: >> > Hint1: Ask your grandmother whether unicode's notion of character makes >> > sense. Ask 10 gmas from 10 language-L's >> > Hint2: When in doubt gma usually is right >> >> "For every complex problem there is an answer that is clear, simple and >> wrong." (H.L. Mencken). > > Great men galore with great quotes galore? > Here are 3 ? take your pick: > > Einstein: > If you can't explain something to a six-year-old, you really don't understand > it yourself. > > [Commonly attributed to Einstein > More likely Feynman, Rutherford, de Broglie or some other notable physicist > https://skeptics.stackexchange.com/questions/8742/did-einstein-say-if-you-cant-explain-it-simply-you-dont-understand-it-well-en > ] More likely none of the above, but invented by some non-expert who wanted to put down the value of expert knowledge, and thought a bogus argument by authority was the best way to do it. (Einstein said it, therefore it must be right!) Think about it: it simply is nonsense. If this six year old test was valid, that would imply that all fields of knowledge are capable of being taught to the average six year old. Yeah good luck with that. But even if we accept this, it doesn't contradict the Mencken quote. 
I can explain the birds and the bees to a six year, at a level that they will understand. That doesn't mean that (1) I am an expert on human reproduction; or that (2) people should ask the six year old for advice about human reproduction. The second part is the problem. I understand how cars work, to an acceptable degree that I could probably explain it to a six year old. But if you came to me to ask my advice about buying a car, or repairing a car, you'll get bad advice. I'm not an expert and I don't know enough to give *good* advice. Same with your "grandmother" test. Yes, I'm sure that most "grandmothers" (I know that's just shorthand for "regular people who aren't experts") will have an intuitive idea of what a character is. But what on earth makes you think that intuitive idea is both *necessary and sufficient* for programming? > Dijkstra: > > Programming languages belong to the problem set, not (as some imagine) > to the solution set > https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD473.html Relevance? That's just the "Now you have two problems" observation, reworded for programming languages in general rather than just regular expressions. How is it relevant? It seems to me that you are just tossing random quotes out in the hope that some of them might stick. Two can play at that game: "He who questions training only trains himself at asking questions." - The Sphinx "Must ? defy ? laws ? of ? physics ?" - The Tick "Whenever Giles sends me on a mission, he always says 'please'. And afterwards I get a cookie." - Buffy the Vampire Slayer The bottom line is, your "grandma" test dismisses the value of expert domain knowledge. As programmers, we need access to expert domain knowledge, even if we don't hold it ourselves, we need to trust that the people who wrote our libraries had it. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
From steve+python at pearwood.info Tue Jul 18 23:12:00 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 19 Jul 2017 13:12:00 +1000 Subject: Users of namedtuple: do you use the _source attribute? References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> <596e045b$0$1595$c3e8da3$5496439d@news.astraweb.com> Message-ID: <596ece02$0$1595$c3e8da3$5496439d@news.astraweb.com> On Wed, 19 Jul 2017 08:39 am, Gregory Ewing wrote: > Steve D'Aprano wrote: >> "source_" is already a public name, which means that users could want to >> create fields with that name for some reason, > > They could equally well want to define their own private > field called "_source". Um... well, people want to do all sorts of wild and wacky things... but why would you define a named tuple with *private* fields? Especially since that privateness isn't enforced when you access the items by position. In any case, the namedtuple API prohibits that, so it isn't an option. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From steve+python at pearwood.info Tue Jul 18 23:22:41 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 19 Jul 2017 13:22:41 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> <85zic15lvy.fsf@benfinney.id.au> Message-ID: <596ed081$0$1610$c3e8da3$5496439d@news.astraweb.com> On Wed, 19 Jul 2017 10:08 am, Ben Finney wrote: > Gregory Ewing writes: > >> The term "emoji" is becoming rather strained these days. >> The idea of "woman" and "personal computer" being emotions >> is an interesting one... > > I think of ?emoji? 
as ?not actually a character in any system anyone > would use for writing anything, but somehow gets to squat in the Unicode > space?. Blame the Japanese mobile phone manufacturers. They want to include emoji in their SMSes and phone chat software, and have the money to become full members of the Unicode Consortium. I suppose that having a standard for emoji is good. I'm not convinced that Unicode should be that standard, but on the other hand if we agree that Unicode should support hieroglyphics and pictographs, well, that's exactly what emoji are. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From steve+python at pearwood.info Tue Jul 18 23:32:26 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 19 Jul 2017 13:32:26 +1000 Subject: Grapheme clusters, a.k.a.real characters References: Message-ID: <596ed2cc$0$1589$c3e8da3$5496439d@news.astraweb.com> On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote: > Ok, in this narrow context I can also agree. > But in slightly wider context that phrase may sound almost like: > "neither geometrical shape is better than the other as a basis > for a wheel. If you have polygonal wheels, they are still called wheels." I'm not talking about wheels, I'm talking about writing systems which are fundamentally collections of arbitrary shapes. There's nothing about the sound of "f" that looks like the letter "f". But since you mentioned non-circular wheels, such things do exist, and are still called "wheels" (or "gears", which is a kind of specialised wheel). https://eric.ed.gov/?id=EJ937593 https://en.wikipedia.org/wiki/Non-circular_gear https://en.wikipedia.org/wiki/Square_wheel https://www.youtube.com/watch?v=vk7s4PfvCZg -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
From steve+python at pearwood.info Tue Jul 18 23:57:52 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Wed, 19 Jul 2017 13:57:52 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <1051542e-32ac-17b5-0081-ed5c4ff9e07f@mrabarnett.plus.com> <854lud861l.fsf@benfinney.id.au> <596afd45$0$11093$c3e8da3@news.astraweb.com> <85vams6yjh.fsf@benfinney.id.au> Message-ID: <596ed8c2$0$1593$c3e8da3$5496439d@news.astraweb.com> On Mon, 17 Jul 2017 04:12 am, Ben Finney wrote: > Steven D'Aprano writes: > >> On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote: >> >> > And yet the ASCII and Unicode standard says code point 0x0A (U+000A >> > LINE FEED) is a character, by definition. >> [...] >> > > Is an acute accent a character? >> > >> > Yes, according to Unicode. ??? (U+0301 ACUTE ACCENT) is a character. >> >> Do you have references for those claims? > > The Unicode Standard > frequently uses ?character? as the unit of semantic value that Unicode > deals in. See the ?Contents? table for many references. > > In ?2.2 under the sub-heading ?Characters, Not Glyphs? it defines the > term, and thereafter uses ?character? in a way that includes all such > units, even formatting codes. Thanks for that. TIL something new. I'm not sure whether I had misunderstood, or whether the standard has changed, but I recall them previously being very reticent about giving a formal definition for the term character. (Or possibly a combination of both.) Even now, they do seem to prefer to use "character" in the sense of an abstract character, not necessarily something that ordinary users of language will recognise as a character or letter. E.g. they include control codes, variation codes, diacritic marks on their own with no base, and more. 
Unicode defines exactly 66 noncharacters: http://www.unicode.org/faq/private_use.html#noncharacters I found the table on page 30 here: http://www.unicode.org/versions/Unicode10.0.0/ch02.pdf#G25564 very useful. That helped to clarify my thinking. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From greg.ewing at canterbury.ac.nz Wed Jul 19 01:51:49 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 17:51:49 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> Message-ID: Chris Angelico wrote: > Once you NFC or NFD normalize both strings, identical strings will > generally have identical codepoints... You should then be able to use normal regular expressions to > match correctly. Except that if you want to match a set of characters, you can't reliably use [...], you would have to write them out as alternatives in case some of them take up more than one code point. -- Greg From tomuxiong at gmx.com Wed Jul 19 02:37:01 2017 From: tomuxiong at gmx.com (Thomas Nyberg) Date: Wed, 19 Jul 2017 08:37:01 +0200 Subject: Users of namedtuple: do you use the _source attribute? In-Reply-To: <596ece02$0$1595$c3e8da3$5496439d@news.astraweb.com> References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> <596e045b$0$1595$c3e8da3$5496439d@news.astraweb.com> <596ece02$0$1595$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 07/19/2017 05:12 AM, Steve D'Aprano wrote: > On Wed, 19 Jul 2017 08:39 am, Gregory Ewing wrote: > Um... 
well, people want to do all sorts of wild and wacky things... but why > would you define a named tuple with *private* fields? Especially since that > privateness isn't enforced when you access the items by position. Maybe the user wants to match a naming convention that already exists? I am doing this in code I'm writing at the moment. I'm not using namedtuples, but if I were it would be nice if I could match the conventions from earlier. > In any case, the namedtuple API prohibits that, so it isn't an option. Of course the API could have been different. I'm not saying I think that private fields should be allowed, but there certainly are valid use cases. From steve at pearwood.info Wed Jul 19 02:49:13 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 19 Jul 2017 06:49:13 GMT Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> Message-ID: <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> On Wed, 19 Jul 2017 17:51:49 +1200, Gregory Ewing wrote: > Chris Angelico wrote: >> Once you NFC or NFD normalize both strings, identical strings will >> generally have identical codepoints... You should then be able to use >> normal regular expressions to match correctly. > > Except that if you want to match a set of characters, > you can't reliably use [...], you would have to write them out as > alternatives in case some of them take up more than one code point. Good point! A quibble -- there's no "in case" here, since you, the programmer, will always know whether they have a single code point form or not. If you're unsure, look it up, or call unicodedata.normalize(). 
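Writing the set out as alternatives, as Greg describes, can be sketched like
this. The helper name `one_of` is hypothetical; it NFC-normalises each
user-perceived character and joins them into a non-capturing group, longest
first, so that a base-plus-combining sequence is tried before its bare base
character.

```python
import re
import unicodedata

def one_of(characters):
    """Build a regex group matching any one of the given user-perceived characters."""
    forms = {unicodedata.normalize("NFC", ch) for ch in characters}
    # Longest alternatives first, then alphabetical for a stable result.
    ordered = sorted(forms, key=lambda s: (-len(s), s))
    return "(?:" + "|".join(re.escape(s) for s in ordered) + ")"

q_diaeresis = "q\u0308"   # q + COMBINING DIAERESIS; no single-code-point form exists

# A character class sees two separate code points, so this fails to match.
print(re.match("^[q" + q_diaeresis + "]$", q_diaeresis))
# The alternation treats the two-code-point sequence as one unit.
print(re.match("^" + one_of(["q", q_diaeresis]) + "$", q_diaeresis) is not None)
```

The text being searched would need the same normalisation applied for the
comparison to be reliable.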
(Yeah, right, like the average coder will remember to do this...) Nevertheless, although it might be annoying and tricky, regexes *are* flexible enough to deal with this problem. After all, you can't use [th] to match "th" as a unit either, and regex character set notation [abcd] is logically equivalent to (a|b|c|d). I wonder how Perl 6 has solved this problem? They seem to be much more advanced when it comes to dealing with Unicode. The *really* tricky part is if you receive a string from the user intended as a regular expression. If they provide [xyzã] as part of a regex, and you receive ã in denormalized form U+0061 LATIN SMALL LETTER A + U+0303 COMBINING TILDE you can't be sure that they actually intended: U+00E3 LATIN SMALL LETTER A WITH TILDE maybe they're smarter than you think and they actually do mean [xyza\N{COMBINING TILDE}] = (x|y|z|a|\N{COMBINING TILDE}) -- Steve From greg.ewing at canterbury.ac.nz Wed Jul 19 03:56:49 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 19:56:49 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> Message-ID: Grant Edwards wrote: > vacuum, continuum, squush, fortuuned Fortuuned? Where did you find that? Google gives me a bizarre set of results, none of which appear to be an English dictionary definition.
-- Greg From greg.ewing at canterbury.ac.nz Wed Jul 19 03:57:03 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 19:57:03 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87vampu18g.fsf@elektro.pacujo.net> References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> <87vampu18g.fsf@elektro.pacujo.net> Message-ID: Marko Rauhamaa wrote: >> * the final consonant of a single-syllable word is doubled only if the >> consonant is "k", "l" or "s" ("kick", "kill", "kiss") > > ... or "f" ("stiff") or "z" ("buzz") or sometimes "r" ("burr"), or "t" ("butt"). -- Greg From greg.ewing at canterbury.ac.nz Wed Jul 19 03:57:07 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 19:57:07 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <8760epvijl.fsf@elektro.pacujo.net> References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> Message-ID: Marko Rauhamaa wrote: > > * "v" is never doubled ("shovel") Except for all the words that Grant listed before. > > * a final "v" receives a superfluous "e" ("love") It's not superfluous there, it's preventing "love" from looking like it should rhyme with "of". (Of course you just have to know that it also doesn't rhyme with "hove".) > * a final consonant is never doubled in a multisyllable word > ("havoc", "shovel") Sometimes it is ("recall", "refill"). 
-- Greg From greg.ewing at canterbury.ac.nz Wed Jul 19 03:57:13 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Wed, 19 Jul 2017 19:57:13 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87vampegw6.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <87zic1u1bd.fsf@elektro.pacujo.net> <87vampegw6.fsf@elektro.pacujo.net> Message-ID: Marko Rauhamaa wrote: > For all we know, someone somewhere might be cooking up a language that > depends on "q̈". It makes perfectly good sense to me. It's the second derivative of q with respect to time. -- Greg From ganesh1pal at gmail.com Wed Jul 19 03:57:22 2017 From: ganesh1pal at gmail.com (Ganesh Pal) Date: Wed, 19 Jul 2017 13:27:22 +0530 Subject: Best way to assert unit test cases with many conditions In-Reply-To: <2837f5d4cbdf43e7a218fd53f620ab1a@F5.com> References: <2837f5d4cbdf43e7a218fd53f620ab1a@F5.com> Message-ID: On Tue, Jul 18, 2017 at 11:02 PM, Dan Strohl wrote:
>
> Like this:
>
> def test_this(self):
>     for i in range(10):
>         with self.subTest('test number %s' % i):
>             self.assertTrue(i <= 5)
>
> With the subTest() method, if anything within that subTest fails, it won't
> stop the process and will continue with the next step.
>

Thanks for reading my email, and yes, you got it right: I am adding a bunch of subtests that are all similar and differ only in their parameters. But I can't use the loop that you have mentioned, because I want to achieve (1) and (2).

(1) I would want my subtests to have a *condition* under which my entire test passes if any of the subtests passed. Example:

    def test_this(self):
        # If sub_test_1() passes, PASS the complete test, i.e. test_this();
        # if it fails, run the further subtests.
        if sub_test_1():
            return
        # If sub_test_2() passes, PASS test_this() and don't run the next
        # subtests, i.e. sub_test_3(), sub_test_4(), etc.
        elif sub_test_2():
            return
        # If sub_test_3() passes, PASS test_this() and don't run the next
        # subtests, i.e. sub_test_4(), sub_test_5(), etc.
        elif sub_test_3():
            return

    def test_this_1(self):
        # Same pattern: the first passing subtest passes test_this_1().
        if sub_test_1():
            return
        elif sub_test_2():
            return
        elif sub_test_3():
            return

    def test_this_2(self):
        # Same pattern: the first passing subtest passes test_this_2().
        if sub_test_1():
            return
        elif sub_test_2():
            return
        elif sub_test_3():
            return

(1) In general, I don't want to fail the test if any subtest fails, but continue with the next subtest and PASS the test if any one of them passes.

(2) Also, I wanted to know if it's OK to warn and continue instead of failing; I wanted to see if there is an assert that warns:

    def test_corruption1(self):
        """Run test no 1."""
        # self.assertTrue(library.log_message_is_reported(self.report, self.blocks['test01']))
        if not library.log_message_is_reported(self.report,
                                               self.blocks['test01']):
            print("Warning: Reporting Failed.... \n")

        if not library.is_corruption_fixed():
            print("Warning: Corruption is not fixed .... \n")

        if not library.is_corruption_reparied():
            assert False, "Corruption not reported,fixed and auto repaired.\n"

Let me know if it isn't clear; I can give you more examples. Thanks for responding.

Regards, Ganesh From rosuav at gmail.com Wed Jul 19 04:08:46 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 18:08:46 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: On Wed, Jul 19, 2017 at 4:49 PM, Steven D'Aprano wrote: > The *really* tricky part is if you receive a string from the user > intended as a regular expression. If they provide > > [xyzã] > > as part of a regex, and you receive ã in denormalized form > > U+0061 LATIN SMALL LETTER A + U+0303 COMBINING TILDE > > you can't be sure that they actually intended: > > U+00E3 LATIN SMALL LETTER A WITH TILDE > > maybe they're smarter than you think and they actually do mean > > [xyza\N{COMBINING TILDE}] = (x|y|z|a|\N{COMBINING TILDE}) To be quite honest, I wouldn't care about that possibility. If I could design regex semantics purely from an idealistic POV, I would say that [xyzã], regardless of its encoding, will match any of the four characters "x", "y", "z", "ã". Earlier I posted a suggestion that a folding function be used when searching (for instance, it can case fold, NFKC normalize, etc).
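A minimal sketch of such a folding function, assuming only the standard `unicodedata` module; the helper name `fold` is hypothetical and not an existing API. It NFKC-normalizes and then case folds, and it also shows that the folded text can have a different length from the original:

```python
import unicodedata


def fold(text):
    """Hypothetical search-folding helper: NFKC normalize, then case fold."""
    return unicodedata.normalize("NFKC", text).casefold()


decomposed = "Cafe\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = "caf\u00e9"     # precomposed U+00E9

# Both spellings fold to the same search key.
print(fold(decomposed) == fold(composed))      # True

# The folded key is shorter than the original string.
print(len(decomposed), len(fold(decomposed)))  # 5 4
```

A more careful version might normalize once more after case folding, since case folding can itself produce text that is no longer normalized.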
Unfortunately, this makes positional matching extremely tricky; if normalization changes the number of code points in the string, you have some fiddly work to do to try to find back the match location in the original (pre-folding) string. That technique works well for simple lookups (eg "find me all documents whose titles contain "), but a regex does more than that. As such, I am in favour of the regex engine defining a "character" as a base with all subsequent combining, so a single dot will match the entire combined character, and square bracketed expressions have the same meaning whether you're NFC or NFD normalized, or not normalized. However, that's the ideal situation, and I'm not sure (a) whether it's even practical to do that, and (b) how bad it would be in terms of backward compatibility. ChrisA From steve at pearwood.info Wed Jul 19 04:17:52 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 19 Jul 2017 08:17:52 GMT Subject: Grapheme clusters, a.k.a.real characters References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <75019890-592b-1609-a11b-3ce4b6fab52a@kynesim.co.uk> Message-ID: <596f15b0$0$2878$c3e8da3$76491128@news.astraweb.com> On Tue, 18 Jul 2017 16:37:37 +0100, Rhodri James wrote: > (For the record, one of my grandmothers would have been baffled by this > conversation, and the other one would have had definite opinions on > whether accents were distinct characters or not, followed by a > digression into whether "?" and "?" should be suppressed vigorously :-) Can I ask what nationality your grandmother was, given that she had an opinion on the suppression of ? and ?. And was she for it or against it? 
-- Steve From marko at pacujo.net Wed Jul 19 04:20:30 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Wed, 19 Jul 2017 11:20:30 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> <87vampu18g.fsf@elektro.pacujo.net> Message-ID: <87bmogbzxt.fsf@elektro.pacujo.net> Gregory Ewing : > Marko Rauhamaa wrote: >>> * the final consonant of a single-syllable word is doubled only if the >>> consonant is "k", "l" or "s" ("kick", "kill", "kiss") >> >> ... or "f" ("stiff") or "z" ("buzz") > > or sometimes "r" ("burr"), or "t" ("butt"). Those are exceptions. The ortographic principles are being forgotten. Hence: rev - revved (instead of *reve - *reved) savvy (instead of *savy) fulfill (Am., instead of fulfil) wellbeing (instead of *welbeing) Marko From steve at pearwood.info Wed Jul 19 04:28:56 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 19 Jul 2017 08:28:56 GMT Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <1500387099.4041269.1044669328.2C241A9E@webmail.messagingengine.com> Message-ID: <596f1848$0$2878$c3e8da3$76491128@news.astraweb.com> On Tue, 18 Jul 2017 10:11:39 -0400, Random832 wrote: > On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote: >> Consider, for example, a Python source code >> editor where you want to limit the length of the line based on the >> number of characters more typically than based on the number of pixels. > > Even there you need to go based on the width in character cells. Most > characters for East Asian languages occupy two character cells. 
> > It would be nice if there was an easy way to get str.format to use this > width instead of the length in code points for the purpose of padding. You could always put in a feature request :-) Alternatively, you could propose yet another formatting function that groks the difference between narrow width and full width characters. The reason I suggest that is that I expect it will probably be easier to implement one yourself in pure Python, and add it to the string module, than to convince somebody else to modify str.format() which is in C. :-) -- Steve From marko at pacujo.net Wed Jul 19 04:29:58 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Wed, 19 Jul 2017 11:29:58 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> Message-ID: <877ez4bzi1.fsf@elektro.pacujo.net> Gregory Ewing : > Marko Rauhamaa wrote: >> * a final "v" receives a superfluous "e" ("love") > > It's not superfluous there, it's preventing "love" from looking like > it should rhyme with "of". I'm pretty sure that wasn't the original motivation. If I had to guess, the reason was the possible visual confusion with "w". An interesting tidbit is that the English spelling demonstrates how the [o] sound regularly shifted into an [?] 
sound in front of nasals and "v": dove love hover cover shove above sponge come among front done son monk monkey Again, exceptions abound: on wrong song gone long Marko From marko at pacujo.net Wed Jul 19 05:53:14 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Wed, 19 Jul 2017 12:53:14 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: <87379sbvn9.fsf@elektro.pacujo.net> Chris Angelico : > To be quite honest, I wouldn't care about that possibility. If I could > design regex semantics purely from an idealistic POV, I would say that > [xyz?], regardless of its encoding, will match any of the four > characters "x", "y", "z", "?". > > Earlier I posted a suggestion that a folding function be used when > searching (for instance, it can case fold, NFKC normalize, etc). > Unfortunately, this makes positional matching extremely tricky; if > normalization changes the number of code points in the string, you > have some fiddly work to do to try to find back the match location in > the original (pre-folding) string. That technique works well for > simple lookups (eg "find me all documents whose titles contain string>"), but a regex does more than that. As such, I am in favour of > the regex engine defining a "character" as a base with all subsequent > combining, so a single dot will match the entire combined character, > and square bracketed expressions have the same meaning whether you're > NFC or NFD normalized, or not normalized. However, that's the ideal > situation, and I'm not sure (a) whether it's even practical to do > that, and (b) how bad it would be in terms of backward compatibility. 
Here's a proposal:

 * introduce a builtin (predefined) class Text

 * conceptually, a Text object is a sequence of "real" characters

 * you can access each "real" character by its position in O(1)

 * the "real" character is defined to be an integer computed as follows (in pseudo-Python):

       string = the NFC normal form of the real character as a string
       rc = 0
       shift = 0
       for codepoint in string:
           rc |= ord(codepoint) << shift
           shift += 6
       return rc

 * t[n] evaluates to an integer

 * the Text constructor takes a string or an integer

 * str(Text) evaluates to the NFC encoding of the Text object

 * Text.encode(...) works like str(Text).encode(...)

 * regular expressions work with Text objects

 * file system functions work with Text objects

Instead of introducing Text, all of this could also be done within the str class itself:

 * conceptually, an str object is a sequence of integers representing Unicode code points *or* "real" characters

 * ord(s) returns the code point or the integer (rc) from the algorithm above

 * chr(n) takes a valid code point or an rc value as defined above

 * s.canonical() returns a string that has merged all multi-code-point characters into single "real" characters

Each approach has its upsides and downsides.

Marko From ganesh1pal at gmail.com Wed Jul 19 06:26:32 2017 From: ganesh1pal at gmail.com (Ganesh Pal) Date: Wed, 19 Jul 2017 15:56:32 +0530 Subject: Best way to assert unit test cases with many conditions In-Reply-To: References: Message-ID: > > Yes. Just assert each thing as it needs asserting. > >

Asserting each subtest will fail the entire test; I want the test to pass if any of the subtests passes. If a subtest fails, try all the cases and fail only on the last one. Example:

    def test_this(self):
        # If sub_test_1() passes, PASS the complete test, i.e. test_this();
        # if it fails, run the further subtests.
        if sub_test_1():
            return
        # If sub_test_2() passes, PASS test_this() and don't run the next
        # subtests, i.e. sub_test_3(), sub_test_4(), etc.
        elif sub_test_2():
            return
        # If sub_test_3() passes, PASS test_this() and don't run the next
        # subtests, i.e. sub_test_4(), sub_test_5(), etc.
        elif sub_test_3():
            return

Regards, Ganesh From rhodri at kynesim.co.uk Wed Jul 19 06:29:05 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Wed, 19 Jul 2017 11:29:05 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596f15b0$0$2878$c3e8da3$76491128@news.astraweb.com> References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <75019890-592b-1609-a11b-3ce4b6fab52a@kynesim.co.uk> <596f15b0$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: <2ec7784b-7444-caf0-4006-135bb9427513@kynesim.co.uk> On 19/07/17 09:17, Steven D'Aprano wrote: > On Tue, 18 Jul 2017 16:37:37 +0100, Rhodri James wrote: > >> (For the record, one of my grandmothers would have been baffled by this >> conversation, and the other one would have had definite opinions on >> whether accents were distinct characters or not, followed by a >> digression into whether "?" and "?" should be suppressed vigorously :-) > > > Can I ask what nationality your grandmother was, given that she had an > opinion on the suppression of ? and ?. > > And was she for it or against it? She was a Welsh schoolteacher who went to Australia with her husband. As to what her opinion on ? and ? was, I'm afraid I don't know. It does seem to be one of those things that divides Welsh-speakers; when Acorn were developing their version of extended ASCII in the late 80s, they asked three different University lecturers in Welsh what extra characters they needed, and got three different answers.
-- Rhodri James *-* Kynesim Ltd From rosuav at gmail.com Wed Jul 19 06:59:39 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 20:59:39 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87379sbvn9.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> Message-ID: On Wed, Jul 19, 2017 at 7:53 PM, Marko Rauhamaa wrote: > Here's a proposal: > > * introduce a building (predefined) class Text > > * conceptually, a Text object is a sequence of "real" characters > > * you can access each "real" character by its position in O(1) > > * the "real" character is defined to be a integer computed as follows > (in pseudo-Python): > > string = the NFC normal form of the real character as a string > rc = 0 > shift = 0 > for codepoint in string: > rc |= ord(codepoing) << shift > shift += 6 > return rc > > * t[n] evaluates to an integer A string could consist of 1 base character and N-1 combining characters. Can you still access those combined characters in constant time? 
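For reference, the quoted pseudo-Python can be made runnable roughly as follows. This is only a sketch under stated assumptions: the function name `real_char_value` is hypothetical, and the per-code-point width is taken to be 21 bits rather than the 6 in the pseudo-code, since a shift of 6 would let code points overlap. The thread does not pin down the exact packing.

```python
import unicodedata

# Assumption: 21 bits is wide enough for any code point up to U+10FFFF.
CODEPOINT_BITS = 21


def real_char_value(cluster):
    """Pack the NFC form of one "real" character into a single integer."""
    nfc = unicodedata.normalize("NFC", cluster)
    rc = 0
    shift = 0
    for codepoint in nfc:
        rc |= ord(codepoint) << shift
        shift += CODEPOINT_BITS
    return rc


print(real_char_value("a"))        # 97
print(real_char_value("a\u0303"))  # 227, because NFC composes this to U+00E3

# q + COMBINING DIAERESIS has no precomposed form, so two code points
# are packed and the integer needs more bits.
print(real_char_value("q\u0308").bit_length())  # 31

# Every further combining mark adds another 21 bits to the value.
print(real_char_value("q" + "\u0308" * 4).bit_length())
```

The last two lines illustrate the concern behind the question: the integer for a single "real" character has no fixed upper bound, so it cannot live in a fixed-width slot.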
ChrisA From marko at pacujo.net Wed Jul 19 08:13:33 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Wed, 19 Jul 2017 15:13:33 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> Message-ID: <87y3rkaaky.fsf@elektro.pacujo.net> Chris Angelico : > On Wed, Jul 19, 2017 at 7:53 PM, Marko Rauhamaa wrote: >> Here's a proposal: >> >> * introduce a building (predefined) class Text >> >> * conceptually, a Text object is a sequence of "real" characters >> >> * you can access each "real" character by its position in O(1) >> >> * the "real" character is defined to be a integer computed as follows >> (in pseudo-Python): >> >> string = the NFC normal form of the real character as a string >> rc = 0 >> shift = 0 >> for codepoint in string: >> rc |= ord(codepoing) << shift >> shift += 6 >> return rc >> >> * t[n] evaluates to an integer > > A string could consist of 1 base character and N-1 combining > characters. Can you still access those combined characters in constant > time? Yes. Marko From __peter__ at web.de Wed Jul 19 08:24:11 2017 From: __peter__ at web.de (Peter Otten) Date: Wed, 19 Jul 2017 14:24:11 +0200 Subject: Best way to assert unit test cases with many conditions References: <2837f5d4cbdf43e7a218fd53f620ab1a@F5.com> Message-ID: Ganesh Pal wrote: > On Tue, Jul 18, 2017 at 11:02 PM, Dan Strohl wrote: > >> >> Like this: >> >> Def test_this(self): >> For i in range(10): >> with self.subTest('test number %s) % i): >> self.assertTrue(I <= 5) >> >> With the subTest() method, if anything within that subTest fails, it >> won't stop the process and will continue with the next step. 
> Thanks for reading my email and yes you got it right , I am adding bunch of
> same subtest and all are similar and sub test that change only differ in
> parameter.
> But I can't use the loop that you have mentioned because I want to achieve
> (1) and (2)
> (1) I would want my subtest to have a *Condition* based on which it that
> would pass my entire test if any of the sub-test passed.

Your spec translates to something like:

$ cat stop_on_first_success.py
import logging
import unittest
import sys

log = logging.getLogger()

class T(unittest.TestCase):
    def test_foo(self):
        subtests = sorted(
            name for name in dir(self) if name.startswith("subtest_foo_")
        )
        for name in subtests:
            method = getattr(self, name)
            try:
                method()
            except Exception as err:
                log.error(err)
            else:
                break
        else:
            self.fail("no successful subtest")

    def subtest_foo_01_int(self):
        self.assertTrue(isinstance(x, int))

    def subtest_foo_02_42(self):
        self.assertEqual(x, 42)

    def subtest_foo_03_upper(self):
        self.assertEqual(x.upper(), x)

if __name__ == "__main__":
    logging.basicConfig()
    x = sys.argv.pop(1)
    x = eval(x)
    print("Running tests with x = {!r}".format(x))
    unittest.main()

The x = eval() part is only for demonstration purposes. Below's the script output for various incantations. The subtests are executed in alphabetical order of the subtest_foo_xxx method names, failures are logged, and the loop stops after the first success.

$ python3 stop_on_first_success.py '"foo"'
Running tests with x = 'foo'
ERROR:root:False is not true
ERROR:root:'foo' != 42
ERROR:root:'FOO' != 'foo'
- FOO
+ foo

F
======================================================================
FAIL: test_foo (__main__.T)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "stop_on_first_success.py", line 22, in test_foo
    self.fail("no successful subtest")
AssertionError: no successful subtest

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

$ python3 stop_on_first_success.py '"FOO"'
Running tests with x = 'FOO'
ERROR:root:False is not true
ERROR:root:'FOO' != 42
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK

$ python3 stop_on_first_success.py '42'
Running tests with x = 42
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

$ python3 stop_on_first_success.py '42.'
Running tests with x = 42.0
ERROR:root:False is not true
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK

However, for my taste such a test is both too complex and too vague. If you have code that tries to achieve something in different ways then put these attempts into functions that you can test individually with specific data that causes them to succeed or fail.
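As a sketch of that last suggestion, here is one way the "try several approaches" logic could be split into individually testable functions. Every name below (`parse_as_int`, `parse_as_hex`, `parse_number`) is hypothetical and stands in for whatever the real code attempts; nothing here comes from the code under discussion.

```python
import unittest


def parse_as_int(text):
    """Hypothetical attempt 1: accept plain decimal integers."""
    return int(text)


def parse_as_hex(text):
    """Hypothetical attempt 2: accept hexadecimal such as '0x2a'."""
    if not text.lower().startswith("0x"):
        raise ValueError("not a hex literal: {!r}".format(text))
    return int(text, 16)


def parse_number(text):
    """Try each attempt in turn and return the first success."""
    for attempt in (parse_as_int, parse_as_hex):
        try:
            return attempt(text)
        except ValueError:
            continue
    raise ValueError("no attempt could parse {!r}".format(text))


class ParseTests(unittest.TestCase):
    # Each attempt gets data that makes it succeed and data that makes it fail.
    def test_int_attempt_succeeds(self):
        self.assertEqual(parse_as_int("42"), 42)

    def test_int_attempt_fails_on_hex(self):
        with self.assertRaises(ValueError):
            parse_as_int("0x2a")

    def test_hex_attempt_succeeds(self):
        self.assertEqual(parse_as_hex("0x2a"), 42)

    def test_hex_attempt_fails_on_decimal(self):
        with self.assertRaises(ValueError):
            parse_as_hex("42")

    # The combined function is tested separately for its fallback behaviour.
    def test_combined_falls_back(self):
        self.assertEqual(parse_number("0x2a"), 42)

    def test_combined_fails_when_all_fail(self):
        with self.assertRaises(ValueError):
            parse_number("forty-two")
```

Saved in a module, these tests can be run with `python3 -m unittest`. Each test then asserts one specific outcome, so a failure points at one attempt rather than at "some subtest".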
From rosuav at gmail.com Wed Jul 19 08:56:49 2017 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 19 Jul 2017 22:56:49 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87y3rkaaky.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> Message-ID: On Wed, Jul 19, 2017 at 10:13 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Wed, Jul 19, 2017 at 7:53 PM, Marko Rauhamaa wrote: >>> Here's a proposal: >>> >>> * introduce a building (predefined) class Text >>> >>> * conceptually, a Text object is a sequence of "real" characters >>> >>> * you can access each "real" character by its position in O(1) >>> >>> * the "real" character is defined to be a integer computed as follows >>> (in pseudo-Python): >>> >>> string = the NFC normal form of the real character as a string >>> rc = 0 >>> shift = 0 >>> for codepoint in string: >>> rc |= ord(codepoing) << shift >>> shift += 6 >>> return rc >>> >>> * t[n] evaluates to an integer >> >> A string could consist of 1 base character and N-1 combining >> characters. Can you still access those combined characters in constant >> time? > > Yes. Perhaps we don't have the same understanding of "constant time". Or are you saying that you actually store and represent this as those arbitrary-precision integers? Every character of every string has to be a multiprecision integer? ChrisA From larry.martell at gmail.com Wed Jul 19 09:19:54 2017 From: larry.martell at gmail.com (Larry Martell) Date: Wed, 19 Jul 2017 09:19:54 -0400 Subject: rpy2 Message-ID: Anyone here any experience with the rpy2 package? 
I am having trouble getting it to install, and I have posted to the rpy mailing list, put a question on SO, and even emailed the author, but I have received no replies. Before I post details I wanted to see if anyone here can possibly help me. From marko at pacujo.net Wed Jul 19 09:42:39 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Wed, 19 Jul 2017 16:42:39 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> Message-ID: <87shhsa6gg.fsf@elektro.pacujo.net> Chris Angelico : > Perhaps we don't have the same understanding of "constant time". Or > are you saying that you actually store and represent this as those > arbitrary-precision integers? Every character of every string has to > be a multiprecision integer? Yes, although feel free to optimize. The internal implementation isn't important but those "multiprecision" integers are part of an outward interface. So you could have:

    >>> for c in Text("aq̈u \U0001F64B\U0001F3FF\u200D\u2642\uFE0F"):
    ...     print(c)
    ...
    97
    1895826184
    117
    32
    5152920508016097895476141586773579

(Note, though, that Python3 only has integers, there's no "multiprecision" about them.) Marko From grant.b.edwards at gmail.com Wed Jul 19 10:08:33 2017 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Wed, 19 Jul 2017 14:08:33 +0000 (UTC) Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> Message-ID: On 2017-07-19, Gregory Ewing wrote: > Grant Edwards wrote: >> vacuum, continuum, squush, fortuuned > > Fortuuned? Where did you find that?
It was in the scowl-7.1 wordlist I had laying around: http://wordlist.aspell.net/ However, the scowl website now claims not to know about it: http://app.aspell.net/lookup?dict=en_US;words=fortuuned > Google gives me a bizarre set of results, none of which > appear to be an English dictionary definition. Maybe it was a mistaken spelling of 'fortuned'? -- Grant Edwards grant.b.edwards Yow! PEGGY FLEMMING is at stealing BASKET BALLS to gmail.com feed the babies in VERMONT. From marko at pacujo.net Wed Jul 19 10:19:39 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Wed, 19 Jul 2017 17:19:39 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> Message-ID: <878tjka4qs.fsf@elektro.pacujo.net> Grant Edwards : > On 2017-07-19, Gregory Ewing wrote: >> Grant Edwards wrote: >>> vacuum, continuum, squush, fortuuned >> >> Fortuuned? Where did you find that? > > It was in the scowl-7.1 wordlist I had laying around: > > http://wordlist.aspell.net/ > > However, the scowl website now claims not to know about it: > > http://app.aspell.net/lookup?dict=en_US;words=fortuuned Finnish is well-endowed in that respect: suu puu kuu huuli tuuli uusi uuni kaipuu ... In a great irony of fate, though, Finnish completely lacks the "w"! Marko From felipe.bastosn at gmail.com Wed Jul 19 10:49:35 2017 From: felipe.bastosn at gmail.com (Felipe Bastos Nunes) Date: Wed, 19 Jul 2017 14:49:35 +0000 Subject: rpy2 In-Reply-To: References: Message-ID: Hey, I have no experience on it, but maybe I'm able to help. How are you tryin to install it? Pip? 2 or 3? Virtualenv? Cya! Em qua, 19 de jul de 2017 10:22, Larry Martell escreveu: > Anyone here any experience with the rpy2 package? I am having trouble > getting it to install, and I have posted to the rpy mailing list, put > a question on SO, and even emailed the author, but I have received no > replies. Before I post details I wanted to see if anyone here can > possibly help me. 
> -- > https://mail.python.org/mailman/listinfo/python-list > From felipe.bastosn at gmail.com Wed Jul 19 10:54:03 2017 From: felipe.bastosn at gmail.com (Felipe Bastos Nunes) Date: Wed, 19 Jul 2017 14:54:03 +0000 Subject: Problem in installing module "pynamical" In-Reply-To: References: Message-ID: Hi! I have no experience on that, but I may ask something that might help: 1- are you aware that it's not safe to use pip directly? It could be safer to use pip install --user package. 2- Are you able to use virtualenv? This usually let me install packages my system has problems on working out. Cya! Em ter, 18 de jul de 2017 19:00, Saikat Chakraborty escreveu: > I am using PyCharm Community Edition 2017 with interpreter python 3.6.1. > I want to install pynamical module. > But it is showing error. I am posting the error message: > > E:\untitled>pip install pynamical > > FileNotFoundError: [WinError 2] The system cannot find the file specified > error: command > > 'c:\\users\\s.chakraborty\\appdata\\local\\programs\\python\\python36\\python.exe' > failed with exit status 1 > > Please give me a solutioin. > > Thanking you. 
> -- > With Regards > Saikat Chakraborty > (Doctoral Research Scholar) > *Computer Science & Engineering Dept.* > * NIT Rourkela,Rourkela,Orissa, India* > -- > https://mail.python.org/mailman/listinfo/python-list > From rosuav at gmail.com Wed Jul 19 11:02:29 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 20 Jul 2017 01:02:29 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87shhsa6gg.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> Message-ID: On Wed, Jul 19, 2017 at 11:42 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> Perhaps we don't have the same understanding of "constant time". Or >> are you saying that you actually store and represent this as those >> arbitrary-precision integers? Every character of every string has to >> be a multiprecision integer? > > Yes, although feel free to optimize. The internal implementation isn't > important but those "multiprecision" integers are part of an outward > interface. So you could have: > > >>> for c in Text("aq?u \U0001F64B\U0001F3FF\u200D\u2642\uFE0F"): > ... print(c) > ... > 97 > 1895826184 > 117 > 32 > 5152920508016097895476141586773579 > > (Note, though, that Python3 only has integers, there's no > "multiprecision" about them.) I said "multiprecision" because that's what the low-level arbitrary-precision-integer library calls them - GNU Multiprecision Integer [1]. Somehow you're going to have to store those in an indexable way, and since you can't fit arbitrary-precision data into fixed-width slots, O(1) addressing is going to require external storage. Basically, you're representing a string as if it were a tuple of integers. 
That makes fine sense semantically, but it's pretty costly:

>>> sys.getsizeof("hello, world")
61
>>> sys.getsizeof(tuple("hello, world"))
144

That's just for the tuple itself; then you need to represent the actual numbers. Each one will require an addressable memory allocation, which basically means a minimum of 8 bytes (on my 64-bit system; you'd save a bit on a 32-bit Python, but most people don't use those now). So every character in your string requires an 8-byte pointer plus an 8-byte value. In contrast, current CPython requires *at most* four bytes per character, and for many strings, requires only one byte per character - possible only because the data is kept as an array, not as external references.

Now, this is a performance question, and it's not unreasonable to talk about semantics first and let performance wait for later. But when you consider how many ASCII-only strings Python uses internally (the names of basically every global function and every attribute in every stdlib module), and how you'll be enlarging those by a factor of 16 *and* making every character lookup require two pointer reads, it's pretty much a non-starter.

There MIGHT be something to be done using a sentinel value that represents "the actual value is somewhere else". However, I'm not sure how you could do that cleanly in a one-byte-per-char string other than maintaining some external table. So here's the best I can come up with for efficiency - and it suffers horrendously from complexity:

* Strings with all codepoints < 256 are represented as they currently are (one byte per char). There are no combining characters in the first 256 codepoints anyway.
* Strings with all codepoints < 65536 and no combining characters, ditto (two bytes per char).
* Strings with any combining characters in them are stored in four bytes per char even if all codepoints are <65536.
* Any time a character consists of a single base with no combining, it is stored in UTF-32.
* Combined characters are stored in the primary array as 0x80000000 plus the index into a secondary array where these values are stored.
* The secondary array has a pointer for each combined character (ignoring single-code-point characters), probably to a Python integer object for simplicity.

This scheme allows a maximum of two billion combined characters in any string. Worst case, "a\u0303"*0x80000000 is a four billion character string that simply can't be represented; but long before that, you'll run out of space to allocate all those large integers. (Current CPython represents that string in 8GB of memory. Enough to push me into the swapper - I have only 16GB in this system and a lot of it is in use - but nothing I can't handle.) It also has the advantage that most strings won't change in representation. However, the complexity is insane; I don't want to be the one to write all the unit tests to make sure everything behaves as advertised!

Also, this system has the nasty implication that the creation of a new combining character will fundamentally change the way a string behaves. That means that running a slightly older version of Python could potentially cause, not errors, but subtly different behaviour. With Python 2.7, 3.4, 3.5, 3.6, and 3.7, I have four different major Unicode versions, which means plenty of potential for newly-allocated codepoints in newer Pythons.
That's not usually a problem, as it only affects a few things in the unicodedata module:

rosuav@sikorsky:~$ python3.4 -c "import unicodedata; print(unicodedata.name('\U0001f917'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: no such name
rosuav@sikorsky:~$ python3.5 -c "import unicodedata; print(unicodedata.name('\U0001f917'))"
HUGGING FACE
rosuav@sikorsky:~$ python3.6 -c "import unicodedata; print(unicodedata.name('\u1df6'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: no such name
rosuav@sikorsky:~$ python3.7 -c "import unicodedata; print(unicodedata.name('\u1df6'))"
COMBINING KAVYKA ABOVE RIGHT

But if combining characters behave fundamentally differently to others, there would be a change in string representation when U+1DF6 became a combining character. That's going to cause MASSIVE upheaval. I don't think there's any solution to that, but if you can find one, do please elaborate.

ChrisA

[1] https://gmplib.org/

From larry.martell at gmail.com Wed Jul 19 11:15:03 2017
From: larry.martell at gmail.com (Larry Martell)
Date: Wed, 19 Jul 2017 11:15:03 -0400
Subject: rpy2
In-Reply-To:
References:
Message-ID:

On Wed, Jul 19, 2017 at 10:49 AM, Felipe Bastos Nunes wrote:
> Hey, I have no experience on it, but maybe I'm able to help. How are you
> tryin to install it? Pip? 2 or 3? Virtualenv?

python2.7, using pip, on Redhat 6, R version 3.3.3:

Collecting rpy2
  Using cached rpy2-2.8.6.tar.gz
Requirement already satisfied (use --upgrade to upgrade): six in /usr/local/lib/python2.7/site-packages (from rpy2)
Requirement already satisfied (use --upgrade to upgrade): singledispatch in /usr/local/lib/python2.7/site-packages (from rpy2)
Installing collected packages: rpy2
  Running setup.py install for rpy2 ...
error Complete output from command /usr/local/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-eFwk3n/rpy2/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-w5v7yg-record/install-record.txt --single-version-externally-managed --compile: R version 3.3.3 (2017-03-06) -- "Another Canoe" /usr/local/lib64/R/bin/R CMD config --ldflags /usr/local/lib64/R/bin/R CMD config --cppflags Compilation parameters for rpy2's C components: include_dirs = ['/usr/local/lib64/R/include'] library_dirs = ['/usr/local/lib64/R/lib'] libraries = ['Rblas', 'gfortran', 'm', 'readline', 'pcre', 'lzma', 'bz2', 'z', 'rt', 'dl', 'm'] extra_link_args = ['-Wl,--export-dynamic', '-fopenmp', '-Wl,--whole-archive', '/usr/local/lib64/R/lib/libR.a', '-Wl,--no-whole-archive'] running install running build running build_py creating build creating build/lib.linux-x86_64-2.7 creating build/lib.linux-x86_64-2.7/rpy2 copying ./rpy/__init__.py -> build/lib.linux-x86_64-2.7/rpy2 copying ./rpy/tests_rpy_classic.py -> build/lib.linux-x86_64-2.7/rpy2 creating build/temp.linux-x86_64-2.7/rpy creating build/temp.linux-x86_64-2.7/rpy/rinterface gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./rpy/rinterface -I/usr/local/lib64/R/include -c ./rpy/rinterface/r_utils.c -o build/temp.linux-x86_64-2.7/./rpy/rinterface/r_utils.o In file included from /usr/local/lib64/R/include/Rdefines.h:36, from ./rpy/rinterface/r_utils.c:23: /usr/local/lib64/R/include/R_ext/Memory.h:51: warning: function declaration isn?t a prototype In file included from /usr/local/lib64/R/include/Rdefines.h:40, from ./rpy/rinterface/r_utils.c:23: /usr/local/lib64/R/include/Rinternals.h:886: warning: function declaration isn?t a prototype ar rc build/temp.linux-x86_64-2.7/libr_utils.a build/temp.linux-x86_64-2.7/./rpy/rinterface/r_utils.o running build_ext R version 3.3.3 (2017-03-06) -- 
"Another Canoe" building 'rpy2.rinterface._rinterface' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DR_INTERFACE_PTRS=1 -DHAVE_POSIX_SIGJMP=1 -DRIF_HAS_RSIGHAND=1 -DCSTACK_DEFNS=1 -DHAS_READLINE=1 -I./rpy/rinterface -I/usr/local/lib64/R/include -I/usr/local/include/python2.7 -c ./rpy/rinterface/_rinterface.c -o build/temp.linux-x86_64-2.7/./rpy/rinterface/_rinterface.o In file included from /usr/local/include/python2.7/Python.h:8, from ./rpy/rinterface/_rinterface.c:49: /usr/local/include/python2.7/pyconfig.h:1188:1: warning: "_POSIX_C_SOURCE" redefined In file included from /usr/include/signal.h:29, from ./rpy/rinterface/_rinterface.c:45: /usr/include/features.h:213:1: warning: this is the location of the previous definition In file included from /usr/local/lib64/R/include/R.h:80, from ./rpy/rinterface/_rinterface.h:8, from ./rpy/rinterface/_rinterface.c:52: /usr/local/lib64/R/include/R_ext/Memory.h:51: warning: function declaration isn?t a prototype In file included from ./rpy/rinterface/_rinterface.h:9, from ./rpy/rinterface/_rinterface.c:52: /usr/local/lib64/R/include/Rinternals.h:886: warning: function declaration isn?t a prototype In file included from ./rpy/rinterface/_rinterface.c:64: /usr/local/lib64/R/include/Rinterface.h:149: warning: function declaration isn?t a prototype In file included from ./rpy/rinterface/_rinterface.c:73: /usr/local/lib64/R/include/R_ext/Rdynload.h:36: warning: function declaration isn?t a prototype In file included from ./rpy/rinterface/_rinterface.c:116: ./rpy/rinterface/embeddedr.c: In function ?SexpObject_clear?: ./rpy/rinterface/embeddedr.c:48: warning: unused variable ?res? In file included from ./rpy/rinterface/_rinterface.c:119: ./rpy/rinterface/sexp.c: In function ?Sexp_init?: ./rpy/rinterface/sexp.c:742: warning: unused variable ?copy? ./rpy/rinterface/_rinterface.c: At top level: ./rpy/rinterface/_rinterface.h:203: warning: ?PyRinterface_IsInitialized? 
declared ?static? but never defined ./rpy/rinterface/_rinterface.h:204: warning: ?PyRinterface_FindFun? declared ?static? but never defined ./rpy/rinterface/_rinterface.h:205: warning: ?embeddedR_isInitialized? defined but not used ./rpy/rinterface/sequence.c:2173: warning: ?ComplexVectorSexp_AsSexp? defined but not used ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_ShowFiles?: ./rpy/rinterface/_rinterface.c:831: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_ResetConsole?: ./rpy/rinterface/_rinterface.c:677: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_FlushConsole?: ./rpy/rinterface/_rinterface.c:643: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_ChooseFile?: ./rpy/rinterface/_rinterface.c:727: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_ReadConsole?: ./rpy/rinterface/_rinterface.c:498: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_WriteConsoleEx?: ./rpy/rinterface/_rinterface.c:339: warning: ?consolecallback? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c:354: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_ShowMessage?: ./rpy/rinterface/_rinterface.c:429: warning: ?gstate? may be used uninitialized in this function ./rpy/rinterface/_rinterface.c: In function ?EmbeddedR_CleanUp?: ./rpy/rinterface/_rinterface.c:979: warning: ?gstate? 
may be used uninitialized in this function gcc -pthread -shared build/temp.linux-x86_64-2.7/./rpy/rinterface/_rinterface.o -L/usr/local/lib64/R/lib -Lbuild/temp.linux-x86_64-2.7 -Wl,-R/usr/local/lib64/R/lib -lRblas -lgfortran -lm -lreadline -lpcre -llzma -lbz2 -lz -lrt -ldl -lm -lr_utils -o build/lib.linux-x86_64-2.7/rpy2/rinterface/_rinterface.so -Wl,--export-dynamic -fopenmp -Wl,--whole-archive /usr/local/lib64/R/lib/libR.a -Wl,--no-whole-archive /usr/bin/ld: /usr/local/lib64/R/lib/libR.a(CommandLineArgs.o): relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC /usr/local/lib64/R/lib/libR.a(CommandLineArgs.o): could not read symbols: Bad value collect2: ld returned 1 exit status error: command 'gcc' failed with exit status 1 ---------------------------------------- Command "/usr/local/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-eFwk3n/rpy2/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-w5v7yg-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-eFwk3n/rpy2/ Also tried building from source and that failed with: gcc -pthread -shared build/temp.linux-x86_64-2.7/./rpy/rinterface/_rinterface.o -L/usr/local/lib64/R/lib -Lbuild/temp.linux-x86_64-2.7 -Wl,-R/usr/local/lib64/R/lib -lRblas -lgfortran -lm -lreadline -lpcre -llzma -lbz2 -lz -lrt -ldl -lm -lr_utils -o build/lib.linux-x86_64-2.7/rpy2/rinterface/_rinterface.so -Wl,--export-dynamic -fopenmp -Wl,--whole-archive /usr/local/lib64/R/lib/libR.a -Wl,--no-whole-archive /usr/bin/ld: cannot find -lr_utils But I could not find how to get libr_utils. > Em qua, 19 de jul de 2017 10:22, Larry Martell > escreveu: >> >> Anyone here any experience with the rpy2 package? 
>> I am having trouble getting it to install, and I have posted to the
>> rpy mailing list, put a question on SO, and even emailed the author,
>> but I have received no replies. Before I post details I wanted to see
>> if anyone here can possibly help me.
>> --
>> https://mail.python.org/mailman/listinfo/python-list

From random832 at fastmail.com Wed Jul 19 11:27:17 2017
From: random832 at fastmail.com (Random832)
Date: Wed, 19 Jul 2017 11:27:17 -0400
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To:
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com>
Message-ID: <1500478037.1904126.1046003616.011E5CB9@webmail.messagingengine.com>

On Tue, Jul 18, 2017, at 19:21, Gregory Ewing wrote:
> Random832 wrote:
> > What about Emoji?
> > U+1F469 WOMAN is two columns wide on its own.
> > U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
>
> The term "emoji" is becoming rather strained these days.
> The idea of "woman" and "personal computer" being emotions
> is an interesting one...

Emoji comes from Japanese 絵文字 - 絵(E) picture, 文字(moji) character. It is not in fact etymologically related to the native English term "emoticon", which is no longer in common usage.
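
As an aside, the code points named in the quoted text can be inspected with Python's standard unicodedata module. This is only a small sketch: the variable names are made up for illustration, U+200D ZERO WIDTH JOINER is included on the assumption that it is the joiner used to combine the two emoji, and nothing here says how a terminal or font will render the result.

```python
import unicodedata

# WOMAN + ZERO WIDTH JOINER + PERSONAL COMPUTER, written as code points.
sequence = "\U0001F469\u200D\U0001F4BB"

# len() on a str counts code points, not grapheme clusters.
code_point_count = len(sequence)

# Look up the Unicode name of each code point in the sequence.
names = [unicodedata.name(ch) for ch in sequence]

for ch, name in zip(sequence, names):
    print("U+{:04X} {}".format(ord(ch), name))
print("code points:", code_point_count)
```

The point of the sketch is simply that str works at the code point level, so a sequence that a reader sees as one character is still several items to len() and to indexing.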
From random832 at fastmail.com Wed Jul 19 11:30:18 2017
From: random832 at fastmail.com (Random832)
Date: Wed, 19 Jul 2017 11:30:18 -0400
Subject: Grapheme clusters, a.k.a.real characters
In-Reply-To: <596ec8b1$0$1596$c3e8da3$5496439d@news.astraweb.com>
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> <596ec8b1$0$1596$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <1500478218.1904557.1046006840.7488E265@webmail.messagingengine.com>

On Tue, Jul 18, 2017, at 22:49, Steve D'Aprano wrote:
> > What about Emoji?
> > U+1F469 WOMAN is two columns wide on its own.
> > U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
> > U+200D ZERO WIDTH JOINER is zero columns wide on its own.
>
>
> What about them? In a monospaced font, they should follow the same rules I used
> above: either 0, 1 or 2 column wide.

You snipped the important part - the fact that the whole sequence of three code points U+1F469 U+200D U+1F4BB is a single grapheme cluster two columns wide.

You also ignored all of the other examples in my post. Did you even read anything beyond what you snipped?

From wegge at wegge.dk Wed Jul 19 11:35:28 2017
From: wegge at wegge.dk (Anders Wegge Keller)
Date: Wed, 19 Jul 2017 17:35:28 +0200
Subject: SIGSEGV and SIGILL inside PyCFunction_Call
Message-ID: <20170719173528.4ecc3f78@wegge.dk>

I have an ongoing issue with my usenet setup. I'm that one dude who doesn't want to learn perl. That means that I have to build inn from source, so I can enable the python interpreter. That's not so bad, and the errors that show up have been something that I have been able to figure out by myself. At least up until now.

I have an almost 100% repeatable crash when nnrpd performs the user authentication step.
Backtracing the core dum gives this: #0 0x0000564a864e2d63 in ?? () #1 0x00007f9609567091 in call_function (oparg=, pp_stack=0x7ffda2d801b0) at ../Python/ceval.c:4352 Note: Line 4352 C_TRACE(x, PyCFunction_Call(func,callargs,NULL)); #2 PyEval_EvalFrameEx ( f=Frame 0x7f9604758050, for file /etc/news/filter/nnrpd_auth.py, line 67, in __init__ (self= <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> Message-ID: <87bmoge8h7.fsf@elektro.pacujo.net> Chris Angelico : > Now, this is a performance question, and it's not unreasonable to talk > about semantics first and let performance wait for later. But when you > consider how many ASCII-only strings Python uses internally (the names > of basically every global function and every attribute in every stdlib > module), and how you'll be enlarging those by a factor of 16 *and* > making every character lookup require two pointer reads, it's pretty > much a non-starter. It's not that difficult or costly. > Also, this system has the nasty implication that the creation of a new > combining character will fundamentally change the way a string > behaves. If you go with a new Text class, you don't face any backward-compatibility issues. If you go with expanding str, you can run into some minor issues. > But if combining characters behave fundamentally differently to > others, there would be a change in string representation when U+1DF6 > became a combining character. That's going to cause MASSIVE upheaval. > I don't think there's any solution to that, but if you can find one, > do please elaborate. So let's assume we will expand str to accommodate the requirements of grapheme clusters. All existing code would still produce only traditional strings. The only way to introduce the new "super code points" is by invoking the str.canonical() method: text = "hyv?? 
y?t?".canonical() In this case text would still be a fully traditional string because both ? and ? are represented by a single code point in NFC. However: >>> q = unicodedata.normalize("NFC", "aq?u") >>> len(q) 4 >>> text = q.canonical() >>> len(text) 3 >>> t[0] "a" >>> t[1] "q?" >>> t[2] "u" >>> q2 = unicodedata.normalize("NFC", text) >>> len(q2) 4 >>> text.encode() b'aq\xcc\x88u' >>> q.encode() b'aq\xcc\x88u' We *could* also add a literal notation for canonical strings: >>> re.match(rc"[qq?]x", c"q?x") ... Of course, str.canonical() could be expressed as: >>> len(unicode.normalize("Python-Canonical", q)) 3 but I think str.canonical() would deserve a place in the, well, canon. Marko From rosuav at gmail.com Wed Jul 19 11:59:11 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 20 Jul 2017 01:59:11 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <87bmoge8h7.fsf@elektro.pacujo.net> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <87bmoge8h7.fsf@elektro.pacujo.net> Message-ID: On Thu, Jul 20, 2017 at 1:45 AM, Marko Rauhamaa wrote: > So let's assume we will expand str to accommodate the requirements of > grapheme clusters. > > All existing code would still produce only traditional strings. The only > way to introduce the new "super code points" is by invoking the > str.canonical() method: > > text = "hyv?? y?t?".canonical() > > In this case text would still be a fully traditional string because both > ? and ? are represented by a single code point in NFC. However: > > >>> q = unicodedata.normalize("NFC", "aq?u") > >>> len(q) > 4 > >>> text = q.canonical() > >>> len(text) > 3 > >>> t[0] > "a" > >>> t[1] > "q?" 
> >>> t[2] > "u" > >>> q2 = unicodedata.normalize("NFC", text) > >>> len(q2) > 4 > >>> text.encode() > b'aq\xcc\x88u' > >>> q.encode() > b'aq\xcc\x88u' Ahh, I see what you're looking at. This is fundamentally very similar to what was suggested a few hundred posts ago: a function in the unicodedata module which yields a string's combined characters as units. So you only see this when you actually want it, and the process of creating it is a form of iterating over the string. This could easily be done, as a class or function in unicodedata, without any language-level support. It might even already exist on PyPI. ChrisA From tjol at tjol.eu Wed Jul 19 13:41:10 2017 From: tjol at tjol.eu (Thomas Jollans) Date: Wed, 19 Jul 2017 19:41:10 +0200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <87h8y9e8mk.fsf@elektro.pacujo.net> Message-ID: <696feb95-e428-d750-3dd6-7ff5a9cebe98@tjol.eu> On 19/07/17 04:19, Rustom Mody wrote: > On Wednesday, July 19, 2017 at 3:00:21 AM UTC+5:30, Marko Rauhamaa wrote: >> Chris Angelico : >> >>> Let me give you one concrete example: the letter "?". In English, it >>> is (very occasionally) used to indicate diaeresis, where a pair of >>> letters is not a double letter - for example, "co?perate". (You can >>> also hyphenate, "co-operate".) In German, it is the letter "o" with a >>> pronunciation mark (umlaut), and is considered the same letter as "o". >>> In Swedish, it is a distinct letter, alphabetized last (following z, >>> ?, and ?, in that order). But in all these languages, it's represented >>> the exact same way. >> The German Wikipedia entry on "?" calls "?" a letter ("Buchstabe"): >> >> Der Buchstabe ? (kleingeschrieben ?) ist ein Buchstabe des >> lateinischen Schriftsystems. >> >> Furthermore, it makes a distinction between "?" the letter and "?" 
the >> "a with a diaeresis:" >> >> In guten Druckschriften unterscheiden sich die Umlautpunkte von den >> zwei Punkten des Tremas: Die Umlautpunkte sind kleiner, stehen n?her >> zusammen und liegen etwas tiefer. >> >> In good fonts umlaut dots are different from the two dots of a >> diaeresis: the umlaut dots are smaller and closer to each other and >> lie a little lower. [translation mine] >> > Very interesting! > And may I take it that the two different variants ? u-umlaut and u-diaresis ? of ? are not (yet) given a seat in unicode? Yes, the tr?ma/di?resis and the umlaut are two historically distinct beasts that share appearances and codepoints. (And the question of whether ???? are letters in German is rather more subtle than whether ??? are letters in Swedish) For added confusion there are languages like Dutch which use both the umlaut (in German loanwords like ??berhaupt?) and the tr?ma (in words like vacu?m). Other languages, like Turkish, use the umlaut symbol for separate vowels that are not umlauts (i.e. shifted vowels, like mouse - mice / Maus - M?use) So let's just pretend that characters in general have no meaning? > Now compare with: > - hyphen-minus 0x2D > ? minus sign 0x2212 > ? hyphen 0x2010 > ? en dash 0x2013 > ? em dash 0x2014 > ? horizontal bar 0x2015 > ? And perhaps another half-dozen ? but then again there's the whole business of Han unification. -- Thomas From tjreedy at udel.edu Wed Jul 19 14:01:41 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jul 2017 14:01:41 -0400 Subject: Best way to assert unit test cases with many conditions In-Reply-To: References: <2837f5d4cbdf43e7a218fd53f620ab1a@F5.com> Message-ID: On 7/19/2017 8:24 AM, Peter Otten wrote: > Ganesh Pal wrote: >> (1) I would want my subtest to have a *Condition* based on which it that >> would pass my entire test if any of the sub-test passed. If I understand correctly, you want assertTrue(subtest1 or subtest2 or subtest3 or subtest4 ...) 
or
  assertTrue(any(iterable_of_subtests))

Each 'subtestn' can be an assertion or expression or function call.

Peter's code below implements the general idea above in the any form with function calls in your particular situation where you also want to log subtest failures without failing the overall test. (The 'any' builtin or's together an indefinite number of items. The 'all' builtin and's multiple items.)

> Your spec translates to something like:
>
> $ cat stop_on_first_success.py
> import logging
>
> import unittest
> import sys
>
> log = logging.getLogger()
>
> class T(unittest.TestCase):
>     def test_foo(self):
>         subtests = sorted(
>             name for name in dir(self) if name.startswith("subtest_foo_")
>         )
>         for name in subtests:
>             method = getattr(self, name)
>             try:
>                 method()
>             except Exception as err:
>                 log.error(err)
>             else:
>                 break
>         else:
>             self.fail("no successful subtest")
>
>     def subtest_foo_01_int(self):
>         self.assertTrue(isinstance(x, int))
>     def subtest_foo_02_42(self):
>         self.assertEqual(x, 42)
>     def subtest_foo_03_upper(self):
>         self.assertEqual(x.upper(), x)
>
> if __name__ == "__main__":
>     logging.basicConfig()
>
>     x = sys.argv.pop(1)
>     x = eval(x)
>     print("Running tests with x = {!r}".format(x))
>
>     unittest.main()
>
> The x = eval() part is only for demonstration purposes.
>
> Below's the script output for various incantations. The subtests are
> executed in alphabetical order of the subtest_foo_xxx method names, failures
> are logged, and the loop stops after the first success.
>
> $ python3 stop_on_first_success.py '"foo"'
> Running tests with x = 'foo'
> ERROR:root:False is not true
> ERROR:root:'foo' != 42
> ERROR:root:'FOO' != 'foo'
> - FOO
> + foo
>
> F
> ======================================================================
> FAIL: test_foo (__main__.T)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "stop_on_first_success.py", line 22, in test_foo
>     self.fail("no successful subtest")
> AssertionError: no successful subtest
>
> ----------------------------------------------------------------------
> Ran 1 test in 0.001s
>
> FAILED (failures=1)
> $ python3 stop_on_first_success.py '"FOO"'
> Running tests with x = 'FOO'
> ERROR:root:False is not true
> ERROR:root:'FOO' != 42
> .
> ----------------------------------------------------------------------
> Ran 1 test in 0.001s
>
> OK
> $ python3 stop_on_first_success.py '42'
> Running tests with x = 42
> .
> ----------------------------------------------------------------------
> Ran 1 test in 0.000s
>
> OK
> $ python3 stop_on_first_success.py '42.'
> Running tests with x = 42.0
> ERROR:root:False is not true
> .
> ----------------------------------------------------------------------
> Ran 1 test in 0.001s
>
> OK
>
> However, for my taste such a test is both too complex and too vague. If you
> have code that tries to achieve something in different ways then put these
> attempts into functions that you can test individually with specific data
> that causes them to succeed or fail.
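
For comparison, the plain assertTrue(any(...)) form mentioned earlier in this message might look roughly like the sketch below. It is only an illustration under some assumptions: the check functions, the class name, the run_suite helper and the value x = 42 are all invented for the example, and each subtest is written as an ordinary function returning True or False (an assertXxx call raises rather than returning a value, so it would need wrapping before it could be fed to any()).

```python
import unittest

# Hypothetical value under test; not taken from the thread.
x = 42

def is_int(value):
    return isinstance(value, int)

def is_42(value):
    return value == 42

def is_upper(value):
    return isinstance(value, str) and value.upper() == value

class AnyOfTest(unittest.TestCase):
    def test_any_condition_holds(self):
        checks = [is_int, is_42, is_upper]
        # any() short-circuits: it stops at the first check returning True.
        self.assertTrue(any(check(x) for check in checks))

def run_suite():
    """Run AnyOfTest and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AnyOfTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Unlike the logging version quoted above, this form does not report which checks failed before one succeeded; that is the trade-off for the shorter code.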
> > -- Terry Jan Reedy From python at mrabarnett.plus.com Wed Jul 19 14:13:18 2017 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 19 Jul 2017 19:13:18 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <877ez4bzi1.fsf@elektro.pacujo.net> References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> <20170718162324.100e2c8f@wegge.dk> <1500388728.4046340.1044707048.3A4E8251@webmail.messagingengine.com> <8760epvijl.fsf@elektro.pacujo.net> <877ez4bzi1.fsf@elektro.pacujo.net> Message-ID: <0995acec-f597-ac62-9ca8-e19c68f032b1@mrabarnett.plus.com> On 2017-07-19 09:29, Marko Rauhamaa wrote: > Gregory Ewing : > >> Marko Rauhamaa wrote: >>> * a final "v" receives a superfluous "e" ("love") >> >> It's not superfluous there, it's preventing "love" from looking like >> it should rhyme with "of". > > I'm pretty sure that wasn't the original motivation. If I had to guess, > the reason was the possible visual confusion with "w". > > An interesting tidbit is that the English spelling demonstrates how the > [o] sound regularly shifted into an [?] sound in front of nasals and > "v": > > dove > love > hover In UK English, "hover" has the short-O sound. > cover > shove > above > > sponge > come > among > front > done > son > monk > monkey > > Again, exceptions abound: > > on > wrong > song > gone > long > Also: cove stove and then there's: move which is different again. From rgaddi at highlandtechnology.invalid Wed Jul 19 14:14:51 2017 From: rgaddi at highlandtechnology.invalid (Rob Gaddi) Date: Wed, 19 Jul 2017 11:14:51 -0700 Subject: Best way to assert unit test cases with many conditions In-Reply-To: References: Message-ID: On 07/19/2017 03:26 AM, Ganesh Pal wrote: >> >> Yes. Just assert each thing as it needs asserting. >> >> > Asserting each sub test will fail the entire test, I want the to pass > the test if any the sub test passes. If the sub test fail try all cases > and fail for the last one. 
> > Example : > > > > def test_this(self): > > if Sub_test_1(): > > #passes then PASS the Complete test i.e. test_this() and If > sub_test_1() fail then run further subtest!) > > elif run sub_test_2() : > > #Then PASS test_this() and don't run next test i.e > sub_test_3(),sub_test_4() etc) > > elif run sub_test_3() > > if sub_test_3() > > # Then pass test_this() and don't run next test i.e. sub_test_4() > ,sub_test_5(). etc) > > > > Regards, > > Ganesh > So you're saying if test 1 passes you don't even bother to run test 2? To be blunt, that sounds convoluted and overcomplicated. How would you ever know that test2 is even doing its job? Why is this superior to writing five tests, all of which always run? Note that "runtime" is not a valid answer unless you're talking about multiple minutes of it. -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix. From rgaddi at highlandtechnology.invalid Wed Jul 19 14:25:25 2017 From: rgaddi at highlandtechnology.invalid (Rob Gaddi) Date: Wed, 19 Jul 2017 11:25:25 -0700 Subject: pyserial and end-of-line specification In-Reply-To: <4410bbc7-a57a-4b75-9f62-eb15df7e92b5@googlegroups.com> References: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com> <4410bbc7-a57a-4b75-9f62-eb15df7e92b5@googlegroups.com> Message-ID: On 07/18/2017 12:53 PM, FS wrote: > Thank you for your response Andre. I had tried some code like that in the document but it did not seem to work. However ever leaving my terminal for a time the code eventually wrote out the records so apparently there is some very deep buffering going on here. A little more searching on the web revealed the following: > > https://stackoverflow.com/questions/10222788/line-buffered-serial-input > > It is apparent that pySerial, or at least the documentation is falling short of my needs. It is very unclear what module in the layer is handling the buffering and newlines and so forth. 
Also unclear is whether the coupled python and OS is reading FIFO or LIFO--something important in quasi realtime scientific applications. > This is problematic since the serial port is still so ubiquitous to a lot of scientific instrumentation. I probably will patch up some byte oriented code for this or perhaps write the module in C. > > Thanks again > Fritz > Handling it .read(1) at a time is probably your best bet. Append them into a bytearray and pull it all out when you're done. It's a serial port; it's not like there is any degree of inefficiency you could write into the code that would make it slow with respect to the I/O. I write a LOT of serial instrument I/O and I've definitely had to fall back to this plan. Your code gets a little long, you hide it all in a function somewhere and never think on it again. One paradigm I stick to for ASCII serial is to have 3 functions:

def command(msg: str):
    """Sends a command, raises CommError if it doesn't get some expected OK sort of thing."""

def query(msg: str):
    """Sends a command, returns a (trimmed) response line."""

def _communicate(msg: str):
    ...

The ugliest stuff, all the str->bytes->str stuff, the line-ending and protocols, goes into _communicate. Query usually just calls _communicate. Command slaps on whatever checks are needed. It feels a bit heavy, but it leads to highly-usable code and makes it easy to integrate logging, retries, "*OPC?" handshakes, whatever sort of things turn out to be necessary on a given device. -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix. From mikhailwas at gmail.com Wed Jul 19 14:34:53 2017 From: mikhailwas at gmail.com (Mikhail V) Date: Wed, 19 Jul 2017 20:34:53 +0200 Subject: Grapheme clusters, a.k.a.real characters Message-ID: Steven D'Aprano wrote: >On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote: >> Ok, in this narrow context I can also agree.
>> But in slightly wider context that phrase may sound almost like: >> "neither geometrical shape is better than the other as a basis >> for a wheel. If you have polygonal wheels, they are still called wheels." > I'm not talking about wheels, I'm talking about writing systems which are > fundamentally collections of arbitrary shapes. There's nothing about the sound > of "f" that looks like the letter "f". > But since you mentioned non-circular wheels, such things do exist, and are still > called "wheels" (or "gears", which is a kind of specialised wheel). > https://eric.ed.gov/?id=EJ937593 > https://en.wikipedia.org/wiki/Non-circular_gear > https://en.wikipedia.org/wiki/Square_wheel > https://www.youtube.com/watch?v=vk7s4PfvCZg Triangular wheels, sure, why not? A default "wheel" in a conversation, unless other meaning stated, is a common wheel of a bike or a car. At least I believe so, but since I'm non-native speaker I may be wrong. As well as the default merit of goodness of a writing system is how easy one can read texts in it (_a healthy person, done with the learning process_). Fundamentally, yes, a system in theory can be a set of _any_ shapes. This means that its goodness, in respect to the shapes alone, can vary from absolute zero (as e.g. in a hand-written recipe from a doctor :) and up to the optimum domain. Even if we take more obvious criteria - the ease of input - I suppose it is obvious that inputting German text by rules which need initial Caps in _all_ nouns, is harder than inputting the same text without Caps. Same for inputting diacritics. It is also pretty obvious that these Caps makes it harder to read in general. (more obvious that excessive diacritics, like in French) Thus, even in a narrow context, "no system is better or worse" sounds very suspect to me. 
Mikhail From tjreedy at udel.edu Wed Jul 19 14:36:17 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Jul 2017 14:36:17 -0400 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596f1848$0$2878$c3e8da3$76491128@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <1500387099.4041269.1044669328.2C241A9E@webmail.messagingengine.com> <596f1848$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: On 7/19/2017 4:28 AM, Steven D'Aprano wrote: > On Tue, 18 Jul 2017 10:11:39 -0400, Random832 wrote: > >> On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote: >>> Consider, for example, a Python source code >>> editor where you want to limit the length of the line based on the >>> number of characters more typically than based on the number of pixels. >> >> Even there you need to go based on the width in character cells. Most >> characters for East Asian languages occupy two character cells. >> >> It would be nice if there was an easy way to get str.format to use this >> width instead of the length in code points for the purpose of padding. > > You could always put in a feature request :-) I believe that there is a request that at least one of the string functions be character width aware, using the unicodedatabase. 
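For illustration only, here is a minimal sketch of width-aware padding built on the standard unicodedata module. The helper names display_width and pad_to_width are made up for this example and are not an existing str or str.format feature. It assumes a monospaced display where East Asian Wide and Fullwidth characters take two cells and combining marks take none; joiners such as U+200D and emoji sequences are not handled.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal cells needed to show text."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and Fullwidth characters occupy two cells.
            width += 2
        else:
            width += 1
    return width


def pad_to_width(text, cells):
    """Right-pad text with spaces until it fills the given number of cells."""
    return text + " " * max(0, cells - display_width(text))
```

Padding by display_width rather than len() keeps columns aligned when a row mixes ASCII with East Asian text, which is the behaviour asked for above.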
-- Terry Jan Reedy From rantingrickjohnson at gmail.com Wed Jul 19 17:51:28 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 14:51:28 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596ecc4d$0$1595$c3e8da3$5496439d@news.astraweb.com> References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <596ecc4d$0$1595$c3e8da3$5496439d@news.astraweb.com> Message-ID: <52c2a26e-e398-4a5a-83d8-828e556eba7f@googlegroups.com> On Tuesday, July 18, 2017 at 10:07:41 PM UTC-5, Steve D'Aprano wrote: > On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote: [...] > > Einstein: If you can't explain something to a six-year- > > old, you really don't understand it yourself. > > > > [...] > > Think about it: it simply is nonsense. If this six year old > test was valid, that would imply that all fields of > knowledge are capable of being taught to the average six > year old. Yeah good luck with that. Again, as was the case with your Toupee Fallacy a few days ago, you've got it all wrong. The implication of that quote was _not_ that six year olds are the "final arbiters of truth". LOL. The implication is that explaining anything to a six year old is not an easy task. Therefore, a teacher who lacks a deep understanding of the subject matter could never hope to properly educate a six year old student. And if you don't believe me, consider the lesson of this famous quip: "The blind leading the blind". ;-) > But even if we accept this, it doesn't contradict the > Mencken quote. I can explain the birds and the bees to a > six year, at a level that they will understand. 
That > doesn't mean that (1) I am an expert on human reproduction; Since when is a biology degree prerequisite to informing a six year old that babies are the product of "mommies and daddies"? (at least "historically speaking") Of course, in the not-so-distant-future, babies will be the product of "science and farming". Hmm. Which, incidentally, is a wonderful segue into the subject of evolution! From greg.ewing at canterbury.ac.nz Wed Jul 19 18:04:29 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Thu, 20 Jul 2017 10:04:29 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596e08e7$0$1611$c3e8da3$5496439d@news.astraweb.com> Message-ID: Grant Edwards wrote: > Maybe it was a mistaken spelling of 'fortuned'? Most likely. Interestingly, several sites claimed to be able to tell me things about it. One of them tried to find poetry related to it (didn't find any, though). Another one offered to show me how to pronounce it, and it kind of did, although it sounded suspiciously like it was generated by a text-to-speech algorithm... -- Greg From greg.ewing at canterbury.ac.nz Wed Jul 19 18:12:09 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Thu, 20 Jul 2017 10:12:09 +1200 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> Message-ID: Chris Angelico wrote: > * Strings with all codepoints < 256 are represented as they currently > are (one byte per char). There are no combining characters in the > first 256 codepoints anyway. > * Strings with all codepoints < 65536 and no combining characters, > ditto (two bytes per char). 
> * Strings with any combining characters in them are stored in four > bytes per char even if all codepoints are <65536. > * Any time a character consists of a single base with no combining, it > is stored in UTF-32. > * Combined characters are stored in the primary array as 0x80000000 > plus the index into a secondary array where these values are stored. > * The secondary array has a pointer for each combined character > (ignoring single-code-point characters), probably to a Python integer > object for simplicity. +1. We should totally do this just to troll the RUE! -- Greg From rantingrickjohnson at gmail.com Wed Jul 19 18:22:46 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 15:22:46 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596ed081$0$1610$c3e8da3$5496439d@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> <85zic15lvy.fsf@benfinney.id.au> <596ed081$0$1610$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Tuesday, July 18, 2017 at 10:24:54 PM UTC-5, Steve D'Aprano wrote: > On Wed, 19 Jul 2017 10:08 am, Ben Finney wrote: > > > Gregory Ewing writes: > > > > > The term "emoji" is becoming rather strained these days. > > > The idea of "woman" and "personal computer" being > > > emotions is an interesting one... > > > > I think of ?emoji? as ?not actually a character in any > > system anyone would use for writing anything, but somehow > > gets to squat in the Unicode space?. > > Blame the Japanese mobile phone manufacturers. They want to > include emoji in their SMSes and phone chat software, [...] > I suppose that having a standard for emoji is good. 
I'm not > convinced that Unicode should be that standard, but on the > other hand if we agree that Unicode should support > hieroglyphics and pictographs, well, that's exactly what > emoji are. Indeed. And here are some insightful lyrics by the great R. Waters (modified slightly) that you might consider: If you should go skating, on the thin ice of "modern-string life". Dragging behind you the giant repos, of the "million... code-point-strife". Don't be surprised when a crack in the ice, appears under your feet. You step of out your ASCII and out of your mind, with your pragmatism flowing out behind you, as you *CLAW* the thin ice! Yeah. It's a cautionary tale. From rantingrickjohnson at gmail.com Wed Jul 19 18:56:56 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 15:56:56 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: Message-ID: <87a90ed6-6b2b-4ec7-be42-8464efb1b0ab@googlegroups.com> On Tuesday, July 18, 2017 at 7:35:13 PM UTC-5, Mikhail V wrote: > ChrisA wrote: > >On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V wrote: > >> On 2017-07-18, Steve D'Aprano wrote: > > > > _Neither system is right or wrong, or better than the > > > > other._ > > > > > > If that is said just "not to hurt anybody" then its ok. > > > Though this statement is pretty absurd, not so many > > > (intelligent) people will buy this out today. > > > > Let me give you one concrete example: [...] > > Ok, in this narrow context I can also agree. But in > slightly wider context that phrase may sound almost like: > "neither geometrical shape is better than the other as a > basis for a wheel. If you have polygonal wheels, they are > still called wheels." All equilateral and equiangular polygons are approximations of the wheel (or the circle, to be more general). Of course, any "polygonal wheel" with a number of sides less than 6 would be very difficult to roll. 5 may be possible (to some degree). 
However, 4 and 3 would be more useful as snowplows than as "wheels". So the distinction between a wheel that is either an "N-sided polygon" or a "true circle" becomes more a matter of "levels of practicality" (both in usage _and_ manufacturing) than anything else. Of course -- and it goes without saying, but this being python-list i feel compelled to say it *wink* -- the perfect circle is the best wheel. From rantingrickjohnson at gmail.com Wed Jul 19 19:14:38 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 16:14:38 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596ed2cc$0$1589$c3e8da3$5496439d@news.astraweb.com> References: <596ed2cc$0$1589$c3e8da3$5496439d@news.astraweb.com> Message-ID: <7438d9ab-8bc8-4f50-9dad-94955065dbc7@googlegroups.com> On Tuesday, July 18, 2017 at 10:37:18 PM UTC-5, Steve D'Aprano wrote: > On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote: > > > Ok, in this narrow context I can also agree. > > But in slightly wider context that phrase may sound almost like: > > "neither geometrical shape is better than the other as a basis > > for a wheel. If you have polygonal wheels, they are still called wheels." > > I'm not talking about wheels, I'm talking about writing systems which are > fundamentally collections of arbitrary shapes. There's nothing about the sound > of "f" that looks like the letter "f". He was not talking about wheels either. He was making a rhetorical point as to the relationship between wheels (aka: perfect circles) and "approximations of wheels" (aka: equilateral and equiangular N-sided polygons). Here's a free tip: next time you're feeling confused by metaphors, but _before_ you reply, first do a "toupee check". If it's missing, then consider that the atmospheric disturbance created from a fast moving concept that buzzed your noggin may have flung it off. 
From larry.martell at gmail.com Wed Jul 19 19:26:50 2017 From: larry.martell at gmail.com (Larry Martell) Date: Wed, 19 Jul 2017 19:26:50 -0400 Subject: rpy2 In-Reply-To: References: Message-ID: On Wed, Jul 19, 2017 at 6:28 PM, Dennis Lee Bieber wrote: > On Wed, 19 Jul 2017 11:15:03 -0400, Larry Martell > declaimed the following: > >> >>/usr/bin/ld: cannot find -lr_utils >> >>But I could not find how to get libr_utils. >> > > Have you already built/installed R (maybe development packages too) -- > I suspect the library is part of the R utils package. > > https://stat.ethz.ch/R-manual/R-devel/library/utils/html/utils-package.html On RHEL 6 the version of R is 3.0, and rpy2 does not work with that version. I installed R 3.3.3, but I suspect I do not have the 3.3.3 devel and/or utils package. I don't know where to get them from. > OTOH -- perhaps it is looking for this third-party package > > https://github.com/HenrikBengtsson/R.utils I first thought that too, but that is entirely something else. From rantingrickjohnson at gmail.com Wed Jul 19 19:36:42 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 16:36:42 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <85bmonahhv.fsf@benfinney.id.au> <87a847jzsp.fsf@elektro.pacujo.net> <87zic7v3gl.fsf@elektro.pacujo.net> <87vamvv1q1.fsf@elektro.pacujo.net> <87pod3uvvv.fsf@elektro.pacujo.net> <87k23bustf.fsf@elektro.pacujo.net> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: On Wednesday, July 19, 2017 at 1:57:47 AM UTC-5, Steven D'Aprano wrote: > On Wed, 19 Jul 2017 17:51:49 +1200, Gregory Ewing wrote: > > > Chris Angelico wrote: > >> Once you NFC or NFD normalize both strings, identical strings will > >> generally have identical codepoints... 
You should then be able to use > >> normal regular expressions to match correctly. > > > > Except that if you want to match a set of characters, > > you can't reliably use [...], you would have to write them out as > > alternatives in case some of them take up more than one code point. > > Good point! > > A quibble -- there's no "in case" here, since you, the > programmer, will always know whether they have a single > code point form or not. If you're unsure, look it up, or > call unicodedata.normalize(). > > (Yeah, right, like the average coder will remember to do this...) > > Nevertheless, although it might be annoying and tricky, > regexes *are* flexible enough to deal with this problem. > After all, you can't use [th] to match "th" as a unit > either, and regex set character set notation [abcd] is > logically equivalent to (a|b|c|d). If the intention is to match the two-character string "th", then the obvious solution would be to wrap the substring in a capturing or non-capturing group:

    pattern = r'(?:th)'

Though I suppose one could abuse the character-set syntax by doing something like:

    pattern = r'[t][h]'

However, even the first example (using a group) is superfluous if "th" is the only substring to be matched. Employing the power of grouping is only necessary in more complex patterns.
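To make the character-class point concrete, here is a small sketch, assuming NFD-normalized input. The helper name grapheme_alternation is invented for this example. It shows that a character class treats the two code points of a decomposed "e with acute" as independent members, while an alternation of whole sequences matches them as one unit.

```python
import re
import unicodedata


def grapheme_alternation(graphemes):
    """Build a non-capturing alternation matching any one of the given strings."""
    # Longest first, so a bare base letter cannot shadow its accented form.
    ordered = sorted(graphemes, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(g) for g in ordered) + ")"


# After NFD normalization the last letter is 'e' followed by U+0301.
text = unicodedata.normalize("NFD", "caf\u00e9")
accented_e = "e\u0301"

# The class [e + U+0301] matches either code point on its own.
class_match = re.search("[" + accented_e + "]", text)

# The alternation matches the two-code-point sequence as a unit.
alt_match = re.search(grapheme_alternation([accented_e, "o"]), text)
```

Here class_match.group() is the bare "e" with its accent left behind, whereas alt_match.group() is the full two-code-point sequence.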
From rantingrickjohnson at gmail.com Wed Jul 19 20:02:11 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 17:02:11 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <75019890-592b-1609-a11b-3ce4b6fab52a@kynesim.co.uk> <596f15b0$0$2878$c3e8da3$76491128@news.astraweb.com> <2ec7784b-7444-caf0-4006-135bb9427513@kynesim.co.uk> Message-ID: <4d0f12b7-4e88-4bbf-a835-1b0348c28925@googlegroups.com> On Wednesday, July 19, 2017 at 5:29:23 AM UTC-5, Rhodri James wrote: > when Acorn were developing their version of extended ASCII > in the late 80s, they asked three different University > lecturers in Welsh what extra characters they needed, and > got three different answers. And who would have guessed that the wishes of three random Welshian University lectures would become a microcosm into the future problems afflicting internationalization and localization of computer software. And perhaps one day there will be fantastical fables written about the "Three Wise Welshian Lecturers", who traveled across endless expanses of earth and sea, following at times strange lights in the sky, and bearing gifts of "me, myself, and i" upon the new king of encodings. 
From ben+python at benfinney.id.au Wed Jul 19 20:17:36 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 20 Jul 2017 10:17:36 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> <1500478037.1904126.1046003616.011E5CB9@webmail.messagingengine.com> Message-ID: <85lgnk55cv.fsf@benfinney.id.au> Random832 writes: > On Tue, Jul 18, 2017, at 19:21, Gregory Ewing wrote: > > Random832 wrote: > > > What about Emoji? > > > U+1F469 WOMAN is two columns wide on its own. > > > U+1F4BB PERSONAL COMPUTER is two columns wide on its own. > > Emoji comes from Japanese 絵文字 - 絵 (e) picture, 文字 (moji) > character. Yes. Those (U+1F469 and U+1F4BB) are clearly pictures. The cell phone industry in East Asia insist that they are characters, loudly enough to get them into Unicode. I disagree strongly. -- \ "I don't accept the currently fashionable assertion that any | `\ view is automatically as worthy of respect as any equal and | _o__) opposite view." -- Douglas Adams | Ben Finney From steve+python at pearwood.info Wed Jul 19 21:30:43 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Thu, 20 Jul 2017 11:30:43 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <81174772-1df7-42ba-acdf-3da722d3f404@googlegroups.com> <596afb8b$0$11093$c3e8da3@news.astraweb.com> <1500388170.4044447.1044685648.6C7A4B8D@webmail.messagingengine.com> <596ec8b1$0$1596$c3e8da3$5496439d@news.astraweb.com> <1500478218.1904557.1046006840.7488E265@webmail.messagingengine.com> Message-ID: <597007c4$0$1603$c3e8da3$5496439d@news.astraweb.com> On Thu, 20 Jul 2017 01:30 am, Random832 wrote: > On Tue, Jul 18, 2017, at 22:49, Steve D'Aprano wrote: >> > What about Emoji?
>> > U+1F469 WOMAN is two columns wide on its own. >> > U+1F4BB PERSONAL COMPUTER is two columns wide on its own. >> > U+200D ZERO WIDTH JOINER is zero columns wide on its own. >> >> >> What about them? In a monospaced font, they should follow the same rules >> I used >> above: either 0, 1 or 2 column wide. > > You snipped the important part - the fact that the whole sequence of > three code points U+1F469 U+200D U+1F4BB is a single grapheme cluster > two columns wide. There's no requirement for rendering engines to display the emoji sequence in any specific way. Maybe we would like the combined emoji to display in two columns, but that's not guaranteed, nor is it required by the standard. http://unicode.org/emoji/charts/emoji-zwj-sequences.html If the renderer cannot display a "Woman Personal Computer" as a single emoji, it is permissible to fall back to two glyphs. > You also ignored all of the other examples in my post. Did you even read > anything beyond what you snipped? Yes I did, but I didn't understand it. Maybe that was because I didn't read your post carefully enough, or maybe it was because you didn't explain what point you were making carefully enough. Or a little of both. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From steve+python at pearwood.info Wed Jul 19 21:34:23 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Thu, 20 Jul 2017 11:34:23 +1000 Subject: Grapheme clusters, a.k.a.real characters References: Message-ID: <597008a0$0$1603$c3e8da3$5496439d@news.astraweb.com> On Thu, 20 Jul 2017 04:34 am, Mikhail V wrote: > It is also pretty obvious that these Caps makes it harder to read in general. > (more obvious that excessive diacritics, like in French) No it isn't. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
From rantingrickjohnson at gmail.com Wed Jul 19 21:47:05 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 18:47:05 -0700 (PDT) Subject: Combining every pair of list items and creating a new list. In-Reply-To: References: <621ca9d5-79b1-44c9-b534-3ad1b0cf44a4@googlegroups.com> Message-ID: On Tuesday, July 18, 2017 at 5:19:56 AM UTC-5, Rahul K P wrote: > You can use a simple logic and list comprehension. > > so it will be like this > > lst = [1, 2, 3, 4, 5, 6, 7, 8] > print [lst[i:i+2] for i in range(0,len(lst),2)] No no no. Anybody can write code like that! To wow a professor and earn a high grade, the OP must prove a competence in mental gymnastics. Try this one liner: # BOILER PLATE >>> import sys >>> from operator import add, sub, mul >>> lst = [1, 2, 3, 4, 5, 6, 7, 8] # MEAT AND TATERS >>> sys.stdout.write(str(repr([lst[i:add(map(int, tuple([i]))[0], sub(2, 0))] for i in range(range(10)[0], mul(len(lst[:]), 1), sub(2, 0))]))) [[1, 2], [3, 4], [5, 6], [7, 8]] From rantingrickjohnson at gmail.com Wed Jul 19 22:09:26 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 19 Jul 2017 19:09:26 -0700 (PDT) Subject: Users of namedtuple: do you use the _source attribute? In-Reply-To: References: <596cec90$0$1605$c3e8da3$5496439d@news.astraweb.com> <3c56202e-2a86-4a03-ba77-8cbb755c8982@googlegroups.com> Message-ID: On Tuesday, July 18, 2017 at 12:59:36 AM UTC-5, Terry Reedy wrote: > Yes, No. The developers of the class agree that a trailing > underscore convention would have been better. 'source_' > etc. Which, while encroaching on the "this-is-a-reserved-symbol_" convention, would relieve the current "_stay-away-from- volatile-me" fear mongering. 
From steve+python at pearwood.info Wed Jul 19 22:12:51 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Thu, 20 Jul 2017 12:12:51 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> Message-ID: <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> On Thu, 20 Jul 2017 08:12 am, Gregory Ewing wrote: > Chris Angelico wrote: [snip overly complex and complicated string implementation] > +1. We should totally do this just to troll the RUE! You're an evil, wicked man, and I love it. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From rosuav at gmail.com Wed Jul 19 22:40:08 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 20 Jul 2017 12:40:08 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <1500386999.4040650.1044671432.09E38853@webmail.messagingengine.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Thu, Jul 20, 2017 at 12:12 PM, Steve D'Aprano wrote: > On Thu, 20 Jul 2017 08:12 am, Gregory Ewing wrote: > >> Chris Angelico wrote: > [snip overly complex and complicated string implementation] > An accurate description, but in my own defense, I had misunderstood Marko's idea. Actually, the implementation I detailed was far SIMPLER than I thought it would be; I started writing that post trying to prove that it was impossible, but it turns out it isn't actually impossible. 
Just highly impractical. ChrisA From steve at pearwood.info Thu Jul 20 01:15:20 2017 From: steve at pearwood.info (Steven D'Aprano) Date: 20 Jul 2017 05:15:20 GMT Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> Message-ID: <59703c67$0$2878$c3e8da3$76491128@news.astraweb.com> On Thu, 20 Jul 2017 12:40:08 +1000, Chris Angelico wrote: > On Thu, Jul 20, 2017 at 12:12 PM, Steve D'Aprano > wrote: >> On Thu, 20 Jul 2017 08:12 am, Gregory Ewing wrote: >> >>> Chris Angelico wrote: >> [snip overly complex and complicated string implementation] >> >> > An accurate description, but in my own defense, I had misunderstood > Marko's idea. Actually, the implementation I detailed was far SIMPLER > than I thought it would be; I started writing that post trying to prove > that it was impossible, but it turns out it isn't actually impossible. > Just highly impractical. I haven't really been paying attention to Marko's suggestion in detail, but if we're talking about a whole new data type, how about a list of nodes, where each node's data is a decomposed string object guaranteed to be either: * an array of single-character code points; * or a single combining character sequence, emoji variation sequence, or other grapheme * with the appropriate length in "characters". So a string like "Hello World!" would be a single node: (12, "Hello World!") Strings will always be in decomposed form, so a "caf? 
au lait" would automatically use: U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT regardless of your editor, and represented by three nodes:

    (3, "caf")
    (1, "e\N{COMBINING ACUTE ACCENT}")
    (8, " au lait")

Iterating the string would mean:

    for node in nodes:
        if node.count == 1:
            yield node.string
        else:
            yield from node.string

Reversing would be:

    for node in nodes[::-1]:
        if node.count == 1:
            yield node.string
        else:
            yield from node.string[::-1]

Getting the length in graphemes would mean:

    sum(node.count for node in nodes)

Indexing and slicing I leave for an exercise. We lose O(1) indexing and slicing, but the length could be cached. Calculate the length on demand, then cache it for next time. (This assumes the string is immutable.) The cost of indexing and slicing would be proportional to the number of nodes, not the length of the string. So not as bad as a naive UTF-8 implementation. Each substring could use the Flexible String Representation, to minimize the total memory. E.g. in the "café au lait" example above, the first and last node would use one byte per code point, and the middle node would use two bytes per code point. Of course, this doesn't *completely* solve the question of end user expectations. For example, many people would want "ij" to count as a single character, or "\r\n". And it would complicate the implementation of the various string methods and the regex engine. It will almost certainly be much slower than the str type, and use more memory, and it would be lossy with regards to certain patterns of code points. For example, it wouldn't distinguish between composed and decomposed strings, since they're always normalised to decomposed form. But it might be worth doing, for applications that care about giving a better user experience when it comes to editing text.
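For anyone who wants to play with the idea, the node scheme described above can be sketched in runnable form roughly as follows. This is an illustration under simplifying assumptions, not a proposal for the real str type: nodes are plain (count, string) tuples, the function names are invented here, and a "grapheme" is approximated as one base code point plus any combining marks that follow it, which ignores emoji sequences and the full Unicode segmentation rules.

```python
import unicodedata


def to_nodes(text):
    """Split text into (count, string) nodes, always in decomposed (NFD) form."""
    decomposed = unicodedata.normalize("NFD", text)
    nodes = []
    plain = []  # current run of stand-alone code points
    i = 0
    while i < len(decomposed):
        j = i + 1
        # Attach any following combining marks to the base character.
        while j < len(decomposed) and unicodedata.combining(decomposed[j]):
            j += 1
        if j - i == 1:
            plain.append(decomposed[i])
        else:
            if plain:
                nodes.append((len(plain), "".join(plain)))
                plain = []
            nodes.append((1, decomposed[i:j]))
        i = j
    if plain:
        nodes.append((len(plain), "".join(plain)))
    return nodes


def iter_graphemes(nodes):
    """Yield one grapheme at a time, front to back."""
    for count, string in nodes:
        if count == 1:
            yield string
        else:
            yield from string


def iter_reversed(nodes):
    """Yield graphemes back to front without splitting combining sequences."""
    for count, string in reversed(nodes):
        if count == 1:
            yield string
        else:
            yield from reversed(string)


def grapheme_len(nodes):
    """Length in graphemes; a real implementation could cache this."""
    return sum(count for count, _ in nodes)
```

With this sketch, to_nodes("café au lait") gives the three nodes listed above, and reversing keeps the accent attached to its "e" instead of moving it onto a neighbouring letter.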
-- Steve From bronger at physik.rwth-aachen.de Thu Jul 20 01:33:03 2017 From: bronger at physik.rwth-aachen.de (Torsten Bronger) Date: Thu, 20 Jul 2017 07:33:03 +0200 Subject: scandir slower than listdir Message-ID: <87d18vu0z4.fsf@wilson.bronger.org> Hall?chen! With a 24,000 files directory on an SSD running Ubuntu, #!/usr/bin/python3 import os, time start = time.time() list(os.listdir("/home/bronger/.saves")) print("listdir:", time.time() - start) start = time.time() list(os.scandir("/home/bronger/.saves")) print("scandir:", time.time() - start) yields listdir: 0.045470237731933594 scandir: 0.08043360710144043 However, scandir is supposed to be faster than listdir. Why do I see this? Tsch?, Torsten. -- Torsten Bronger From dieter at handshake.de Thu Jul 20 01:44:26 2017 From: dieter at handshake.de (dieter) Date: Thu, 20 Jul 2017 07:44:26 +0200 Subject: SIGSEGV and SIGILL inside PyCFunction_Call References: <20170719173528.4ecc3f78@wegge.dk> Message-ID: <87shhrmzlx.fsf@handshake.de> Anders Wegge Keller writes: > ... > I have an ongoing issue with my usenet setup. I'm that one dude who don't > want to learn perl. That means that I have to build inn from source, so I > can enable the python interpreter. That's not so bad, and the errors that > show up have been something that I have been able to figure out by myself. > At least up until now. I have an almost 100% repeatable crash, when nnrpd > performs the user authentication step. Backtracing the core dum gives this: > > #0 0x0000564a864e2d63 in ?? () > #1 0x00007f9609567091 in call_function (oparg=, > pp_stack=0x7ffda2d801b0) at ../Python/ceval.c:4352 > > Note: Line 4352 C_TRACE(x, PyCFunction_Call(func,callargs,NULL)); > > #2 PyEval_EvalFrameEx ( > f=Frame 0x7f9604758050, for file /etc/news/filter/nnrpd_auth.py, > line 67, in __init__ (self= description=None, rownumber=None, messages=[], _executed=None, > > ... > > Weird observation #1: Sometimes the reason is SIGSEGV, sometimes it's > SIGILL. 
Python tends to be sensitive to the stack size. In the past, there have often been problems because the stack size for threads was not large enough. I am not sure whether "nnrpd" is multithreaded and provides a sufficiently large stack for its threads.

A "SIGILL" often occurs because a function call has destroyed part of the stack content and the return is erroneous (returning in the midst of an instruction).

> ...
> I'm not ready to give up yet, but I need some help proceeding from here.
> What do the C_TRACE really do,

The casing (all upper case letters) indicates a C preprocessor macro. Search the "*.h" files for its definition.

I suppose that with a normal Python build (no debug build), the macro will just call "PyCFunction_Call". Alternatively, it might provide support for debugging and tracing (activated by e.g. "pdb.set_trace()").

> and is there some way of getting a level
> deeper, to see what cause the SEGV. Also, how can the C code end up with an
> illegal instruction_

A likely cause for both "SIGSEGV" and "SIGILL" could be stack corruption leading to a bad return or badly restored register values.

I would look at the machine instructions (i.e. look at the assembler rather than the C level) to find out precisely which instruction caused the signal.

Unfortunately, stack corruption is a non-local problem (the point where the problem is caused is usually far away from the point where it is observed).

If the problem is not "too small stack size", you might need a tool to analyse memory overwrites.

From tjreedy at udel.edu Thu Jul 20 02:32:30 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 20 Jul 2017 02:32:30 -0400 Subject: scandir slower than listdir In-Reply-To: <87d18vu0z4.fsf@wilson.bronger.org> References: <87d18vu0z4.fsf@wilson.bronger.org> Message-ID:

On 7/20/2017 1:33 AM, Torsten Bronger wrote:
> Hallöchen!
>
> With a 24,000 files directory on an SSD running Ubuntu,
>
> #!/usr/bin/python3
>
> import os, time
>
>
> start = time.time()
> list(os.listdir("/home/bronger/.saves"))

listdir returns a list of names.

> print("listdir:", time.time() - start)
>
> start = time.time()
> list(os.scandir("/home/bronger/.saves"))

scandir returns an iterator of DirEntry objects which contain more information than the mere name.

> print("scandir:", time.time() - start)
>
> yields
>
> listdir: 0.045470237731933594
> scandir: 0.08043360710144043

So you are comparing apples and apple tarts.

--
Terry Jan Reedy

From steve+python at pearwood.info Thu Jul 20 07:43:02 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Thu, 20 Jul 2017 21:43:02 +1000 Subject: scandir slower than listdir References: <87d18vu0z4.fsf@wilson.bronger.org> Message-ID: <59709748$0$1601$c3e8da3$5496439d@news.astraweb.com>

On Thu, 20 Jul 2017 03:33 pm, Torsten Bronger wrote:

> Hallöchen!
>
> With a 24,000 files directory on an SSD running Ubuntu,
>
> #!/usr/bin/python3
>
> import os, time
>
>
> start = time.time()
> list(os.listdir("/home/bronger/.saves"))
> print("listdir:", time.time() - start)
>
> start = time.time()
> list(os.scandir("/home/bronger/.saves"))
> print("scandir:", time.time() - start)
>
> yields
>
> listdir: 0.045470237731933594
> scandir: 0.08043360710144043
>
> However, scandir is supposed to be faster than listdir. Why do I
> see this?

The documentation says:

"Using scandir() instead of listdir() can significantly increase the performance of code that ALSO NEEDS FILE TYPE OR FILE ATTRIBUTE INFORMATION"

[emphasis added]

https://docs.python.org/3.5/library/os.html#os.scandir

If all you need is the names, listdir() is faster because it only returns the names.
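To make the documentation's point concrete, here is a rough sketch of the kind of task where the two functions are comparable: counting the regular files in a directory. The function names and the throwaway directory are invented for the illustration, and no timings are claimed, since those depend on the platform and filesystem.

```python
import os
import tempfile


def count_files_listdir(path):
    """listdir() gives names only, so each type check is a separate stat call."""
    return sum(
        os.path.isfile(os.path.join(path, name)) for name in os.listdir(path)
    )


def count_files_scandir(path):
    """DirEntry.is_file() can usually answer from data cached during the scan."""
    with os.scandir(path) as entries:  # context manager form needs Python 3.6+
        return sum(entry.is_file() for entry in entries)


if __name__ == "__main__":
    # Throwaway directory with three files and one subdirectory.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            open(os.path.join(tmp, "file%d.txt" % i), "w").close()
        os.mkdir(os.path.join(tmp, "subdir"))
        print(count_files_listdir(tmp), count_files_scandir(tmp))
```

Timing these two functions on a large directory, rather than timing bare list(os.listdir(...)) against list(os.scandir(...)), is closer to the use case scandir() was added for.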
scandir() returns a data structure which may include cached values for: - the name - full path - flag whether it is a directory - flag whether it is a file - flag whether it is a symlink - inode number - file stat record -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From skip.montanaro at gmail.com Thu Jul 20 07:43:44 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Thu, 20 Jul 2017 06:43:44 -0500 Subject: scandir slower than listdir In-Reply-To: References: <87d18vu0z4.fsf@wilson.bronger.org> Message-ID: scandir returns an iterator of DirEntry objects which contain more > information than the mere name. > As I recall, the motivation for scandir was to avoid subsequent system calls, so it will be slower than listdir the way you've tested it. If you add in the cost of fetching the other bits Terry mentioned, I suspect your relative timing will change. Skip From wegge at wegge.dk Thu Jul 20 08:03:23 2017 From: wegge at wegge.dk (Anders Wegge Keller) Date: Thu, 20 Jul 2017 14:03:23 +0200 Subject: SIGSEGV and SIGILL inside PyCFunction_Call In-Reply-To: <87shhrmzlx.fsf@handshake.de> References: <20170719173528.4ecc3f78@wegge.dk> <87shhrmzlx.fsf@handshake.de> Message-ID: <20170720140323.34d1f000@wegge.dk> P? Thu, 20 Jul 2017 07:44:26 +0200 dieter skrev: > Anders Wegge Keller writes: ... >> Weird observation #1: Sometimes the reason is SIGSEGV, sometimes it's >> SIGILL. > Python tends to be sensitive to the stack size. In previous times, > there have often be problems because the stack size for threads > has not been large enough. Not sure, whether "nnrpd" is multi threaded > and provides a sufficiently large stack for its threads. Luckily, the "threading model" of nnrpd is fork(). > A "SIGILL" often occurs because a function call has destroyed part > of the stack content and the return is erroneous (returning in the midst > of an instruction). I think you're right. 
That also explains why gdb have trouble with the last stack frame. >> I'm not ready to give up yet, but I need some help proceeding from here. >> What do the C_TRACE really do, > The casing (all upper case letters) indicates a C preprocessor macro. > Search the "*.h" files for its definition. I know where it is. I just don't feel like deciphering a 60 lines monstrosity before at least asking if someone has a intimate enough relationship with it, to give a TL;DR. > I suppose that with a normal Python build (no debug build), the > macro will just call "PyCFunction_Call". > Alternatively, it might provide support for debugging, tracing > (activated by e.g. "pdb.set_trace()"). Probably. I can see I have to dig into it. >> and is there some way of getting a level >> deeper, to see what cause the SEGV. Also, how can the C code end up with >> an illegal instruction_ ... > Unfortunately, stack corruption is a non local problem (the point > where the problem is caused is usually far away from the point > where it is observed). > > If the problem is not "too small stack size", you might need > a tool to analyse memory overrides. The trouble with that is that nnrpd is a system daemon, and as such is a bit difficult to trace in place. That's why I am asking for help reasoning the cause, before I have to resort to running a debugger as a privileged user. 
-- //Wegge From rustompmody at gmail.com Thu Jul 20 11:18:20 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Thu, 20 Jul 2017 08:18:20 -0700 (PDT) Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <52c2a26e-e398-4a5a-83d8-828e556eba7f@googlegroups.com> References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <596ecc4d$0$1595$c3e8da3$5496439d@news.astraweb.com> <52c2a26e-e398-4a5a-83d8-828e556eba7f@googlegroups.com> Message-ID: On Thursday, July 20, 2017 at 3:21:52 AM UTC+5:30, Rick Johnson wrote: > On Tuesday, July 18, 2017 at 10:07:41 PM UTC-5, Steve D'Aprano wrote: > > On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote: > > [...] > > > > Einstein: If you can't explain something to a six-year- > > > old, you really don't understand it yourself. > > > > > > > [...] > > > > Think about it: it simply is nonsense. If this six year old > > test was valid, that would imply that all fields of > > knowledge are capable of being taught to the average six > > year old. Yeah good luck with that. > > Again, as was the case with your Toupee Fallacy a few days > ago, you've got it all wrong. The implication of that quote > was _not_ that six year olds are the "final arbiters of > truth". LOL. The implication is that explaining anything to > a six year old is not an easy task. Therefore, a teacher who > lacks a deep understanding of the subject matter could never > hope to properly educate a six year old student. Nicely stated Rick. [And amused to find myself on the same side!] In the hope that I am not also on the ?Ranting Rick? 
side here's some thoughts towards reducing the polemics For the purposes of this discussion, broadly speaking the knowledge "of computers" needs to be classified into 3 categories: 1. Recursive knowledge 2. Specialised pre-existing 3. Standard/Common (sense)/Universal 1. contains most of what would go into a typical CS degree - splay/Red-black/AVL trees - threads (vs processes) - OS, DBMS, compilers/interpreters - Big O analysis - algorithms, - etc - etc Algorithms (and all the rest) is ?recursive? in the sense that algorithms make computers happen/usable/etc; just as computers give meaning to the study of algorithms. Sure one can take inspiration from guys like Mike Fellows: ?Computer science is as much about computers as astronomy is about telescopes or biology about microscopes? And who does a lot of work towards teaching children algorithms without reference to computer technology [http://csunplugged.org/ ] But all this is very fringe. In practice 99% of people studying algorithms (and all the rest above) do it in the context of computer (science) 2. Most typical example would be mathematics which predates CS by some millennia 3. is all the zillion things needed to live in civilized society - Which side to drive the car on - how to read a clock - how to turn on the lights (but not stick your finger in the plug) So coming to the point: Its not whether Einstein or Mencken? is right but rather that Mencken applies to 1 whereas Einstein applies to 3 And (IMHO) text should be squarely classed in 3 not 1 The gmas of this world have made shopping lists, written (and taught to write) letters [my gpa wrote books] long before CS and before any of us existed. And if suddenly text has moved from being obvious to anyone to something arcane involving - codepoints (which are abstract and platonic) - (?) glyphs - (that fit into) octets (whatever that may be except they are not bytes) - And all other manner of Unicode-gobbledygook Something somewhere is wrong ? 
The Mencken quote is as much off as the Einstein one: https://en.wikiquote.org/wiki/H._L._Mencken From random832 at fastmail.com Thu Jul 20 12:10:14 2017 From: random832 at fastmail.com (Random832) Date: Thu, 20 Jul 2017 12:10:14 -0400 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <59703c67$0$2878$c3e8da3$76491128@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <59703c67$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: <1500567014.3241785.1047263104.10ADD0D3@webmail.messagingengine.com> On Thu, Jul 20, 2017, at 01:15, Steven D'Aprano wrote: > I haven't really been paying attention to Marko's suggestion in detail, > but if we're talking about a whole new data type, how about a list of > nodes, where each node's data is a decomposed string object guaranteed to > be either: How about each node but the last has a fixed "length" (say, 16 characters), and random access below that size is done by indexing to the node level and then walking forward. I've thought about this in the past for encoding strings in UTF-8 with O(1) random code point access. 
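One way to picture the fixed-size-block idea is a table of byte offsets over the UTF-8 data, one entry per 16 code points, with a short forward walk inside the block. The following is only a sketch under that reading: the class name IndexedUTF8 and the constant CHUNK are invented here, and it indexes code points, not grapheme clusters.

```python
CHUNK = 16  # code points per block, the figure suggested above


class IndexedUTF8:
    """UTF-8 bytes plus the byte offset of every CHUNK-th code point."""

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self.offsets = []
        count = 0
        for pos, byte in enumerate(self.data):
            if byte & 0xC0 != 0x80:      # not a continuation byte: a code point starts here
                if count % CHUNK == 0:
                    self.offsets.append(pos)
                count += 1
        self.length = count

    def __len__(self):
        return self.length

    def _next(self, pos):
        """Return the byte position of the code point following the one at pos."""
        pos += 1
        while pos < len(self.data) and self.data[pos] & 0xC0 == 0x80:
            pos += 1
        return pos

    def __getitem__(self, index):
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("code point index out of range")
        pos = self.offsets[index // CHUNK]   # jump straight to the block
        for _ in range(index % CHUNK):       # then walk at most CHUNK - 1 code points
            pos = self._next(pos)
        return self.data[pos:self._next(pos)].decode("utf-8")
```

The walk inside a block is bounded by CHUNK, so lookup cost does not grow with the length of the string; the price is the offset table and the O(n) pass needed to build it.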
From rhodri at kynesim.co.uk Thu Jul 20 12:46:46 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 20 Jul 2017 17:46:46 +0100 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <596ecc4d$0$1595$c3e8da3$5496439d@news.astraweb.com> <52c2a26e-e398-4a5a-83d8-828e556eba7f@googlegroups.com> Message-ID: On 20/07/17 16:18, Rustom Mody wrote: > So coming to the point: > Its not whether Einstein or Mencken? is right but rather that Mencken applies to > 1 whereas Einstein applies to 3 > > And (IMHO) text should be squarely classed in 3 not 1 > > The gmas of this world have made shopping lists, written (and taught to write) > letters [my gpa wrote books] long before CS and before any of us existed. > > And if suddenly text has moved from being obvious to anyone to something arcane > involving > - codepoints (which are abstract and platonic) > - (?) glyphs > - (that fit into) octets (whatever that may be except they are not bytes) > - And all other manner of Unicode-gobbledygook > Something somewhere is wrong The something that is wrong is a failure to consider the necessary _depth_ of knowledge. The shallow (read: obvious and intuitive) definition of text works just fine in the context of grandma's shopping list or granddad's book, localised environments with heavily circumscribed usage patterns. It breaks down in the global environments we've been talking about in much the same way that the obvious and intuitive definition of numbers breaks down when you start considering infinities, or Newtonian mechanics breaks down near the speed of light, or pretty much everything intuitive breaks down at quantum scales. 
-- Rhodri James *-* Kynesim Ltd From rosuav at gmail.com Thu Jul 20 13:59:16 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 03:59:16 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: References: <87vamshl3u.fsf@elektro.pacujo.net> <80c56f32-afef-4091-9ec5-35573c45e9a7@googlegroups.com> <10ef3350-1ace-4239-bf71-54f4ef481491@googlegroups.com> <596c0bd9$0$1616$c3e8da3$5496439d@news.astraweb.com> <98121b10-c562-4a0b-8966-6196236d89c4@googlegroups.com> <46df1ab4-0303-47f4-962a-cc25a50b3a41@googlegroups.com> <596ecc4d$0$1595$c3e8da3$5496439d@news.astraweb.com> <52c2a26e-e398-4a5a-83d8-828e556eba7f@googlegroups.com> Message-ID: On Fri, Jul 21, 2017 at 2:46 AM, Rhodri James wrote: > On 20/07/17 16:18, Rustom Mody wrote: >> >> So coming to the point: >> Its not whether Einstein or Mencken? is right but rather that Mencken >> applies to >> 1 whereas Einstein applies to 3 >> >> And (IMHO) text should be squarely classed in 3 not 1 >> >> The gmas of this world have made shopping lists, written (and taught to >> write) >> letters [my gpa wrote books] long before CS and before any of us existed. >> >> And if suddenly text has moved from being obvious to anyone to something >> arcane >> involving >> - codepoints (which are abstract and platonic) >> - (?) glyphs >> - (that fit into) octets (whatever that may be except they are not bytes) >> - And all other manner of Unicode-gobbledygook >> Something somewhere is wrong > > > The something that is wrong is a failure to consider the necessary _depth_ > of knowledge. The shallow (read: obvious and intuitive) definition of text > works just fine in the context of grandma's shopping list or granddad's > book, localised environments with heavily circumscribed usage patterns. 
It > breaks down in the global environments we've been talking about in much the > same way that the obvious and intuitive definition of numbers breaks down > when you start considering infinities, or Newtonian mechanics breaks down > near the speed of light, or pretty much everything intuitive breaks down at > quantum scales. ALL of the problems in this thread can be explained to a cat. https://xkcd.com/722/ I wouldn't ask the cat's opinion on the definition of a character, though. ChrisA From rosuav at gmail.com Thu Jul 20 14:02:09 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 04:02:09 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <1500567014.3241785.1047263104.10ADD0D3@webmail.messagingengine.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <59703c67$0$2878$c3e8da3$76491128@news.astraweb.com> <1500567014.3241785.1047263104.10ADD0D3@webmail.messagingengine.com> Message-ID: On Fri, Jul 21, 2017 at 2:10 AM, Random832 wrote: > On Thu, Jul 20, 2017, at 01:15, Steven D'Aprano wrote: >> I haven't really been paying attention to Marko's suggestion in detail, >> but if we're talking about a whole new data type, how about a list of >> nodes, where each node's data is a decomposed string object guaranteed to >> be either: > > How about each node but the last has a fixed "length" (say, 16 > characters), and random access below that size is done by indexing to > the node level and then walking forward. > > I've thought about this in the past for encoding strings in UTF-8 with > O(1) random code point access. You would have to benchmark it thoroughly. 
Don't forget that allocating a large string would now require a number of memory allocations (which are slow), and that cache locality is a huge source of hidden performance variation. Big O is far from the whole story. "But it's happening in constant time!" is meaningless if the constant is too high. ChrisA From marko at pacujo.net Thu Jul 20 14:05:05 2017 From: marko at pacujo.net (Marko Rauhamaa) Date: Thu, 20 Jul 2017 21:05:05 +0300 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87vamnc7ce.fsf@elektro.pacujo.net> Chris Angelico : > Actually, the implementation I detailed was far SIMPLER than I thought > it would be; I started writing that post trying to prove that it was > impossible, but it turns out it isn't actually impossible. Just highly > impractical. The existing str implementation could be tweaked to accommodate the "super code points" I proposed: Add a pointer field to CPython's UCS-4 string variant. Behind the pointer is an array of 64-bit pointers. If any string code point is 1114112 or greater, subtract 1114112 from it to get an index into the pointer array. If the pointer at the index is odd, cast it into uint64_t and shift right by one bit to get the super code point. Such a packed super code point can hold 3 full code points (3 * 21 bits). If the pointer at the index is an even number, it is a reference to a bigint value representing the super code point. 
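A small Python model of the scheme above may make the bit layout easier to follow. It is a sketch only: the names BASE, pack, unpack, encode and decode_unit are invented for the illustration, nothing here is CPython code, and the "even pointer to a bigint" case is modelled by simply storing the cluster string in the table. It also assumes a packed cluster never ends in U+0000, so an all-zero upper field can mean "unused".

```python
BASE = 0x110000          # 1114112, the first value past the Unicode range
BITS = 21                # bits needed for one code point
MASK = (1 << BITS) - 1


def pack(cluster):
    """Pack up to 3 code points into a 64-bit word whose low (tag) bit is 1."""
    if not 1 <= len(cluster) <= 3:
        raise ValueError("only 1 to 3 code points fit in 63 bits")
    value = 0
    for i, ch in enumerate(cluster):
        value |= ord(ch) << (BITS * i)   # first code point in the lowest field
    return (value << 1) | 1


def unpack(word):
    """Inverse of pack(): drop the tag bit and read the 21-bit fields."""
    value = word >> 1
    chars = [chr(value & MASK)]
    value >>= BITS
    while value:
        chars.append(chr(value & MASK))
        value >>= BITS
    return "".join(chars)


def encode(clusters):
    """Return (units, table): one UCS-4-sized unit per cluster, plus the side table."""
    units, table, seen = [], [], {}
    for cluster in clusters:
        if len(cluster) == 1:
            units.append(ord(cluster))   # ordinary code point, stored directly
            continue
        if cluster not in seen:
            seen[cluster] = len(table)
            # Longer clusters stand in for the "reference to a bigint" case.
            table.append(pack(cluster) if len(cluster) <= 3 else cluster)
        units.append(BASE + seen[cluster])
    return units, table


def decode_unit(unit, table):
    """Turn one stored unit back into the text of its cluster."""
    if unit < BASE:
        return chr(unit)
    entry = table[unit - BASE]
    return unpack(entry) if isinstance(entry, int) else entry
```

In this model every cluster occupies exactly one unit, so indexing by "super code point" stays O(1); the extra cost sits in the side table and in the decoding step for units at or above BASE.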
Marko From hongzeliu at berkeley.edu Thu Jul 20 15:19:26 2017 From: hongzeliu at berkeley.edu (Hongze Liu) Date: Thu, 20 Jul 2017 12:19:26 -0700 Subject: Problem Message-ID: Hello Python, I encountered this problem: File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\runpy.py", line 85, in _run_code exec(code, run_globals) File "ok\__main__.py", line 46, in File "ok\client\cli\ok.py", line 201, in main File "ok\client\protocols\rate_limit.py", line 41, in run File "ok\client\utils\storage.py", line 28, in get File "ok\client\utils\storage.py", line 18, in contains File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\shelve.py", line 243, in open return DbfilenameShelf(filename, flag, protocol, writeback) File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\shelve.py", line 227, in __init__ Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback) File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\__init__.py", line 94, in open return mod.open(file, flag, mode) File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", line 324, in open return _Database(file, mode, flag=flag) File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", line 71, in __init__ self._update() File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", line 106, in _update key, pos_and_siz_pair = _ast.literal_eval(line) File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\ast.py", line 48, in literal_eval node_or_string = parse(node_or_string, mode='eval') File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\ast.py", line 35, in parse return compile(source, filename, mode, PyCF_ONLY_AST) ValueError: source code string cannot contain null bytes I attempted to reinstall python and used "Repair" option, however, the 
issue still exists. Thanks From ikorot01 at gmail.com Thu Jul 20 15:35:45 2017 From: ikorot01 at gmail.com (Igor Korot) Date: Thu, 20 Jul 2017 15:35:45 -0400 Subject: Problem In-Reply-To: References: Message-ID: Hi, Can you post some code? Thank you. On Thu, Jul 20, 2017 at 3:19 PM, Hongze Liu wrote: > Hello Python, > > I encountered this problem: > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\runpy.py", > line 193, in _run_module_as_main > "__main__", mod_spec) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\runpy.py", > line 85, in _run_code > exec(code, run_globals) > File "ok\__main__.py", line 46, in > File "ok\client\cli\ok.py", line 201, in main > File "ok\client\protocols\rate_limit.py", line 41, in run > File "ok\client\utils\storage.py", line 28, in get > File "ok\client\utils\storage.py", line 18, in contains > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\shelve.py", > line 243, in open > return DbfilenameShelf(filename, flag, protocol, writeback) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\shelve.py", > line 227, in __init__ > Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\__init__.py", > line 94, in open > return mod.open(file, flag, mode) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", > line 324, in open > return _Database(file, mode, flag=flag) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", > line 71, in __init__ > self._update() > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", > line 106, in _update > key, pos_and_siz_pair = _ast.literal_eval(line) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\ast.py", > line 48, in literal_eval > node_or_string = parse(node_or_string, mode='eval') > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\ast.py", > 
line 35, in parse > return compile(source, filename, mode, PyCF_ONLY_AST) > ValueError: source code string cannot contain null bytes > > > I attempted to reinstall python and used "Repair" option, however, the > issue still exists. > > Thanks > -- > https://mail.python.org/mailman/listinfo/python-list From lists at andros.org.uk Thu Jul 20 15:44:40 2017 From: lists at andros.org.uk (Andrew McLean) Date: Thu, 20 Jul 2017 20:44:40 +0100 Subject: Nesting concurrent.futures.ThreadPoolExecutor Message-ID: <86e6267e-e3e7-15b4-60aa-7ff010727606@andros.org.uk> I have a program where I am currently using a concurrent.futures.ThreadPoolExecutor to run multiple tasks concurrently. These tasks are typically I/O bound, involving access to local databases and remote REST APIs. However, these tasks could themselves be split into subtasks, which would also benefit from concurrency. What I am hoping is that it is safe to use a concurrent.futures.ThreadPoolExecutor within the tasks. I have coded up a toy example, which seems to work. However, I'd like some confidence that this is intentional. Concurrency is notoriously tricky. I very much hope this is safe, because otherwise it would not be safe to use a ThreadPoolExecutor to execute arbitrary code, in case it also used concurrent.futures to exploit concurrency. 
Here is the toy example:

    import concurrent.futures

    def inner(i, j):
        return i, j, i**j

    def outer(i):
        with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
            futures = {executor.submit(inner, i, j): j for j in range(5)}
            results = []
            for future in concurrent.futures.as_completed(futures):
                results.append(future.result())
        return results

    def main():
        with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
            futures = {executor.submit(outer, i): i for i in range(10)}
            results = []
            for future in concurrent.futures.as_completed(futures):
                results.extend(future.result())
        print(results)

    if __name__ == "__main__":
        main()

I have previously posted this on Stack Overflow, but didn't get any replies. Apologies if you are seeing this twice.

https://stackoverflow.com/questions/44989473/nesting-concurrent-futures-threadpoolexecutor

From tjreedy at udel.edu Thu Jul 20 16:32:29 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 20 Jul 2017 16:32:29 -0400 Subject: Problem In-Reply-To: References: Message-ID: On 7/20/2017 3:19 PM, Hongze Liu wrote: > Hello Python, > > I encountered this problem: > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\runpy.py", > line 193, in _run_module_as_main > "__main__", mod_spec) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\runpy.py", > line 85, in _run_code > exec(code, run_globals) > File "ok\__main__.py", line 46, in > File "ok\client\cli\ok.py", line 201, in main > File "ok\client\protocols\rate_limit.py", line 41, in run > File "ok\client\utils\storage.py", line 28, in get > File "ok\client\utils\storage.py", line 18, in contains > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\shelve.py", > line 243, in open > return DbfilenameShelf(filename, flag, protocol, writeback) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\shelve.py", > line 227, in __init__ > Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback) > File
"C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\__init__.py", > line 94, in open > return mod.open(file, flag, mode) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", > line 324, in open > return _Database(file, mode, flag=flag) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", > line 71, in __init__ > self._update() > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\dbm\dumb.py", > line 106, in _update > key, pos_and_siz_pair = _ast.literal_eval(line) > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\ast.py", > line 48, in literal_eval > node_or_string = parse(node_or_string, mode='eval') > File "C:\Users\Hongze\AppData\Local\Programs\python\Python36\lib\ast.py", > line 35, in parse > return compile(source, filename, mode, PyCF_ONLY_AST) > ValueError: source code string cannot contain null bytes > > > I attempted to reinstall python and used "Repair" option, however, the > issue still exists. Because the problem is with your code. Somewhere in one of the files in your ok package passes a string or bytes containing \x00 to something that results in an attempt to compile the string. Start with storage.py, contains(), line 18 and see what is being passed. If needed work back up until you find the source of what is being passed. -- Terry Jan Reedy From rosuav at gmail.com Thu Jul 20 16:39:49 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 06:39:49 +1000 Subject: Problem In-Reply-To: References: Message-ID: On Fri, Jul 21, 2017 at 6:32 AM, Terry Reedy wrote: > Because the problem is with your code. Somewhere in one of the files in > your ok package passes a string or bytes containing \x00 to something that > results in an attempt to compile the string. Start with storage.py, > contains(), line 18 and see what is being passed. If needed work back up > until you find the source of what is being passed. 
> shelve is involved, and it seems to be trying to exec something. It could be a corrupted state file? ChrisA From rgaddi at highlandtechnology.invalid Thu Jul 20 18:43:51 2017 From: rgaddi at highlandtechnology.invalid (Rob Gaddi) Date: Thu, 20 Jul 2017 15:43:51 -0700 Subject: Nesting concurrent.futures.ThreadPoolExecutor In-Reply-To: References: <86e6267e-e3e7-15b4-60aa-7ff010727606@andros.org.uk> Message-ID: On 07/20/2017 12:44 PM, Andrew McLean wrote: > I have a program where I am currently using a > concurrent.futures.ThreadPoolExecutor to run multiple tasks > concurrently. These tasks are typically I/O bound, involving access to > local databases and remote REST APIs. However, these tasks could > themselves be split into subtasks, which would also benefit from > concurrency. > > What I am hoping is that it is safe to use a > concurrent.futures.ThreadPoolExecutor within the tasks. I have coded up > a toy example, which seems to work. However, I'd like some confidence > that this is intentional. Concurrency is notoriously tricky. > > I very much hope this is safe, because otherwise it would not be safe to > use a ThreadPoolExecutor to execute arbitrary code, in case it also used > concurrent.futures to exploit concurrency. > Well that last statement is clearly false. It's not safe to use any multiple access mechanism (threading, processes, async stuff) to execute arbitrary code; so it's by definition not safe to use a ThreadPoolExecutor. I'm not being cute and using "safe" in some Turing sense, I'm talking specifically about multiple accesses. Whenever you have multiple access you have interlock issues, which you resolve with mutexes or message queues or however you decide to do so. Those issues don't go away when you use the ThreadPoolExecutor. 
There is every possibility, especially if you start recursively spawning threads, that A spawns B, A blocks on something that B is supposed to do (such as completing a Future), but due to the thread limit of the pool, the mere existence of A is preventing B from being executed, and you have a deadlock. -- Rob Gaddi, Highland Technology -- www.highlandtechnology.com Email address domain is currently out of order. See above to fix. From guido at python.org Thu Jul 20 19:30:25 2017 From: guido at python.org (Guido van Rossum) Date: Thu, 20 Jul 2017 16:30:25 -0700 Subject: Your feedback on our free Advanced Python tutorial In-Reply-To: References: Message-ID: Hi Aude, Unfortunately I don't have time to review or endorse your Python materials. Good luck, --Guido On Wed, Jul 19, 2017 at 9:44 AM, Aude Barral, CodinGame wrote: > Hi everyone, > > I am co-founder of a startup called CodinGame. > > A few days ago we've launched a project: Tech.io . It's > a > free knowledge-sharing platform that allows tech professionals to learn new > programming concepts through hands-on content crafted by volunteers in the > community. > > Everything runs on our backend. Our system relies on Docker images so we > can play tutorials and demos of virtually any technology from the browser. > > So why this project? Because as more and more resources over the Internet > now need to be paid for (Udacity, Udemy, etc), we want to foster free > online technology education thanks to peer learning. In a sense, we'd like > to become some kind of Wikipedia for tech. > > One of the first tutorials our contributors published is about Advanced > Python Features (hope more will be published soon!): https://tech.io/play > grounds/500/advanced-python-features/advanced-python-features > > I have 2 questions for you: > - Do you think this tutorial could be helpful to your Python user > community? > - Would you be willing to help us spread the word about Tech.io? > > Thanks a lot for checking, > > Cheers! 
> Aude > > > *Aude BARRAL*, Co-Founder > +33 674 632 708 > > [image: https://fr.linkedin.com/in/audebarral] > > > > > -- > https://mail.python.org/mailman/listinfo/python-announce-list > > Support the Python Software Foundation: > http://www.python.org/psf/donations/ > -- --Guido van Rossum (python.org/~guido) From timothy.c.delaney at gmail.com Thu Jul 20 20:06:51 2017 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 21 Jul 2017 10:06:51 +1000 Subject: scandir slower than listdir In-Reply-To: References: <87d18vu0z4.fsf@wilson.bronger.org> Message-ID: On 20 July 2017 at 21:43, Skip Montanaro wrote: > scandir returns an iterator of DirEntry objects which contain more > > information than the mere name. > > > > As I recall, the motivation for scandir was to avoid subsequent system > calls, so it will be slower than listdir the way you've tested it. If you > add in the cost of fetching the other bits Terry mentioned, I suspect your > relative timing will change. > In addition, listdir() returns a list of names, so building a new list from that is fairly fast (can use a single allocation of the correct size). scandir() returns an iterator, so building a list from that may require multiple reallocations (depending on the number of entries in the directory), which could skew the test results. In neither case is building a list from the result the way you would normally use it. A more accurate test of the way both functions would normally be used would be to iterate over the results instead of eagerly building a list. In this test you would also expect scandir() to use less memory for a large directory. Tim Delaney From noah at neo.co.tz Thu Jul 20 23:07:11 2017 From: noah at neo.co.tz (Noah) Date: Fri, 21 Jul 2017 06:07:11 +0300 Subject: Your feedback on our free Advanced Python tutorial In-Reply-To: References: Message-ID: On 20 Jul 2017 3:03 p.m., "Aude Barral, CodinGame" wrote: Hi everyone, I am co-founder of a startup called CodinGame. 
A few days ago we've launched a project: Tech.io . It's a free knowledge-sharing platform that allows tech professionals to learn new programming concepts through hands-on content crafted by volunteers in the community. Everything runs on our backend. Our system relies on Docker images so we can play tutorials and demos of virtually any technology from the browser. So why this project? Because as more and more resources over the Internet now need to be paid for (Udacity, Udemy, etc), we want to foster free online technology education thanks to peer learning. In a sense, we'd like to become some kind of Wikipedia for tech. One of the first tutorials our contributors published is about Advanced Python Features (hope more will be published soon!): https://tech.io/play grounds/500/advanced-python-features/advanced-python-features Perfect and thanks for taking your time folks to put this together out there for free. I have 2 questions for you: - Do you think this tutorial could be helpful to your Python user community? 100% Yes - Would you be willing to help us spread the word about Tech.io? Yes yes yes... Thanks a lot for checking, Cheers! Aude Cheers, Noah ----- Evolve or Extinct. Enable IPv6 now? From steve+python at pearwood.info Thu Jul 20 23:17:46 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 21 Jul 2017 13:17:46 +1000 Subject: Problem References: Message-ID: <5971725b$0$1607$c3e8da3$5496439d@news.astraweb.com> On Fri, 21 Jul 2017 05:19 am, Hongze Liu wrote: > Hello Python, > > I encountered this problem: Can you tell us *how* you encountered this problem? How were you invoking Python. It makes a BIG difference if you are invoking it by just launching the Python interactive interpreter with no extra code involved: python -E -S or if you are running your own code. It does sound like your shelve database file has been corrupted. -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
From steve+python at pearwood.info Thu Jul 20 23:20:16 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 21 Jul 2017 13:20:16 +1000
Subject: Grapheme clusters, a.k.a.real characters
References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <87vamnc7ce.fsf@elektro.pacujo.net>
Message-ID: <597172f1$0$1607$c3e8da3$5496439d@news.astraweb.com>

On Fri, 21 Jul 2017 04:05 am, Marko Rauhamaa wrote:

> If any string code point is 1114112 or greater

By definition, no Unicode code point can ever have an ordinal value greater than
0x10FFFF = 1114111.

So I don't know what you're talking about, but it isn't Unicode. If you want to
invent your own text standard to compete with Unicode, well, good luck getting
people behind it. Especially since it seems to offer nothing but a vast
increase in complexity and memory usage for no apparent benefit that I can see.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

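The upper bound quoted in this message can be checked from Python itself. This is a small sketch added for illustration; `is_valid_code_point` is a hypothetical helper name, not something from the thread.

```python
import sys

def is_valid_code_point(n):
    """Return True if chr() accepts n, i.e. 0 <= n <= 0x10FFFF."""
    try:
        chr(n)
    except ValueError:
        return False
    return True

# On Python 3.3 and later sys.maxunicode is always 0x10FFFF (1114111).
print(sys.maxunicode == 0x10FFFF == 1114111)
print(is_valid_code_point(0x10FFFF))   # the last code point
print(is_valid_code_point(0x110000))   # 1114112, one past the end
```

Any scheme that stores values of 1114112 or more in a string is therefore using its own private numbering on top of Unicode, which is what the follow-ups discuss.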
From rosuav at gmail.com Thu Jul 20 23:43:39 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 13:43:39 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <597172f1$0$1607$c3e8da3$5496439d@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <596f00e9$0$2878$c3e8da3$76491128@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <87vamnc7ce.fsf@elektro.pacujo.net> <597172f1$0$1607$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Fri, Jul 21, 2017 at 1:20 PM, Steve D'Aprano wrote: > On Fri, 21 Jul 2017 04:05 am, Marko Rauhamaa wrote: > >> If any string code point is 1114112 or greater > > By definition, no Unicode code point can ever have an ordinal value greater than > 0x10FFFF = 1114111. > > So I don't know what you're talking about, but it isn't Unicode. If you want to > invent your own text standard to compete with Unicode, well, good luck getting > people behind it. Especially since it seems to offer nothing but a vast > increase in complexity and memory usage for no apparent benefit that I can see. He's talking specifically about an in-memory representation. The term "code point" is inaccurate; I think "code unit" is more accurate, but normally a code unit represents *at most* one code point, and potentially less (UTF-16 and astral chars) - this is using it to represent *more* than one code point. So maybe it needs a different word. Nonetheless, this is a reasonably internally-consistent way to represent textual data. It's a three-tier system: * The string is stored as a series of 32-bit integers, where each integer is either a UTF-32 code unit, or 1114112+n for some value n. * A secondary array of 64-bit integers stores either odd integers consisting of three 21-bit numbers packed together and then the low bit set as a flag, or even integers representing pointers. 
* The heap is tertiary memory for those combined characters consisting of more than three code points (base character plus more than two combining characters). And while this is extremely complicated, it does at least push most of the complexity to the unusual cases. A pure ASCII string is still able to be represented exactly the same way Python 3.3+ does; all it needs is a spare patch of memory representing an empty array of integers. (Smarter people than I may be able to store this array in zero bytes without getting things confused. I'm not sure.) Strings with all code points on the BMP and no combining characters are still able to be represented as they are today, again with the empty secondary array. The presence of a single combining character in the string does force it to be stored 32 bits per character, so there can be a price to pay. Similarly, the secondary array will only VERY rarely need to contain any pointers; most combined characters consist of a base and one combining, or a set of three characters at most. There'll be dramatic performance costs for strings where piles of combining characters get loaded on top of a single base, but at least they can be accurately represented. However, there's still one major MAJOR problem. The semantics of string handling depend on having a proper table of Unicode character types (or at least a listing of every combining character). As the Unicode standard is augmented, the number of combining characters can increase, and I don't think the Consortium has pre-allocated space saying "this is where combining characters will be placed". So what should be done if there's an unknown code point in a string? Should an exception be raised? Should it assume that it's a base character? Either way, you have versioning problems, and big ones. As such, this could be extremely useful as a tool for textual analysis, but it should NOT, in my opinion, be the internal representation of a string. 
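The secondary-array entries described earlier in this message (three 21-bit numbers plus a low flag bit in one 64-bit integer) could be sketched roughly as follows. The bit order, the use of 0 for an unused slot, and the names `pack3` and `unpack3` are assumptions made for this illustration only; the thread does not specify them.

```python
MASK21 = (1 << 21) - 1  # 0x1FFFFF, wide enough for any code point up to 0x10FFFF

def pack3(a, b=0, c=0):
    """Pack up to three code points into one odd integer below 2**64."""
    for cp in (a, b, c):
        if not 0 <= cp <= 0x10FFFF:
            raise ValueError("not a Unicode code point: %r" % (cp,))
    # 3 * 21 = 63 payload bits, shifted left once to make room for the flag bit
    return (((a << 42) | (b << 21) | c) << 1) | 1

def unpack3(word):
    """Inverse of pack3; in this scheme an even word would be a pointer."""
    if word & 1 == 0:
        raise ValueError("even value: a pointer entry, not packed code points")
    bits = word >> 1
    return (bits >> 42) & MASK21, (bits >> 21) & MASK21, bits & MASK21

# "e" followed by COMBINING ACUTE ACCENT, third slot unused
word = pack3(0x65, 0x301)
print(hex(word), unpack3(word))
```

The low bit distinguishes packed entries from pointers only if pointers are always even, which holds for suitably aligned allocations.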
ChrisA From dieter at handshake.de Fri Jul 21 02:16:44 2017 From: dieter at handshake.de (dieter) Date: Fri, 21 Jul 2017 08:16:44 +0200 Subject: SIGSEGV and SIGILL inside PyCFunction_Call References: <20170719173528.4ecc3f78@wegge.dk> <87shhrmzlx.fsf@handshake.de> <20170720140323.34d1f000@wegge.dk> Message-ID: <87eftauxf7.fsf@handshake.de> Anders Wegge Keller writes: > ... > The trouble with that is that nnrpd is a system daemon, and as such is a > bit difficult to trace in place. That's why I am asking for help Often, you can run a (typical) daemon in non-daemon mode -- specifically to debug problems (a usual requirement during development). I also often used "attach" to attach a debugger to a running daemon. Of course, I see the difficulties to attach to a freshly forked daemon child which crashed quitely after being made active. From steve+python at pearwood.info Fri Jul 21 02:34:24 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 21 Jul 2017 16:34:24 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <87vamnc7ce.fsf@elektro.pacujo.net> <597172f1$0$1607$c3e8da3$5496439d@news.astraweb.com> Message-ID: <5971a073$0$1603$c3e8da3$5496439d@news.astraweb.com> On Fri, 21 Jul 2017 01:43 pm, Chris Angelico wrote: > Strings with all code > points on the BMP and no combining characters are still able to be > represented as they are today, again with the empty secondary array. I presume that since the problem we're trying to solve here is that certain characters have two representations, this format will automatically decompose strings. Otherwise, it doesn't really solve the problems with diacritics, where a single human-readable character like ? or ? has two distinct, and non-equal, representations. 
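The two non-equal representations mentioned here can be reproduced with the standard unicodedata module. The example characters in the message were lost in the archive, so the letter used below (e with acute accent) is just one convenient choice for illustration.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(composed == decomposed)            # compared code point by code point
print(len(composed), len(decomposed))
# Normalising both sides to the same form makes them comparable.
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)
```

NFC composes where a precomposed code point exists and NFD decomposes, so a string type that always decomposes would store the two-code-point form for every such letter.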
But if it does, then every string with a diacritic (i.e. most Western European text, if not Eastern European as well) will need combining characters. If this *doesn't* decompose the strings, then what problem is it actually solving? > The presence of a single combining character in the string does force > it to be stored 32 bits per character, so there can be a price to pay. Right -- so it's really compact for Americans, and blows out for just about everyone else. > Similarly, the secondary array will only VERY rarely need to contain > any pointers; most combined characters consist of a base and one > combining, or a set of three characters at most. I don't know if you can make that claim for non-West European languages. I don't know enough about (for example) Slavic languages, or Thai, or Arabic, or Chinese, to know whether (base + three combining characters) will be rare or not. But emoji sequences will often require four code points, three of which will be in the supplementary planes. http://unicode.org/emoji/charts/emoji-zwj-sequences.html > There'll be dramatic > performance costs for strings where piles of combining characters get > loaded on top of a single base, but at least they can be accurately > represented. They can be accurately represented right now. E.g. there is nothing ambiguous or inaccurate about U+1F469 U+1F3FD U+200D U+1F52C, "woman scientist with medium skin tone". -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. 
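The emoji sequence named in this message can be inspected as an ordinary Python 3 string. This is a small illustrative sketch; the variable names are arbitrary.

```python
import unicodedata

# U+1F469 U+1F3FD U+200D U+1F52C, the sequence quoted in the message
s = "\U0001F469\U0001F3FD\u200D\U0001F52C"

code_points = len(s)                                   # one per code point
supplementary = sum(1 for ch in s if ord(ch) > 0xFFFF)  # outside the BMP

print(code_points, supplementary)
for ch in s:
    # The default guards against characters unknown to this Python's Unicode tables.
    print("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
```

The string has four code points, three of them in the supplementary planes; only the zero width joiner U+200D lies on the BMP.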
From rosuav at gmail.com Fri Jul 21 04:05:18 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 21 Jul 2017 18:05:18 +1000 Subject: Grapheme clusters, a.k.a.real characters In-Reply-To: <5971a073$0$1603$c3e8da3$5496439d@news.astraweb.com> References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87379sbvn9.fsf@elektro.pacujo.net> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <87vamnc7ce.fsf@elektro.pacujo.net> <597172f1$0$1607$c3e8da3$5496439d@news.astraweb.com> <5971a073$0$1603$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Fri, Jul 21, 2017 at 4:34 PM, Steve D'Aprano wrote: > On Fri, 21 Jul 2017 01:43 pm, Chris Angelico wrote: > >> Strings with all code >> points on the BMP and no combining characters are still able to be >> represented as they are today, again with the empty secondary array. > > I presume that since the problem we're trying to solve here is that certain > characters have two representations, this format will automatically decompose > strings. Otherwise, it doesn't really solve the problems with diacritics, where > a single human-readable character like ? or ? has two distinct, and non-equal, > representations. > > But if it does, then every string with a diacritic (i.e. most Western European > text, if not Eastern European as well) will need combining characters. > > If this *doesn't* decompose the strings, then what problem is it actually > solving? I'm honestly not sure, though I had been assuming that it was capable of representing composed OR decomposed strings. If it does decompose everything, then yeah, a lot more will need the secondary array. >> Similarly, the secondary array will only VERY rarely need to contain >> any pointers; most combined characters consist of a base and one >> combining, or a set of three characters at most. > > I don't know if you can make that claim for non-West European languages. 
I don't > know enough about (for example) Slavic languages, or Thai, or Arabic, or > Chinese, to know whether (base + three combining characters) will be rare or > not. Not sure, but what I usually see is that one Chinese character gets one Unicode codepoint. But again, forcible decomposition may change this. > But emoji sequences will often require four code points, three of which will be > in the supplementary planes. > > http://unicode.org/emoji/charts/emoji-zwj-sequences.html "Often"? I doubt that; a lot of emoji don't require that many. >> There'll be dramatic >> performance costs for strings where piles of combining characters get >> loaded on top of a single base, but at least they can be accurately >> represented. > > They can be accurately represented right now. E.g. there is nothing ambiguous or > inaccurate about U+1F469 U+1F3FD U+200D U+1F52C, "woman scientist with medium > skin tone". I may have elided a bit too much here. Let's start with a simpler representation: a string is represented as a tuple of Python integer objects, each of which uses the original scheme. Now, that's able to represent everything, but it's stupidly expensive. The original multi-tiered scheme gives vast improvements for everything other than this case, but at least it doesn't make them unrepresentable (cf UCS-2). ChrisA From antoon.pardon at rece.vub.ac.be Fri Jul 21 04:32:36 2017 From: antoon.pardon at rece.vub.ac.be (Antoon Pardon) Date: Fri, 21 Jul 2017 10:32:36 +0200 Subject: Not cathing nested StopIteration Message-ID: <263a0d26-212d-bfe0-7862-58b24082a2ba@rece.vub.ac.be> This is python 3.4 on a debian box In the code below, line 36 raises a StopIteration, I would have thought that this exception would be caught by line 39 but instead I get a traceback. Did I miss something or is this a bug? 
This is the code:

    try:                                                        # 21
        filename = os.path.join(lang_path, lang)                # 22
        fl = open(filename)                                     # 23
    except FileNotFoundError:                                   # 24
        try:                                                    # 25
            lst = lang.split('_')                               # 26
            prefix = lst[0] + '*'                               # 27
            try:                                                # 28
                lst[1] = lst[0].upper()                         # 29
            except IndexError:                                  # 30
                lst.append(lst[0].upper())                      # 31
            lang = '_'.join(lst)                                # 32
            filename = os.path.join(lang_path, lang)            # 33
            fl = open(filename)                                 # 34
        except FileNotFoundError:                               # 35
            lang = next(iglob(os.path.join(lang_path, prefix))) # 36
            filename = os.path.join(lang_path, lang)            # 37
            fl = open(filename)                                 # 38
    except StopIteration:                                       # 39
        fl = ()                                                 # 40

This is the traceback:

Traceback (most recent call last):
  File "/home/antoon/src/projecten/richter/translate.py", line 23, in use_lang
    fl = open(filename)
FileNotFoundError: [Errno 2] No such file or directory: '/home/antoon/src/projecten/richter/locus/nl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/antoon/src/projecten/richter/translate.py", line 34, in use_lang
    fl = open(filename)
FileNotFoundError: [Errno 2] No such file or directory: '/home/antoon/src/projecten/richter/locus/nl_NL'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lang_test", line 4, in <module>
    use_lang("Nederlands")
  File "/home/antoon/src/projecten/richter/translate.py", line 36, in use_lang
    lang = next(iglob(os.path.join(lang_path, prefix)))
StopIteration

--
Antoon.

From rosuav at gmail.com Fri Jul 21 04:48:12 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 21 Jul 2017 18:48:12 +1000
Subject: Not cathing nested StopIteration
In-Reply-To: <263a0d26-212d-bfe0-7862-58b24082a2ba@rece.vub.ac.be>
References: <263a0d26-212d-bfe0-7862-58b24082a2ba@rece.vub.ac.be>
Message-ID:

On Fri, Jul 21, 2017 at 6:32 PM, Antoon Pardon wrote:
> This is python 3.4 on a debian box
>
> In the code below, line 36 raises a StopIteration, I would have
> thought that this exception would be caught by line 39 but instead
> I get a traceback.
>
> Did I miss something or is this a bug?
>
> This is the code:

(trimmed to highlight execution flow)

>     try:                                                        # 21
>         fl = open(filename)                                     # 23
>     except FileNotFoundError:                                   # 24
>         try:                                                    # 25
>             fl = open(filename)                                 # 34
>         except FileNotFoundError:                               # 35
>             lang = next(iglob(os.path.join(lang_path, prefix))) # 36
>     except StopIteration:                                       # 39
>         fl = ()                                                 # 40

You go into the first "except FileNotFoundError", and then
StopIteration comes from there. The except block starting on line 39
is actually guarding the try on line 21, and won't catch exceptions
that crop up during the handling of line 24's except block.

I would recommend moving the "except StopIteration" *inside* the other
except block, probably with its own try (so it can only catch
exceptions from the next() call).

Or, possibly even better: next() can take a second argument, which is
returned if the iterator is exhausted. So you could replace this:

    lang = next(iglob(os.path.join(lang_path, prefix))) # 36
    filename = os.path.join(lang_path, lang)            # 37
    fl = open(filename)                                 # 38

with this:

    lang = next(iglob(os.path.join(lang_path, prefix)), ())
    fl = lang and open(os.path.join(lang_path, lang))

If no language file is found, next() will return the empty tuple. Then
the next line will call open() only if a non-empty value is returned;
otherwise fl is left as the empty tuple.

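The two-argument form of next() suggested above can be tried in isolation. This is a self-contained sketch, not the original poster's code: it uses a temporary directory, a hypothetical helper `first_match`, and None as the default value.

```python
import os
import tempfile
from glob import iglob

def first_match(directory, prefix):
    """Return the first path matching prefix* in directory, or None if there is none."""
    # The second argument to next() is returned instead of raising StopIteration.
    return next(iglob(os.path.join(directory, prefix + "*")), None)

with tempfile.TemporaryDirectory() as lang_path:
    # Nothing matches yet, so the default comes back.
    missing = first_match(lang_path, "nl")
    with open(os.path.join(lang_path, "nl_BE"), "w") as f:
        f.write("dummy")
    found = first_match(lang_path, "nl")

print(missing)
print(os.path.basename(found))
```

Note that iglob() yields paths that already include the directory part, so the result can be passed to open() directly.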
ChrisA

From __peter__ at web.de Fri Jul 21 04:53:37 2017
From: __peter__ at web.de (Peter Otten)
Date: Fri, 21 Jul 2017 10:53:37 +0200
Subject: Not cathing nested StopIteration
References: <263a0d26-212d-bfe0-7862-58b24082a2ba@rece.vub.ac.be>
Message-ID:

Antoon Pardon wrote:

> This is python 3.4 on a debian box
>
> In the code below, line 36 raises a StopIteration, I would have
> thought that this exception would be caught by line 39 but instead
> I get a traceback.
>
> Did I miss something or is this a bug?

Your code structure is

    try:
        raise FileNotFoundError
    except FileNotFoundError:
        # try ... except FileNotFoundError inside this handler
        # omitted as it is only a distraction
        raise StopIteration
    except StopIteration:
        ...

This will only catch StopIteration-s in the first try block (lines 22 and
23). To handle exceptions in the first except you need another level of
try...except, either

    try:
        try:
            raise FileNotFoundError
        except FileNotFoundError:
            raise StopIteration
    except StopIteration:
        ...

or

    try:
        raise FileNotFoundError
    except FileNotFoundError:
        try:
            raise StopIteration
        except StopIteration:
            ...
        else:
            ...

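The scoping rule described in this reply can be checked with a short runnable sketch. The function names `outer_handler_only` and `nested_handler` are invented for this illustration.

```python
def outer_handler_only():
    """An except clause guards only its own try body, not a sibling handler."""
    try:
        raise FileNotFoundError("first open failed")
    except FileNotFoundError:
        raise StopIteration          # raised inside a handler: not guarded
    except StopIteration:
        return "caught"              # never reached for the raise above

def nested_handler():
    """Giving the handler's body its own try does catch the exception."""
    try:
        raise FileNotFoundError("first open failed")
    except FileNotFoundError:
        try:
            raise StopIteration
        except StopIteration:
            return "caught"

try:
    outer_handler_only()
except StopIteration:
    print("StopIteration escaped outer_handler_only()")
print(nested_handler())
```

The first function lets StopIteration propagate to its caller, which matches the traceback in the original question; the second handles it where it is raised.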
>
> This is the code:
>
>     try:                                                        # 21
>         filename = os.path.join(lang_path, lang)                # 22
>         fl = open(filename)                                     # 23
>     except FileNotFoundError:                                   # 24
>         try:                                                    # 25
>             lst = lang.split('_')                               # 26
>             prefix = lst[0] + '*'                               # 27
>             try:                                                # 28
>                 lst[1] = lst[0].upper()                         # 29
>             except IndexError:                                  # 30
>                 lst.append(lst[0].upper())                      # 31
>             lang = '_'.join(lst)                                # 32
>             filename = os.path.join(lang_path, lang)            # 33
>             fl = open(filename)                                 # 34
>         except FileNotFoundError:                               # 35
>             lang = next(iglob(os.path.join(lang_path, prefix))) # 36
>             filename = os.path.join(lang_path, lang)            # 37
>             fl = open(filename)                                 # 38
>     except StopIteration:                                       # 39
>         fl = ()                                                 # 40
>
> This is the traceback:
>
> Traceback (most recent call last):
>   File "/home/antoon/src/projecten/richter/translate.py", line 23, in use_lang
>     fl = open(filename)
> FileNotFoundError: [Errno 2] No such file or directory: '/home/antoon/src/projecten/richter/locus/nl'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/antoon/src/projecten/richter/translate.py", line 34, in use_lang
>     fl = open(filename)
> FileNotFoundError: [Errno 2] No such file or directory: '/home/antoon/src/projecten/richter/locus/nl_NL'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "lang_test", line 4, in <module>
>     use_lang("Nederlands")
>   File "/home/antoon/src/projecten/richter/translate.py", line 36, in use_lang
>     lang = next(iglob(os.path.join(lang_path, prefix)))
> StopIteration
>

From antoon.pardon at rece.vub.ac.be Fri Jul 21 06:01:40 2017
From: antoon.pardon at rece.vub.ac.be (Antoon Pardon)
Date: Fri, 21 Jul 2017 12:01:40 +0200
Subject: Not cathing nested StopIteration
In-Reply-To:
References: <263a0d26-212d-bfe0-7862-58b24082a2ba@rece.vub.ac.be>
Message-ID: <6253027b-8370-f882-124a-73c2765e3bf9@rece.vub.ac.be>

Thanks Peter and Chris for helping to resolve the knot in my brain.

From steve+python at pearwood.info Fri Jul 21 06:34:05 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 21 Jul 2017 20:34:05 +1000 Subject: Grapheme clusters, a.k.a.real characters References: <596811e0$0$1618$c3e8da3$5496439d@news.astraweb.com> <87y3rkaaky.fsf@elektro.pacujo.net> <87shhsa6gg.fsf@elektro.pacujo.net> <597011a4$0$1593$c3e8da3$5496439d@news.astraweb.com> <87vamnc7ce.fsf@elektro.pacujo.net> <597172f1$0$1607$c3e8da3$5496439d@news.astraweb.com> <5971a073$0$1603$c3e8da3$5496439d@news.astraweb.com> Message-ID: <5971d89f$0$1619$c3e8da3$5496439d@news.astraweb.com> On Fri, 21 Jul 2017 06:05 pm, Chris Angelico wrote: >> But emoji sequences will often require four code points, three of which will >> be in the supplementary planes. >> >> http://unicode.org/emoji/charts/emoji-zwj-sequences.html > > "Often"? I doubt that; a lot of emoji don't require that many. Oh come now. When given the possibility of an emoji that says: "smiling man" versus one that says: "medium-light skinned left-handed man with blond hair and grey eyes who is a Capricorn and likes Chinese food and playing soccer, wearing a brown trilby hat and riding a blue motorcycle and using a computer, smiling slightly ironically with a big grin and a wink" you *know* the kids today will prefer the second. :-P -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From nimbiotics at gmail.com Fri Jul 21 08:15:04 2017 From: nimbiotics at gmail.com (Mario R. Osorio) Date: Fri, 21 Jul 2017 05:15:04 -0700 (PDT) Subject: Your feedback on our free Advanced Python tutorial In-Reply-To: References: Message-ID: It would be nice if you made it more 'readable' the light gray foreground color of the text makes it very uncomfortable to read, at least to me. 
Take a look at: HOW THE WEB BECAME UNREADABLE https://www.wired.com/2016/10/how-the-web-became-unreadable/ From brian at brianlcase.com Fri Jul 21 10:24:16 2017 From: brian at brianlcase.com (Brian Case) Date: Fri, 21 Jul 2017 09:24:16 -0500 Subject: Where is python and idle? In-Reply-To: References: Message-ID: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> I am running windows 10 version 1703 as administrator on a Dell Inspiron 15 laptop. I downloaded and installed python 3.6.2 from https://www.python.org/downloads/ for windows. https://www.programiz.com/python-programming instructs me to open IDLE once that install completed. But I find NEITHER Python nor IDLE anywhere on my machine. I reran the install which gave me options for REPAIR. I ran it, which completed successfully and provided the python-list email address. I still cannot find an executable or a folder for anything beginning with Python. Where should I look besides folders C:\Program Files and C:\Program Files (x86)? Regards, Brian Case From nathan.ernst at gmail.com Fri Jul 21 11:19:41 2017 From: nathan.ernst at gmail.com (Nathan Ernst) Date: Fri, 21 Jul 2017 10:19:41 -0500 Subject: Where is python and idle? In-Reply-To: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> References: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> Message-ID: Check your user folder. For me, on my PC, python is installed at C:\Users\nernst\AppData\Local\Programs\Python Regards, Nate On Fri, Jul 21, 2017 at 9:24 AM, Brian Case wrote: > I am running windows 10 version 1703 as administrator on a Dell Inspiron > 15 laptop. > > I downloaded and installed python 3.6.2 from > https://www.python.org/downloads/ for windows. > > https://www.programiz.com/python-programming instructs me to open IDLE > once that install completed. > > But I find NEITHER Python nor IDLE anywhere on my machine. > > I reran the install which gave me options for REPAIR. 
I ran it, which > completed successfully and provided the python-list email address. > > I still cannot find an executable or a folder for anything beginning with > Python. > > Where should I look besides folders C:\Program Files and C:\Program Files > (x86)? > > Regards, > > Brian Case > -- > https://mail.python.org/mailman/listinfo/python-list > From ikorot01 at gmail.com Fri Jul 21 11:28:40 2017 From: ikorot01 at gmail.com (Igor Korot) Date: Fri, 21 Jul 2017 11:28:40 -0400 Subject: Where is python and idle? In-Reply-To: References: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> Message-ID: Hi, On Fri, Jul 21, 2017 at 11:19 AM, Nathan Ernst wrote: > Check your user folder. For me, on my PC, python is installed > at C:\Users\nernst\AppData\Local\Programs\Python I don't know about python, but usually a good Windows installer ask for the place to install and give some default path there. Thank you. P.S. Also I'd check Windows menu - it better have a shortcuts for opening IDLE. > > Regards, > Nate > > On Fri, Jul 21, 2017 at 9:24 AM, Brian Case wrote: > >> I am running windows 10 version 1703 as administrator on a Dell Inspiron >> 15 laptop. >> >> I downloaded and installed python 3.6.2 from >> https://www.python.org/downloads/ for windows. >> >> https://www.programiz.com/python-programming instructs me to open IDLE >> once that install completed. >> >> But I find NEITHER Python nor IDLE anywhere on my machine. >> >> I reran the install which gave me options for REPAIR. I ran it, which >> completed successfully and provided the python-list email address. >> >> I still cannot find an executable or a folder for anything beginning with >> Python. >> >> Where should I look besides folders C:\Program Files and C:\Program Files >> (x86)? 
>> >> Regards, >> >> Brian Case >> -- >> https://mail.python.org/mailman/listinfo/python-list >> > -- > https://mail.python.org/mailman/listinfo/python-list From brian at brianlcase.com Fri Jul 21 12:20:12 2017 From: brian at brianlcase.com (Brian Case) Date: Fri, 21 Jul 2017 11:20:12 -0500 Subject: Where is python and idle? In-Reply-To: References: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> Message-ID: Thank you, That is where it is. Would not have found it without your help. Now, to find IDLE. rgrds, Brian On 7/21/2017 10:19 AM, Nathan Ernst wrote: > Check your user folder. For me, on my PC, python is installed > at C:\Users\nernst\AppData\Local\Programs\Python > > Regards, > Nate > > On Fri, Jul 21, 2017 at 9:24 AM, Brian Case > wrote: > > I am running windows 10 version 1703 as administrator on a Dell > Inspiron 15 laptop. > > I downloaded and installed python 3.6.2 from > https://www.python.org/downloads/ > for windows. > > https://www.programiz.com/python-programming > instructs me to > open IDLE once that install completed. > > But I find NEITHER Python nor IDLE anywhere on my machine. > > I reran the install which gave me options for REPAIR. I ran it, > which completed successfully and provided the python-list email > address. > > I still cannot find an executable or a folder for anything > beginning with Python. > > Where should I look besides folders C:\Program Files and > C:\Program Files (x86)? > > Regards, > > Brian Case > -- > https://mail.python.org/mailman/listinfo/python-list > > > From rosuav at gmail.com Fri Jul 21 13:07:48 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 22 Jul 2017 03:07:48 +1000 Subject: Where is python and idle? In-Reply-To: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> References: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> Message-ID: On Sat, Jul 22, 2017 at 12:24 AM, Brian Case wrote: > I am running windows 10 version 1703 as administrator on a Dell Inspiron 15 > laptop. 
> > I downloaded and installed python 3.6.2 from > https://www.python.org/downloads/ for windows. > > https://www.programiz.com/python-programming instructs me to open IDLE once > that install completed. > > But I find NEITHER Python nor IDLE anywhere on my machine. > > I reran the install which gave me options for REPAIR. I ran it, which > completed successfully and provided the python-list email address. > > I still cannot find an executable or a folder for anything beginning with > Python. > > Where should I look besides folders C:\Program Files and C:\Program Files > (x86)? Apologies for what's probably a dumb question, but is it in your Start menu? I don't know Win 10 very well - I used Windows plenty back in the XP days, but now I'm all Linux, apart from a bit of testing and VMs and stuff. Normally, after Python's installed, I'd go looking for it in the Start menu, in a folder called "Python X.Y" (in your case, "Python 3.6"). ChrisA From haregot21 at gmail.com Fri Jul 21 13:14:03 2017 From: haregot21 at gmail.com (Tafla Magnaw) Date: Fri, 21 Jul 2017 10:14:03 -0700 (PDT) Subject: Creating an oval inside a pyplot in Tkinter window Message-ID: <249db08f-8e69-4bdc-bd4b-0677d8917a29@googlegroups.com> I spent about a week solving the problem I have in this code but I couldn't do anything. I just give up working due to that line of code. What I need is that: I have a main window with an "oval menu bar" and under the menu-bar, there is a " Add Oval" drop-down menu. when I click the "Add oval" drop-down menu, a pop up window with entry boxes will display.Enter the x & y location (for example 50 &150 and click the send button, the oval should display inside of the left side pyplot. The code below works fine with tkinter window but not working with in the pyplot. How can I create the oval inside the left side of the pyplot? 
I spent a long time trying to figure that out but couldn't get anywhere. The only problem with the code below is in the App(tk.Tk) class. I really need help, please!
Thank you

Here is the code I tried:

from tkinter import *
from tkinter import Tk, Frame, Entry, Button, Listbox,StringVar,Label,Menu,filedialog, END
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
import tkinter as tk
from matplotlib.figure import Figure
import matplotlib.gridspec as gridspec

root = tk.Tk()
RADIUS = 10
root.title("Oval window")
fig = Figure(figsize=(10,10))
gs = gridspec.GridSpec(1, 2)
ax1 = fig.add_subplot(gs[0, 0])
ax1.set_xticks([])
ax1.set_xticklabels([])
ax1.set_yticks([])
ax1.set_yticklabels
gs = gridspec.GridSpec(2, 2)
ax2 = fig.add_subplot(gs[0, 1])
ax2.set_xticks([])
ax2.set_xticklabels([])
ax2.set_yticks([])
ax2.set_yticklabels
gs = gridspec.GridSpec(2, 2)
ax3 = fig.add_subplot(gs[1, 1])
ax3.set_xticks([])
ax3.set_xticklabels([])
ax3.set_yticks([])
ax3.set_yticklabels
canvas = FigureCanvasTkAgg(fig, master=root)
canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)
canvas.get_tk_widget().pack()
canvas.draw()
fig.set_tight_layout({'rect': [0, 0, 0.996, 0.998], 'pad': 0.2, 'h_pad': 0.2})
fig.set_facecolor( 'blue' )

def Remove_entries():
    for field in fields:
        field.Remove(0,END)

def enter_values():
    nw = Toplevel()
    nw.title(" Removing Oval")
    Remove_button = Button(nw, text=" Remove", command=Remove_entries)
    Remove_button.pack()
    Button1 = Button(nw, text=" Cancel",command=nw.destroy)
    Button1.pack()
    TheListBox = Listbox(nw) # Add a list entry widget
    TheListBox.pack()
    for item in ["oval1", "oval2", "oval3", "oval4","SS5","oval6....","ovaln"]: # add four items to the listbox from the main window
        TheListBox.insert(END, item)

#################
class Config9(tk.Toplevel):
    def __init__(self, master=None, **kwargs):
        tk.Toplevel.__init__(self, master, **kwargs)
        btn = tk.Button(self, text='Send',fg="blue", command=self.add)
        btn.grid(row=8,
                 column=0, sticky=tk.W)
        btn = tk.Button(self, text='Cancel',fg="red", command=self.destroy)
        btn.grid(row=8, column=10, sticky=tk.E)
        lb1 = tk.Label(self, text="Location in (x,y)")
        lb1.grid(row=2, column=0, columnspan=1, sticky=tk.EW)
        self.e1 = tk.Entry(self)
        self.e1.grid(row=2, column=1, columnspan=20, sticky=tk.E)
        self.e1.focus_set()
        lbl2 = tk.Label(self, text=" Type")
        lbl2.grid(row=0, column=0, columnspan=1, sticky=tk.W)
        self.e2 = tk.Entry(self)
        self.e2.grid(row=0, column=1, columnspan=20, sticky=tk.E)
        self.e2.focus_set()
        lbl3 = tk.Label(self, text="SSN")
        lbl3.grid(row=1, column=0, columnspan=1, sticky=tk.W)
        self.e3 = tk.Entry(self)
        self.e3.grid(row=1, column=1, columnspan=20, sticky=tk.E)
        self.e3.focus_set()
        lbl4 = tk.Label(self, text="Oreintation")
        lbl4.grid(row=3, column=0, columnspan=1, sticky=tk.W)
        self.e4 = tk.Entry(self)
        self.e4.grid(row=3, column=1, columnspan=20, sticky=tk.E)
        self.e4.focus_set()
        lbl5 = tk.Label(self, text="Constant")
        lbl5.grid(row=4, column=0, columnspan=1, sticky=tk.W)
        self.e5 = tk.Entry(self)
        self.e5.grid(row=4, column=1, columnspan=20, sticky=tk.E)
        self.e5.focus_set()
        self.transient(master) # set to be on top of the main window
        self.grab_set()
        master.wait_window(self)

    def add(self):
        x, y = self.e1.get().split()
        self.master.place_dot(int(x), int(y))

class App(tk.Tk):
    def __init__(self):
        #### Hello, here is the problem!!!! How can I change this window?
        btn = tk.Button(self, text='Add Oval ',fg="blue", command=self.get_options)
        btn.grid(row=0, column=0, sticky=tk.W)
        btn = tk.Button(self, text='Cancel',fg="red", command=self.destroy)
        btn.grid(row=0, column=1, sticky=tk.E)
        self.can = tk.Canvas(self, width=400, height=400)
        self.can.grid(row=1, column=0, columnspan=2)

    def get_options(self):
        Config9(self)

    def place_dot(self, x, y):
        x1 = x - RADIUS
        y1 = y - RADIUS
        x2 = x + RADIUS
        y2 = y + RADIUS
        self.can.create_oval(x1, y1, x2, y2,fill='green')

    def callback9(self,button):
        if button==" btn":
            print ("btn")

###
menubar = Menu(root)
ovalmenu = Menu(menubar)
ovalmenu.add_command(label="Add oval", command=Config9)
ovalmenu.add_command(label="Remove oval", command=enter_values)
menubar.add_cascade(label=" Oval", menu=ovalmenu)
root.config(menu=menubar)
root.mainloop()
win = App()
win.mainloop()

From skip.montanaro at gmail.com Fri Jul 21 14:52:39 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Fri, 21 Jul 2017 13:52:39 -0500
Subject: JSON encoding PDF or Excel files in Python 2.7
Message-ID: 

I would like to JSON encode some PDF and Excel files. I can read the content:

    pdf = open("somefile.pdf", "rb").read()

but now what? json.dumps() insists on treating it as a string to be
interpreted as utf-8, and bytes == str in Python 2.x. I can't
json.dumps() a bytearray. I can pickle the raw content and json.dumps
that, but I can't guarantee the listener at the other end will be
written in Python. Am I going to have to do something like
base64-encode the raw bytes to transmit them?

Thx,

Skip

From irmen.NOSPAM at xs4all.nl Fri Jul 21 15:30:13 2017
From: irmen.NOSPAM at xs4all.nl (Irmen de Jong)
Date: Fri, 21 Jul 2017 21:30:13 +0200
Subject: JSON encoding PDF or Excel files in Python 2.7
In-Reply-To: 
References: 
Message-ID: <59725645$0$722$e4fe514c@news.xs4all.nl>

On 21/07/2017 20:52, Skip Montanaro wrote:
> I would like to JSON encode some PDF and Excel files.
> I can read the content:
>
> pdf = open("somefile.pdf", "rb").read()
>
> but now what? json.dumps() insists on treating it as a string to be
> interpreted as utf-8, and bytes == str in Python 2.x. I can't
> json.dumps() a bytearray. I can pickle the raw content and json.dumps
> that, but I can't guarantee the listener at the other end will be
> written in Python. Am I going to have to do something like
> base64-encode the raw bytes to transmit them?
>
> Thx,
>
> Skip
>

Yes, json is a text based format and can't contain arbitrary binary data.
So you'll have to encode the bytes into some textual form first.
If you think base-64 is too verbose you might try base-85 instead which is slightly more
efficient (available since python 3.4 in the base64 module)?

Irmen

From python at mrabarnett.plus.com Fri Jul 21 15:45:29 2017
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 21 Jul 2017 20:45:29 +0100
Subject: JSON encoding PDF or Excel files in Python 2.7
In-Reply-To: 
References: 
Message-ID: 

On 2017-07-21 19:52, Skip Montanaro wrote:
> I would like to JSON encode some PDF and Excel files. I can read the content:
>
> pdf = open("somefile.pdf", "rb").read()
>
> but now what? json.dumps() insists on treating it as a string to be
> interpreted as utf-8, and bytes == str in Python 2.x. I can't
> json.dumps() a bytearray. I can pickle the raw content and json.dumps
> that, but I can't guarantee the listener at the other end will be
> written in Python. Am I going to have to do something like
> base64-encode the raw bytes to transmit them?
>
JSON supports floats, ints, (Unicode) strings, lists and dicts (with string
keys). It doesn't support bytestrings (raw bytes).

Yes, you're going to have to 'encode' it somehow into one of the available
types.
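
[Editorial sketch, not part of the original thread.] The base64 route that Skip, Irmen and MRAB converge on could look roughly like the following. The function names, the dictionary keys ("filename", "encoding", "data") and the sample bytes are assumptions made for illustration only; nothing in the thread prescribes them. The code is written so that it should behave the same on Python 2.7 and Python 3.

```python
import base64
import json


def encode_file_payload(raw_bytes, filename):
    """Wrap raw bytes in a JSON document by base64-encoding them first."""
    # b64encode returns bytes on Python 3 and str on Python 2; decoding as
    # ASCII yields a text string that json.dumps accepts on both versions.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    # The key names below are hypothetical; sender and receiver must agree.
    return json.dumps({"filename": filename, "encoding": "base64", "data": encoded})


def decode_file_payload(payload):
    """Reverse encode_file_payload: return (filename, raw_bytes)."""
    doc = json.loads(payload)
    return doc["filename"], base64.b64decode(doc["data"])


# Stand-in for open("somefile.pdf", "rb").read(); includes non-UTF-8 bytes.
raw = b"%PDF-1.4\x00\xff\xfe binary content"
payload = encode_file_payload(raw, "somefile.pdf")
name, restored = decode_file_payload(payload)
```

Because the payload is plain JSON with an ASCII string inside, a listener written in another language only needs a base64 decoder, which addresses the "not guaranteed to be Python" concern. The cost is roughly a one-third increase in size.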
From skip.montanaro at gmail.com Fri Jul 21 16:44:27 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Fri, 21 Jul 2017 15:44:27 -0500
Subject: JSON encoding PDF or Excel files in Python 2.7
In-Reply-To: 
References: 
Message-ID: 

> JSON supports floats, ints, (Unicode) strings, lists and dicts (with string
> keys). It doesn't support bytestrings (raw bytes).

Thanks, MRAB and Irmen. It looks like bson does what I need.

Skip

From sonnichs at gmail.com Fri Jul 21 16:48:00 2017
From: sonnichs at gmail.com (FS)
Date: Fri, 21 Jul 2017 13:48:00 -0700 (PDT)
Subject: pyserial and end-of-line specification
In-Reply-To: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com>
References: <6f8d76c1-d6dd-4f4b-87b4-e299449a1d25@googlegroups.com>
Message-ID: 

Thanks Rob.
Yes, I ended up with a read(1) and use a field count and a few other checks to make sure I don't get a partial record. Serial is the "best of times and worst of times". Sure beats dealing with USB enumeration, power-hungry ethernet processors and a lot of other stuff. I can still "see" serial on my o'scope, which is always nice, and I don't see it going away any time soon, at least in the laboratory.
Python has been a bit of a chore; there seem to be a lot of version/rev inconsistencies. At any rate I am going to stick with it. I used Perl in the past, but I covet the stats packages and a few other things I lost when I left MATLAB, and I want to try Python for chemometrics work.
cheers
fritz

From vinayak.choubey at avekshaa.com Sat Jul 22 01:30:20 2017
From: vinayak.choubey at avekshaa.com (Vinayak Choubey)
Date: Fri, 21 Jul 2017 22:30:20 -0700 (PDT)
Subject: How to perform classification and clustering on a dataset which having numerical as well as text value as a feature presents
Message-ID: <9181b05c-9335-43f5-a1ea-2db0d5e294c5@googlegroups.com>

Hi All,
I have started learning Python and am enjoying it very much. Right now I am trying to learn and use machine learning with Python libraries. I am stuck on a clustering and classification scenario. As I am a newbie at this, I am seeking help with Python code. I explain below:

I have a dataset in .csv format. The dataset contains numeric as well as textual values. A sample of the dataset is as follows:

id  Server        C_name  Priority  Information                    Policy_name
28  192.168.0.35  cpu%    Critical  cpu%:Critical on 192.168.0.35  ServerAlert

What I actually want is to cluster my dataset on the basis of the Server and Priority column values and then classify it according to C_name.

How can I do that? Please tell me. I am waiting for help. Thanking you in advance.

From smith at smith.it Sat Jul 22 03:47:59 2017
From: smith at smith.it (Smith)
Date: Sat, 22 Jul 2017 09:47:59 +0200
Subject: Compare files excel
Message-ID: 

Hello to all,
I should compare two excel files with pandas.
Who can help me?
Do you have any links?

i tried this, but not working

import pandas as pd
df1 = pd.read_excel('excel1.xlsx')
df2 = pd.read_excel('excel2.xlsx')
difference = df1[df1!=df2]
print (difference)

Thank you

From amirouche.boubekki at gmail.com Sat Jul 22 05:01:20 2017
From: amirouche.boubekki at gmail.com (Amirouche Boubekki)
Date: Sat, 22 Jul 2017 09:01:20 +0000
Subject: How do you do deep input data validation?
Message-ID: 

It seems that aiohttp recommends using one of trafaret, colander or
jsonschema.
I am not very familiar with those libraries.

I am wondering what's the rationale behind this choice. Why is trafaret the
first in this list?
Is it a strong recommendation to use trafaret?

I tried to ask a question on SO about deep validation.
Basically, how do you validate data in your project? What are the best
practices?

TIA!

From walters.justin01 at gmail.com Sat Jul 22 11:54:28 2017
From: walters.justin01 at gmail.com (justin walters)
Date: Sat, 22 Jul 2017 08:54:28 -0700
Subject: How do you do deep input data validation?
In-Reply-To: 
References: 
Message-ID: 

On Sat, Jul 22, 2017 at 2:01 AM, Amirouche Boubekki <
amirouche.boubekki at gmail.com> wrote:

> It seems that aiohttp recommends using one of trafaret, colander or
> jsonschema
> >.
> I am not very familiar with those libraries.
>
> I am wondering what's the rational behind this choice. Why is trafaret the
> first in this list? Is it a strong recommendation to use trafaret?
>
> I tried to ask a question on SO about deep validation
> validate-application-logic-using-pyramids-colander>.
> Basically, how do you validate data in your project? What are the best
> pratices?
>
> TIA!
> --
> https://mail.python.org/mailman/listinfo/python-list
>

Personally, I like to use Marshmallow:
http://marshmallow.readthedocs.io/en/latest/

From devilsgrin94 at gmail.com Sat Jul 22 15:39:58 2017
From: devilsgrin94 at gmail.com (Rahul Sircar)
Date: Sun, 23 Jul 2017 01:09:58 +0530
Subject: No subject
Message-ID: 

I wrote my code for downloading a file 'Metasploitable' using urllib2. But
it seems to have entered an infinite loop, because the screen is blank. It just
hangs there. Please have a look at my code.

import urllib2
file = 'metasploitable-linux-2.0.0.zip'
url='
https://downloads.sourceforge.net/project/metasploitable/Metasploitable2/metasploitable-linux-2.0.0.zip
'
response = urllib2.urlopen(url)
fh=open(file,'w')
fh.write(response.read())
fh.close()

From __peter__ at web.de Sat Jul 22 16:14:58 2017
From: __peter__ at web.de (Peter Otten)
Date: Sat, 22 Jul 2017 22:14:58 +0200
Subject: Downloading a file with Python 2
References: 
Message-ID: 

Rahul Sircar wrote:

> I wrote my code for downloading a file 'Metasploitable' using urllib2.But
> it seems to have entered infinite loop.Because the screen is blank.It just
> hangs there.

It "hangs", i. e. doesn't give any feedback while the data is retrieved.

> Please have a look at my code.
>
> import urllib2
> file = 'metasploitable-linux-2.0.0.zip'
> url='
> https://downloads.sourceforge.net/project/metasploitable/Metasploitable2/
> metasploitable-linux-2.0.0.zip
> '
> response = urllib2.urlopen(url)
> fh=open(file,'w')
> fh.write(response.read())

The line above reads the complete file into memory before it even writes the
first byte.
That's usually fine if the file is small, a few KB, say, but in this case
(833MB) it's better to read smaller chunks, like in

CHUNKSIZE = 2 ** 12
total = 0
with open(file, "wb") as f:
    while True:
        chunk = response.read(CHUNKSIZE)
        if not chunk:
            break
        f.write(chunk)
        total += len(chunk)
print total, "bytes written"

Another option is to use urllib.urlretrieve which also allows to give some
feedback while downloading:

import sys
import urllib

file = 'metasploitable-linux-2.0.0.zip'
url = (
    "https://downloads.sourceforge.net/project"
    "/metasploitable/Metasploitable2/metasploitable-linux-2.0.0.zip"
)

def progress(block, blocksize, filesize):
    sys.stdout.write("\r%s of %s" % (block * blocksize, filesize))
    sys.stdout.flush()

urllib.urlretrieve(url, filename=file, reporthook=progress)

From sjeik_appie at hotmail.com Sat Jul 22 16:21:27 2017
From: sjeik_appie at hotmail.com (Albert-Jan Roskam)
Date: Sat, 22 Jul 2017 20:21:27 +0000
Subject: Compare files excel
In-Reply-To: 
References: 
Message-ID: 

(sorry for top posting)

Try:
df1['difference'] = (df1 == df2).all(axis=1)
________________________________
From: Python-list on behalf of Smith
Sent: Saturday, July 22, 2017 7:47:59 AM
To: python-list at python.org
Subject: Compare files excel

Hello to all,
I should compare two excel files with pandas.
Who can help me?
Do you have any links?

i tried this, but not working

import pandas as pd
df1 = pd.read_excel('excel1.xlsx')
df2 = pd.read_excel('excel2.xlsx')
difference = df1[df1!=df2]
print (difference)

Thank you
--
https://mail.python.org/mailman/listinfo/python-list

From smith at smith.it Sun Jul 23 04:52:40 2017
From: smith at smith.it (Smith)
Date: Sun, 23 Jul 2017 10:52:40 +0200
Subject: Compare files excel
References: 
Message-ID: <19119b87-9e7e-e59b-89b6-f9b253a8e078@smith.it>

On 22/07/2017 22:21, Albert-Jan Roskam wrote:
> (sorry for top posting)
>
> Try:
> df1['difference'] = (df1 == df2).all(axis=1)
> ________________________________

here below there is the mistake :

In [17]: diff = df1['difference'] = (df1 == df2).all(axis=1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 diff = df1['difference'] = (df1 == df2).all(axis=1)

/usr/local/lib/python3.5/dist-packages/pandas/core/ops.py in f(self, other)
   1295     def f(self, other):
   1296         if isinstance(other, pd.DataFrame):  # Another DataFrame
-> 1297             return self._compare_frame(other, func, str_rep)
   1298         elif isinstance(other, ABCSeries):
   1299             return self._combine_series_infer(other, func)

/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py in _compare_frame(self, other, func, str_rep)
   3570     def _compare_frame(self, other, func, str_rep):
   3571         if not self._indexed_same(other):
-> 3572             raise ValueError('Can only compare identically-labeled '
   3573                              'DataFrame objects')
   3574         return self._compare_frame_evaluate(other, func, str_rep)

ValueError: Can only compare identically-labeled DataFrame objects

From smith at smith.it Sun Jul 23 04:55:09 2017
From: smith at smith.it (Smith)
Date: Sun, 23 Jul 2017 10:55:09 +0200
Subject: Compare files excel
References: 
Message-ID: 

On 22/07/2017 22:21, Albert-Jan Roskam wrote:
> df1['difference'] = (df1 == df2).all(axis=1)

below here there is the mistake :

In [17]: diff = df1['difference'] = (df1 == df2).all(axis=1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 diff = df1['difference'] = (df1 == df2).all(axis=1)

/usr/local/lib/python3.5/dist-packages/pandas/core/ops.py in f(self, other)
   1295     def f(self, other):
   1296         if isinstance(other, pd.DataFrame):  # Another DataFrame
-> 1297             return self._compare_frame(other, func, str_rep)
   1298         elif isinstance(other, ABCSeries):
   1299             return self._combine_series_infer(other, func)

/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py in _compare_frame(self, other, func, str_rep)
   3570     def _compare_frame(self, other, func, str_rep):
   3571         if not self._indexed_same(other):
-> 3572             raise ValueError('Can only compare identically-labeled '
   3573                              'DataFrame objects')
   3574         return self._compare_frame_evaluate(other, func, str_rep)

ValueError: Can only compare identically-labeled DataFrame objects

From __peter__ at web.de Sun Jul 23 07:21:23 2017
From: __peter__ at web.de (Peter Otten)
Date: Sun, 23 Jul 2017 13:21:23 +0200
Subject: Compare files excel
References: 
Message-ID: 

Smith wrote:

> On 22/07/2017 22:21, Albert-Jan Roskam wrote:
>> df1['difference'] = (df1 == df2).all(axis=1)
>
> below here there is the mistake :
>
> In [17]: diff = df1['difference'] = (df1 == df2).all(axis=1)
> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call
> last) in ()
> ----> 1 diff = df1['difference'] = (df1 == df2).all(axis=1)
>
> /usr/local/lib/python3.5/dist-packages/pandas/core/ops.py in f(self,
> other)
> 1295 def f(self, other):
> 1296 if isinstance(other, pd.DataFrame): # Another DataFrame
> -> 1297 return self._compare_frame(other, func, str_rep)
> 1298 elif isinstance(other, ABCSeries):
> 1299 return self._combine_series_infer(other, func)
>
> /usr/local/lib/python3.5/dist-packages/pandas/core/frame.py in
> _compare_frame(self, other, func, str_rep)
> 3570 def _compare_frame(self, other, func, str_rep):
> 3571 if not self._indexed_same(other):
> -> 3572 raise ValueError('Can only compare identically-labeled
> '
> 3573 'DataFrame objects')
> 3574 return self._compare_frame_evaluate(other, func, str_rep)
>
> ValueError: Can only compare identically-labeled DataFrame objects

The columns of both dataframes must be identical. Compare:

>>> import pandas as pd
>>> a = pd.DataFrame([[1,2],[3,4]], columns=["a", "b"])
>>> b = pd.DataFrame([[1,2],[3,5]], columns=["a", "c"])

With different column names:

>>> a != b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/core/ops.py", line 875, in f
    return self._compare_frame(other, func, str_rep)
  File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 2860, in _compare_frame
    raise ValueError('Can only compare identically-labeled '
ValueError: Can only compare identically-labeled DataFrame objects

Again, with identical column names:

>>> b = pd.DataFrame([[1,2],[3,5]], columns=["a", "b"])
>>> a != b
       a      b
0  False  False
1  False   True

From ganesh1pal at gmail.com Sun Jul 23 13:21:33 2017
From: ganesh1pal at gmail.com (Ganesh Pal)
Date: Sun, 23 Jul 2017 22:51:33 +0530
Subject: Tips to match multiple patterns from from a single file .
Message-ID: 

I have hundreds of files in a directory, from all of which I need to extract
multiple values: the filename with pathname (which starts with test*),
1,1,25296896:8192 (only the one on the line containing the pattern "Corrupting"),
the data before corruption (it's a hex value), offset (digit), size (digit).

Sample file contents (all my files are small files):

07/22/2017 12:34:28 AM INFO: --offset=18 --mirror=1 --path=/ifs/i/inode.txt --size=4
07/22/2017 12:34:28 AM INFO:The mirror selected is 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data before corruption : 1b000100
07/22/2017 12:34:28 AM INFO:Corrupting disk object 6 at 1,1,25296896:8192
07/22/2017 12:34:28 AM INFO:Data after corruption : 00000000

I am expecting something like this:

# Filename : /var/01010101/test01log object: 1,1,25296896:8192 checksum : 1b000100 offset: 18 size:4
# Filename : /var/01010101/test03log object: 1,2,25296896:8192 checksum : 1b200120 offset: 8 size:8

Here is how I have started coding this, but I am not sure how to group
multiple patterns and return them from a function. I am trying with group() and
groupdict(). Any tips or better ideas?

import glob
import re

for filename in sorted(glob.glob('/var/01010101/test*.log')):
    with open(filename, 'r') as f:
        for linenum, line in enumerate(f):
            m = re.search(r'(Corrupting.*)',line)
            if not m:  # uninteresting line
                continue
            x = m.group().split()
            print filename , x[-1]

x123-45# python test.py
/var/01010101/test01_.log 1,1,25296896:8192

I am on Python 2.7 and Linux

Regards,
Ganesh

From woooee at gmail.com Sun Jul 23 16:27:50 2017
From: woooee at gmail.com (woooee at gmail.com)
Date: Sun, 23 Jul 2017 13:27:50 -0700 (PDT)
Subject: Tips to match multiple patterns from from a single file .
In-Reply-To: 
References: 
Message-ID: <776dde82-e09e-4508-8f0e-eab9c84e2ee5@googlegroups.com>

You want to find strings on multiple lines of a file, and possibly multiple string groups within a file (you didn't say).
So you will have to check each line and store anything you want to keep for each file, and you will have to add the code to cycle through the files, so it is something along the lines of

def find_a_string(string_in, rec):
    return_str=""
    string_len=len(string_in)
    if string_in in rec:
        location=rec.find(string_in)
        start=location+string_len
        ## go until you find a space or end of line
        for letter in rec[start:]:
            if len(letter.strip()):
                return_str += letter
            else:
                return return_str
    return return_str

test_file=test_data.split("\n") ## turn into data like file_handle.readlines()
found_dict={}
for rec in test_file:
    if len(rec.strip()):
        for str in ("offset=", "Data before corruption : ", "size="):
            found=find_a_string(str, rec)
            if len(found):
                found_dict[str]=found

        ## assume this always comes after the above strings in the file
        if "Corrupting disk object 6 at 1,1,25296896:8192" in rec:
            print "object 1,1,25296896:8192",
            for key in found_dict:
                print key.strip(), found_dict[key],
            print

From kunal123jamdade at gmail.com Mon Jul 24 00:57:25 2017
From: kunal123jamdade at gmail.com (Kunal Jamdade)
Date: Mon, 24 Jul 2017 10:27:25 +0530
Subject: Script to replace contents inside the files
Message-ID: 

I have thousands of html files inside a folder. I want to replace the
filename present inside other files. Say for example: the fileName
'abcd1234.html' is found inside another file, say file2.html. Then I want to
remove the last 4 digits of the fileName, i.e. 'abcd1234.html' =>
'abcd.htm'.

I have tried a script, but your suggestions on the script are welcome.

Regards,
Kunal
-------------- next part --------------
import os
import re

def script_to_create_folder():
    path_list = []
    filename_list = []
    path = r'D:\macrocodesrequired\Testing_Script\Real_testing_\New folder\brpt'
    #path = r'H:\Script_Work\New_folder\Actual_testing\brpt'
    for (root, dirs, name) in os.walk(path):
        for nm in name:
            if ( 'About' in root or 'Community' in root or 'support' in root \
                 or 'home' in root or 'Products' in root or 'service' in root \
                 or 'solutions' in root or 'training' in root \
                 or 'wheretobuy' in root ):
                pass
            if ( 'default' in nm or 'index' in nm or 'category' in nm \
                 or 'Category' in nm or 'Default' in nm or 'Index' in nm \
                 or 'home' in nm or 'support' in nm ):
                pass
            else:
                filename_list.append(nm)
                path_list.append(os.path.join(root, nm))
    # print(path_list)
    # print(filename_list)
    for path in path_list:
        for names in filename_list:
            find_filename_inside_files(names, path)

def find_filename_inside_files(file_name, dir_path):
    pattern_list = ['\d+$', '\d+\w$', '\d+-\d$', '\w\d+$', '\d\w\d\w', '\w\d+$', '\w\d\w\d']
    data = []
    replace_str = ''
    read_cnt = 0
    digits_to_replace = 0
    with open(dir_path, 'r', encoding='utf-8') as file_handle:
        data = file_handle.read()
        #print(data)
        if file_name in data:
            #print(file_name)
            for search_pattern in pattern_list:
                read_cnt = 0
                if '-' in file_name:
                    #print("===>",search_pattern)
                    if re.search(search_pattern, file_name.split('.')[0]):
                        digits_to_replace = filename_with_hypen(file_name, search_pattern)
                        read_cnt = 1
                        position = file_handle.tell()
                        replace_str = replace_oldstring_newstring( data, file_name, digits_to_replace )
                        # file_handle.seek(0, 0)
                        # file_handle.write(replace_str)
                elif re.search(search_pattern, file_name.split('.')[0]):
                    digits_to_replace = filename_without_hypen(file_name, search_pattern)
                    read_cnt = 1
                    replace_str = replace_oldstring_newstring(data, file_name, digits_to_replace)
                if read_cnt == 1:
                    #print("write to")
                    print(file_name)
                    print(dir_path)
                    with open(dir_path, 'w', encoding='utf-8') as file_out:
file_out.write(replace_str) exit() def filename_without_hypen(file_name, pattern): #print(file_name) value = re.search(pattern, file_name.split('.')[0]) if bool(value): last_digits = value.group() if len(last_digits) > 2: return -(len(last_digits)) elif len(last_digits) > 0 and len(last_digits) <= 3: return -(len(last_digits)) def filename_with_hypen(file_name, pattern): value = re.search(pattern, file_name.split('.')[0]) if bool(value): last_digits = value.group() if '-2' in last_digits or '-3' in last_digits: return -(len(last_digits)) else: return -(len(last_digits)) def replace_oldstring_newstring(data, filename, last_digits_to_replace): print("in replace") ind = data.index(filename) temp_str = data[ind:(ind + len(filename))] replace_str = data.replace(temp_str.split('.')[0][last_digits_to_replace:], '') replace_str = replace_str.replace(".html", ".htm") return replace_str def main(): script_to_create_folder() if __name__ == '__main__': main() From casesolutionscentre at gmail.com Mon Jul 24 02:39:34 2017 From: casesolutionscentre at gmail.com (Case Solution & Analysis) Date: Sun, 23 Jul 2017 23:39:34 -0700 (PDT) Subject: Case Solution: Jabong.com Balancing the Demands of Customers and Suppliers by Jaydeep Mukherjee, Punit Bhardwaj Message-ID: <99672567-7f1e-4bf9-a9da-d685ace0fa8b@googlegroups.com> Case Solution and Analysis of Jabong.com: Balancing the Demands of Customers and Suppliers by Jaydeep Mukherjee, Punit Bhardwaj is available at a lowest price, send email to casesolutionscentre(at)gmail(dot)com if you want to order the Case Solution. Case Study ID: 9B16A028 / W16394 Get Case Study Solution and Analysis of Jabong.com: Balancing the Demands of Customers and Suppliers in a FAIR PRICE!! Our e-mail address is CASESOLUTIONSCENTRE (AT) GMAIL (DOT) COM. Please replace (at) by @ and (dot) by . 
YOU MUST WRITE THE FOLLOWING WHILE PLACING YOUR ORDER: Complete Case Study Name Authors Case Study ID Publisher of Case Study Your Requirements / Case Questions Note: Do not REPLY to this post because we do not reply to posts here. If you need any Case Solution please send us an email. We can help you to get it. From casesolutionscentre at gmail.com Mon Jul 24 02:40:46 2017 From: casesolutionscentre at gmail.com (Case Solution & Analysis) Date: Sun, 23 Jul 2017 23:40:46 -0700 (PDT) Subject: Case Solution: Sustainable Investing at Generation Investment Management by Stefan Reichelstein, Donna Bebb Message-ID: <9fcbab39-fc62-4170-b2ff-64a15ffae31c@googlegroups.com> Case Solution and Analysis of Sustainable Investing at Generation Investment Management by Stefan Reichelstein, Donna Bebb is available at a lowest price, send email to casesolutionscentre(at)gmail(dot)com if you want to order the Case Solution. Case Study ID: SM257 Get Case Study Solution and Analysis of Sustainable Investing at Generation Investment Management in a FAIR PRICE!! Our e-mail address is CASESOLUTIONSCENTRE (AT) GMAIL (DOT) COM. Please replace (at) by @ and (dot) by . YOU MUST WRITE THE FOLLOWING WHILE PLACING YOUR ORDER: Complete Case Study Name Authors Case Study ID Publisher of Case Study Your Requirements / Case Questions Note: Do not REPLY to this post because we do not reply to posts here. If you need any Case Solution please send us an email. We can help you to get it. 
From __peter__ at web.de Mon Jul 24 12:37:18 2017
From: __peter__ at web.de (Peter Otten)
Date: Mon, 24 Jul 2017 18:37:18 +0200
Subject: Script to replace contents inside the files
References:
Message-ID:

Kunal Jamdade wrote:

> I have thousands of html files inside a folder. I want to replace the
> filename present inside another files. Say for ex:- fileName :-
> 'abcd1234.html' is found inside another file say file2.html. Then I want
> to remove the last 4 digits of the fileName i.e,. 'abcd1234.html' =>
> 'abcd.htm'.
>
> I have tried a script .

Does your script work? If not post the parts that do not and ask specific
questions.

> But your suggestions upon the script are welcomed.

Your problem exposition looks simpler than your implementation, but it's
impossible for anyone but you to tell apart necessary complications from
implementation artifacts.
Heed the advice given on http://sscce.org/ and you will likely get better
answers.

From ben+python at benfinney.id.au Mon Jul 24 21:41:54 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 25 Jul 2017 11:41:54 +1000
Subject: Python 3 removes name binding from outer scope
Message-ID: <85379l5m3h.fsf@benfinney.id.au>

Howdy all,

How can I stop Python from deleting a name binding, when that name is
used for binding the exception that is caught? When did this change in
behaviour come into Python?


I am writing code to run on both Python 2 and Python 3::

    exc = None
    try:
        1/0
        text_template = "All fine!"
    except ZeroDivisionError as exc:
        text_template = "Got exception: {exc.__class__.__name__}"

    print(text_template.format(exc=exc))

Notice that `exc` is explicitly bound before the exception handling, so
Python knows it is a name in the outer scope.

On Python 2.7, this runs fine and the ‘exc’ name survives to be used in
the ‘format’ call::

    Got exception: ZeroDivisionError

Great, this is exactly what I want: The ‘except’ clause binds the name
and I can use that name in the rest of the function to refer to the
exception object.

On Python 3.5, the ‘format’ call fails because apparently the ‘exc’
binding is *deleted*::

    Traceback (most recent call last):
      File "", line 1, in
    NameError: name 'exc' is not defined

Why is the ‘exc’ binding deleted from the outer scope? How are we meant
to reliably preserve the name binding to use it *after* the ‘except’
clause?

When did this change come into Python, where is it documented?

Would I be right to report this as a bug in Python 3?

-- 
 \     “The cost of education is trivial compared to the cost of |
  `\   ignorance.”
-- Thomas Jefferson |
_o__)                |
Ben Finney

From ethan at stoneleaf.us Mon Jul 24 22:35:53 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 24 Jul 2017 19:35:53 -0700
Subject: Python 3 removes name binding from outer scope
In-Reply-To: <85379l5m3h.fsf@benfinney.id.au>
References: <85379l5m3h.fsf@benfinney.id.au>
Message-ID: <5976AE89.1090701@stoneleaf.us>

On 07/24/2017 06:41 PM, Ben Finney wrote:

> How can I stop Python from deleting a name binding, when that name is
> used for binding the exception that is caught? When did this change in
> behaviour come into Python?
>
>
> I am writing code to run on both Python 2 and Python 3::
>
>     exc = None
>     try:
>         1/0
>         text_template = "All fine!"
>     except ZeroDivisionError as exc:
>         text_template = "Got exception: {exc.__class__.__name__}"
>
>     print(text_template.format(exc=exc))

Something like:

     try:
         ....
     except ZeroDivisionError as dead_exc:
         exc = dead_exc
         ....
     ....
     print(text_template.format(exc=exc))

> Why is the ‘exc’ binding deleted from the outer scope?

Help prevent memory leaks and allow resources to be cleaned up sooner.

> How are we meant
> to reliably preserve the name binding to use it *after* the ‘except’
> clause?

Reassign to something else, like my example above.

> When did this change come into Python, where is it documented?

Documented at: https://docs.python.org/3/reference/compound_stmts.html#the-try-statement [1]

Don't recall exactly when changed.

> Would I be right to report this as a bug in Python 3?

No.
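The reassignment approach above can be written out as a small runnable sketch. The function name `describe_division` and its parameters are hypothetical and only serve the illustration; the behaviour is expected to be the same on Python 2.7 and Python 3.

```python
def describe_division(numerator, denominator):
    """Return a message, keeping any caught exception past the except block."""
    exc = None
    try:
        numerator / denominator
        text_template = "All fine!"
    except ZeroDivisionError as dead_exc:
        # On Python 3, `dead_exc` is unbound when this block ends,
        # so the reference is copied to a name that survives.
        exc = dead_exc
        text_template = "Got exception: {exc.__class__.__name__}"
    # Unused keyword arguments to str.format() are ignored,
    # so passing exc=None in the success case is harmless.
    return text_template.format(exc=exc)


print(describe_division(1, 0))
print(describe_division(1, 2))
```

Using a separate target name (`dead_exc`) keeps the surviving name `exc` out of the handler's automatic cleanup.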
-- 
~Ethan~

[1] Thanks to https://stackoverflow.com/q/29268892/208880

From steve+python at pearwood.info Mon Jul 24 22:52:17 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Tue, 25 Jul 2017 12:52:17 +1000
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au>
Message-ID: <5976b262$0$1607$c3e8da3$5496439d@news.astraweb.com>

On Tue, 25 Jul 2017 11:41 am, Ben Finney wrote:

> Howdy all,
>
> How can I stop Python from deleting a name binding, when that name is
> used for binding the exception that is caught? When did this change in
> behaviour come into Python?
>
>
> I am writing code to run on both Python 2 and Python 3::
>
>     exc = None
>     try:
>         1/0
>         text_template = "All fine!"
>     except ZeroDivisionError as exc:
>         text_template = "Got exception: {exc.__class__.__name__}"
>
>     print(text_template.format(exc=exc))
>
> Notice that `exc` is explicitly bound before the exception handling, so
> Python knows it is a name in the outer scope.

Ethan has already answered your direct question, but I'd like to make an
observation: there's no "outer scope" here, as the try...except statement
doesn't introduce a new scope. All your code above runs in a single scope
with a single namespace.

It isn't that the except block is in a different scope, but that the except
statement now explicitly calls "del" on the exception name when the block
ends.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

From jobmattcon at gmail.com Mon Jul 24 23:53:51 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Mon, 24 Jul 2017 20:53:51 -0700 (PDT)
Subject: how to group by function if one of the group has relationship with another one in the group?
Message-ID:

from itertools import groupby

testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]

def isneighborlocation(lo1, lo2):
    if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]:
        return 1
    elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]:
        return 1
    else:
        return 0

groupda = groupby(testing1, isneighborlocation)
for key, group1 in groupda:
    print(key)
    for thing in group1:
        print(thing)

expected output: 3 groups
group1 [(1,1)]
group2 [(2,3),(2,4)]
group3 [(3,5),(3,6),(4,6)]

From rustompmody at gmail.com Tue Jul 25 00:48:56 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Mon, 24 Jul 2017 21:48:56 -0700 (PDT)
Subject: Python 3 removes name binding from outer scope
In-Reply-To:
References: <85379l5m3h.fsf@benfinney.id.au>
Message-ID:

On Tuesday, July 25, 2017 at 7:12:44 AM UTC+5:30, Ben Finney quoted Thomas Jefferson's :

> The cost of education is trivial compared to the cost of ignorance.

An interesting standard of ‘trivial’... given...

UK has risen to more than £100 billion for the first time
https://www.theguardian.com/money/2017/jun/15/uk-student-loan-debt-soars-to-more-than-100bn

coupled with politicians center-staging and flip-flopping on the issue [yesterday's news]
http://www.independent.co.uk/news/uk/politics/jeremy-corbyn-labour-student-loans-debt-manifesto-pledge-amnesty-cancel-tuition-fees-a7856161.html

While US student debt now stands at $1.3 trillion
https://www.forbes.com/sites/zackfriedman/2017/02/21/student-loan-debt-statistics-2017/#2d2db52d5dab

From rustompmody at gmail.com Tue Jul 25 01:13:23 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Mon, 24 Jul 2017 22:13:23 -0700 (PDT)
Subject: Recent Spam problem
Message-ID:

Of late there has been an explosion of spam.
Thought it was only a google-groups (USENET?) issue and would be barred from
the mailing list.

But then find it's there in the mailing list archives as well.
Typical example: https://mail.python.org/pipermail/python-list/2017-July/724085.html

What gives??
On a different note this problem seems to be peculiar to comp.lang.python
via google-groups and is absent on ½ dozen other google groups I see.

Since spammers are unlikely to be choosy about whom they spam:
Tentative conclusion: Something about the USENET-ML gateway is more leaky
out here than elsewhere

From no.email at nospam.invalid Tue Jul 25 02:01:43 2017
From: no.email at nospam.invalid (Paul Rubin)
Date: Mon, 24 Jul 2017 23:01:43 -0700
Subject: Recent Spam problem
References:
Message-ID: <871sp5hx6g.fsf@nightsong.com>

Rustom Mody writes:
> Since spammers are unlikely to be choosy about whom they spam:
> Tentative conclusion: Something about the USENET-ML gateway is more leaky
> out here than elsewhere

It could be a sort-of DOS attack by some disgruntled idiot. I wonder if
the email address in those spam posts actually works. Then there's the
weird Italian rants. No idea about those.

From no.email at nospam.invalid Tue Jul 25 02:03:32 2017
From: no.email at nospam.invalid (Paul Rubin)
Date: Mon, 24 Jul 2017 23:03:32 -0700
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au>
Message-ID: <87wp6xgiiz.fsf@nightsong.com>

Ben Finney writes:
> How can I stop Python from deleting a name binding, when that name is
> used for binding the exception that is caught?

Use sys.exc_info()

From ben+python at benfinney.id.au Tue Jul 25 02:34:02 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 25 Jul 2017 16:34:02 +1000
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us>
Message-ID: <85tw213u05.fsf@benfinney.id.au>

Ethan Furman writes:

> Something like:
>
>     try:
>         ....
>     except ZeroDivisionError as dead_exc:
>         exc = dead_exc
>         ....
>     ....
>     print(text_template.format(exc=exc))

That strikes me as busy-work; the name in the ‘except’ clause already
*has* the object, and is a serviceable name already.
Having to make another name for the same object, merely to avoid some
surprising behaviour, is IMO un-Pythonic.

> > How are we meant to reliably preserve the name binding to use it
> > *after* the ‘except’ clause?
>
> Reassign to something else, like my example above.
>
> > When did this change come into Python, where is it documented?
>
> Documented at: https://docs.python.org/3/reference/compound_stmts.html#the-try-statement [1]
>
> Don't recall exactly when changed.

The Python 3.0 documentation describes the behaviour::

    When an exception has been assigned using ‘as target’, it is cleared
    at the end of the except clause. [...]

    That means that you have to assign the exception to a different name
    if you want to be able to refer to it after the except clause. The
    reason for this is that with the traceback attached to them,
    exceptions will form a reference cycle with the stack frame, keeping
    all locals in that frame alive until the next garbage collection
    occurs.

PEP 3000 documents the change::

    This PEP intends to resolve this issue [of a reference cycle] by
    adding a cleanup semantic to ‘except’ clauses in Python 3 whereby
    the target name is deleted at the end of the ‘except’ suite.

So that implies it changed with Python 3.0.

> > Would I be right to report this as a bug in Python 3?
>
> No.

I do consider it a bug to break that code, from my original message,
which *explicitly* has the name already bound before the exception
handling begins.

But I must concede that, given it's been this way since Python 3.0, it
is unlikely a bug report now would get it changed.

-- 
 \     “We should strive to do things in [Gandhi's] spirit... not to use |
  `\   violence in fighting for our cause, but by non-participation in |
_o__)  what we believe is evil.”
-- Albert Einstein |
Ben Finney

From rustompmody at gmail.com Tue Jul 25 02:47:17 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Mon, 24 Jul 2017 23:47:17 -0700 (PDT)
Subject: Python 3 removes name binding from outer scope
In-Reply-To:
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au>
Message-ID: <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>

On Tuesday, July 25, 2017 at 12:04:45 PM UTC+5:30, Ben Finney wrote:
> Ethan Furman writes:
>
> > Something like:
> >
> >     try:
> >         ....
> >     except ZeroDivisionError as dead_exc:
> >         exc = dead_exc
> >         ....
> >     ....
> >     print(text_template.format(exc=exc))
>
> That strikes me as busy-work; the name in the ‘except’ clause already
> *has* the object, and is a serviceable name already.
>
> Having to make another name for the same object, merely to avoid some
> surprising behaviour, is IMO un-Pythonic.
>
> > > How are we meant to reliably preserve the name binding to use it
> > > *after* the ‘except’ clause?
> >
> > Reassign to something else, like my example above.
> >
> > > When did this change come into Python, where is it documented?
> >
> > Documented at: https://docs.python.org/3/reference/compound_stmts.html#the-try-statement [1]
> >
> > Don't recall exactly when changed.
>
> The Python 3.0 documentation describes the behaviour::
>
>     When an exception has been assigned using ‘as target’, it is cleared
>     at the end of the except clause. [...]
>
>     That means that you have to assign the exception to a different name
>     if you want to be able to refer to it after the except clause. The
>     reason for this is that with the traceback attached to them,
>     exceptions will form a reference cycle with the stack frame, keeping
>     all locals in that frame alive until the next garbage collection
>     occurs.
>
> PEP 3000 documents the change::
>
>     This PEP intends to resolve this issue [of a reference cycle] by
>     adding a cleanup semantic to ‘except’
clauses in Python 3 whereby
>     the target name is deleted at the end of the ‘except’ suite.
>
> So that implies it changed with Python 3.0.
>
> > > Would I be right to report this as a bug in Python 3?
> >
> > No.
>
> I do consider it a bug to break that code, from my original message,
> which *explicitly* has the name already bound before the exception
> handling begins.

+1
You can call it bug or bug-promoted-to-feature :D

I call it surprising because I don't know of any other case in python where
a delete is user-detectable,
i.e. python's delete of objects always works quietly behind the scenes, whereas
this adds a leakiness to the memory-management abstraction

>
> But I must concede that, given it's been this way since Python 3.0, it
> is unlikely a bug report now would get it changed.

It could be more prominently documented

>
> --
>  \     “We should strive to do things in [Gandhi's] spirit... not to use |
>   `\   violence in fighting for our cause, but by non-participation in |
> _o__)  what we believe is evil.” -- Albert Einstein |
> Ben Finney

Interesting to see this adjacent to news that non-participation is about to
be criminalized double of rape [20 years + a million dollars]
http://www.telesurtv.net/english/news/Bipartisan-US-Bill-Moves-to-Criminalize-BDS-Support-20170720-0001.html

From skybuck2000 at hotmail.com Tue Jul 25 02:50:50 2017
From: skybuck2000 at hotmail.com (skybuck2000 at hotmail.com)
Date: Mon, 24 Jul 2017 23:50:50 -0700 (PDT)
Subject: Recent Spam problem
In-Reply-To:
References:
Message-ID:

I see two solutions:

1. We build new architecture or adapt the current one so it's more like a blockchain, have to calculate some hash before being able to post and upload and such.

or

2. We counter-attack by installing a special tool, so we all denial of service attack the source of the message, I am not sure if the source is genuine information, what you make of it:

NNTP-Posting-Host: 39.52.70.224


Now the first solution would require a lot of work.
The second solution would be easy to do.

My question to you is:

What solution do you pick of any ? =D

From ethan at stoneleaf.us Tue Jul 25 03:01:11 2017
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 25 Jul 2017 00:01:11 -0700
Subject: Python 3 removes name binding from outer scope
In-Reply-To: <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>
Message-ID: <5976ECB7.7070801@stoneleaf.us>

On 07/24/2017 11:47 PM, Rustom Mody wrote:

> Interesting to see this adjacent to news that non-participation is about
> to be criminalized double of [snip]

Rustom, All,

This is a Python mailing list. Please keep the topics marginally on-topic. Thanks.

-- 
~Ethan~

From stefan_ml at behnel.de Tue Jul 25 03:01:42 2017
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 25 Jul 2017 09:01:42 +0200
Subject: Python 3 removes name binding from outer scope
In-Reply-To: <85tw213u05.fsf@benfinney.id.au>
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au>
Message-ID:

Ben Finney wrote on 25.07.2017 at 08:34:
> Ethan Furman writes:
>
>> Something like:
>>
>>     try:
>>         ....
>>     except ZeroDivisionError as dead_exc:
>>         exc = dead_exc
>>         ....
>>     ....
>>     print(text_template.format(exc=exc))
>
> That strikes me as busy-work; the name in the ‘except’ clause already
> *has* the object, and is a serviceable name already.
>
> Having to make another name for the same object, merely to avoid some
> surprising behaviour, is IMO un-Pythonic.

It's an extremely rare use case and keeping the exception alive after
handling has clear drawbacks in terms of resource usage (exception
information, tracebacks, frames, local variables, chained exceptions, ...)
This tradeoff was the reason why this was changed in Py3k at the time,
together with the introduction of exception chaining (and some other
cleanups in that corner).

Basically, it's better to save resources by default and let users
explicitly keep them alive if they still need them, than to implicitly hold
on to them in a deep corner of CPython (sys.exc_info()) and let users
figure out how to release them explicitly if they find out that they hurt
and then additionally manage to debug where they are stored. Py2.x did the
latter, and guess how many users knew about it?

Stefan

From ben+python at benfinney.id.au Tue Jul 25 03:02:48 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 25 Jul 2017 17:02:48 +1000
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au>
Message-ID: <85pocp3so7.fsf@benfinney.id.au>

Ben Finney writes:

> Having to make another name for the same object, merely to avoid some
> surprising behaviour, is IMO un-Pythonic.

I suppose my objection is rooted in the fact this behaviour is implicit;
my code has not issued a ‘del’ statement, and so I don't expect one; yet
it occurs implicitly. This violates the Zen of Python.

> PEP 3000 documents the change::

I got the wrong number there; it's PEP 3110.

-- 
 \     “Please do not feed the animals. If you have any suitable food, |
  `\   give it to the guard on duty.”
-- zoo, Budapest |
_o__)             |
Ben Finney

From steve+comp.lang.python at pearwood.info Tue Jul 25 03:21:43 2017
From: steve+comp.lang.python at pearwood.info (Steven D'Aprano)
Date: 25 Jul 2017 07:21:43 GMT
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au>
Message-ID: <5976f183$0$2878$c3e8da3$76491128@news.astraweb.com>

On Mon, 24 Jul 2017 21:48:56 -0700, Rustom Mody wrote:

> On Tuesday, July 25, 2017 at 7:12:44 AM UTC+5:30, Ben Finney quoted
> Thomas Jefferson's :
>
>> The cost of education is trivial compared to the cost of ignorance.
>
>
> An interesting standard of ‘trivial’... given...

You're reading the quote out of context.

When Thomas Jefferson wrote what he did, he was comparing the cost to the
US government of paying for universal education for a subset of the
population (mostly white males under 16, I expect) to the relatively low
standards required at the time, versus the societal costs of a broad
population of know-nothings. Especially know-nothings who have the vote.

Despite the shift to universal education, and the general increase in
standards, I believe Jefferson's equation still broadly holds. We
probably wouldn't find it cost effective to educate everyone to a Ph.D.
standard, but to a secondary school standard is very affordable.

Jefferson wasn't comparing the cost of ignorance to *student debt*
because such a thing didn't exist in his day. I don't believe that
Jefferson imagined that a societal Good like universal education would be
treated as not just a *profit centre*, but *weaponized* and deployed
against the middle class.
In the US and UK, and to a lesser extent Australia, we have managed to
combine the worst of both worlds:

- a system which spends a huge amount of money for degrees which,
  for the majority of people, will never repay their cost;

- that cost is charged to the receiver, ensuring that the majority
  of them will start their working career in debt, and often that
  they will never pay off that debt during their working life;

- ultimately leading to a transfer of assets from the middle-class
  to the elites;

- while nevertheless keeping the general population remarkably
  ignorant.

Jefferson was, in a sense, naive: while he recognised the rather brutal
costs of ignorance, he assumed that well-meaning people of good will
would agree that they were costs. Unfortunately ignorance is an
exploitable externality and to some people, the ignorance of others is a
benefit, not a cost.

The people who gain benefit from ignorance are not the ones who pay the
costs.

Consequently we have sectors of the political elite who gain benefit from
the ignorance of others, while the rest of us have to pay the costs:

- ignorance encourages people to vote against their own interests;

- ignorance can be manipulated by demagogues;

- the ignorant and fearful have become a powerful voting bloc that
  votes in politicians who do their best to make them more ignorant
  and more fearful (a vicious circle);

- ignorance can be used against subsections of the public by
  increasing apathy and discouraging them from voting.

Likewise there is a vast collection of economic interest groups who
thrive on ignorance:

- scammers and spammers;

- the advertising profession in general;

- merchants of woo, such as those who invent dangerous fad diets
  and the anti-vaxxers;

- PR firms that exist to obfuscate the facts ("Doubt is our product",
  as one such firm said to the tobacco companies);

- media that thrives on inventing fake controversy and false
  equivalency;

and so on.
-- 
“You are deluded if you think software engineers who can't write
operating systems or applications without security holes, can write
virtualization layers without security holes.” -- Theo de Raadt

From rosuav at gmail.com Tue Jul 25 03:28:47 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 25 Jul 2017 17:28:47 +1000
Subject: Python 3 removes name binding from outer scope
In-Reply-To: <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>
Message-ID:

On Tue, Jul 25, 2017 at 4:47 PM, Rustom Mody wrote:
> +1
> You can call it bug or bug-promoted-to-feature :D
>
> I call it surprising because I dont know of any other case in python where
> a delete is user-detectable
> ie python's delete of objects always works quietly behind the scenes whereas
> this adds a leakiness to the memory-management abstraction

You're conflating two things. There's nothing here that forces the
destruction of an object; the name is simply unbound. You can confirm
this from the disassembly in CPython:

>>> import dis
>>> def f():
...     try: 1/0
...     except Exception as e: pass
...
>>> dis.dis(f)
  2           0 SETUP_EXCEPT            12 (to 14)
              2 LOAD_CONST               1 (1)
              4 LOAD_CONST               2 (0)
              6 BINARY_TRUE_DIVIDE
              8 POP_TOP
             10 POP_BLOCK
             12 JUMP_FORWARD            34 (to 48)

  3     >>   14 DUP_TOP
             16 LOAD_GLOBAL              0 (Exception)
             18 COMPARE_OP              10 (exception match)
             20 POP_JUMP_IF_FALSE       46
             22 POP_TOP
             24 STORE_FAST               0 (e)
             26 POP_TOP
             28 SETUP_FINALLY            6 (to 36)
             30 POP_BLOCK
             32 POP_EXCEPT
             34 LOAD_CONST               0 (None)
        >>   36 LOAD_CONST               0 (None)
             38 STORE_FAST               0 (e)
             40 DELETE_FAST              0 (e)
             42 END_FINALLY
             44 JUMP_FORWARD             2 (to 48)
        >>   46 END_FINALLY
        >>   48 LOAD_CONST               0 (None)
             50 RETURN_VALUE
>>>

It actually does the equivalent of:

    finally:
        e = None
        del e

In the normal case, this will leave the original exception loose and
garbage-collectable, but if it's been bound to another name, it won't.
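The distinction between unbinding the name and destroying the object can also be observed without reading bytecode. The following is a minimal sketch assuming Python 3; the helper name `catch_and_keep` is hypothetical and not part of the thread.

```python
def catch_and_keep():
    """Show that only the name is unbound; the exception object lives on."""
    kept = None
    try:
        1 / 0
    except ZeroDivisionError as e:
        kept = e  # a second reference to the same exception object

    try:
        e  # the 'as' target was deleted when the handler finished
        name_survived = True
    except NameError:  # UnboundLocalError is a subclass of NameError
        name_survived = False

    return name_survived, kept


name_survived, kept = catch_and_keep()
print(name_survived)        # whether 'e' was still bound after the handler
print(type(kept).__name__)  # the object itself is still reachable via 'kept'
```

Because `kept` still refers to the exception, the object is not collected even though the name `e` is gone.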
ChrisA

From rosuav at gmail.com Tue Jul 25 03:29:56 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 25 Jul 2017 17:29:56 +1000
Subject: Recent Spam problem
In-Reply-To:
References:
Message-ID:

On Tue, Jul 25, 2017 at 4:50 PM, wrote:
> I see two solutions:
>
> 1. We build new architecture or adapt the current one so it's more like a blockchain, have to calculate some hash before being able to post and upload and such.
>
> or
>
> 2. We counter-attack by installing a special tool, so we all denial of service attack the source of the message, I am not sure if the source is genuine information, what you make of it:
>
> NNTP-Posting-Host: 39.52.70.224
>
>
> Now the first solution would require a lot of work.
>
> The second solution would be easy to do.
>
> My question to you is:
>
> What solution do you pick of any ? =D

There are bad people in the world. I know! Let's all go and drop
nuclear bombs on them. That'll fix the problem!

OR... you could try just filtering it all out, and not stooping to their level.

ChrisA

From steve+comp.lang.python at pearwood.info Tue Jul 25 03:32:41 2017
From: steve+comp.lang.python at pearwood.info (Steven D'Aprano)
Date: 25 Jul 2017 07:32:41 GMT
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com> <5976ECB7.7070801@stoneleaf.us>
Message-ID: <5976f419$0$2878$c3e8da3$76491128@news.astraweb.com>

On Tue, 25 Jul 2017 00:01:11 -0700, Ethan Furman wrote:

> On 07/24/2017 11:47 PM, Rustom Mody wrote:
>
>> Interesting to see this adjacent to news that non-participation is
>> about to be criminalized double of [snip]
>
> Rustom, All,
>
> This is a Python mailing list. Please keep the topics marginally
> on-topic. Thanks.

Is this a new rule? :-P

This list has always (well, at least for a decade or more) been
relatively forgiving of off-topic posts.
But we should at least label the
subject lines as off-topic, which I admit I forgot to do on my previous
post. Sorry.

-- 
“You are deluded if you think software engineers who can't write
operating systems or applications without security holes, can write
virtualization layers without security holes.” -- Theo de Raadt

From steve+comp.lang.python at pearwood.info Tue Jul 25 03:34:07 2017
From: steve+comp.lang.python at pearwood.info (Steven D'Aprano)
Date: 25 Jul 2017 07:34:07 GMT
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <85pocp3so7.fsf@benfinney.id.au>
Message-ID: <5976f46f$0$2878$c3e8da3$76491128@news.astraweb.com>

On Tue, 25 Jul 2017 17:02:48 +1000, Ben Finney wrote:

> Ben Finney writes:
>
>> Having to make another name for the same object, merely to avoid some
>> surprising behaviour, is IMO un-Pythonic.
>
> I suppose my objection is rooted in the fact this behaviour is implicit;
> my code has not issued a ‘del’ statement, and so I don't expect one; yet
> it occurs implicitly. This violates the Zen of Python.

Technically, *all* garbage collection is implicit. You don't have to
explicitly delete your local variables when you return from a function,
they are implicitly deleted when they go out of scope.

So consider this as a de-facto "the except clause "as name" variable is
treated *as if* it exists in its own scope.

I agree that the behaviour of except is a little surprising, but that's
(according to the core devs) the lesser of two evils. The alternative is
a memory leak when the traceback keeps data alive that you didn't expect.

-- 
“You are deluded if you think software engineers who can't write
operating systems or applications without security holes, can write
virtualization layers without security holes.”
-- Theo de Raadt

From yasirrbadamasi at gmail.com Tue Jul 25 03:48:25 2017
From: yasirrbadamasi at gmail.com (yasirrbadamasi at gmail.com)
Date: Tue, 25 Jul 2017 00:48:25 -0700 (PDT)
Subject: I am new here and i need your help please
Message-ID:

I have never execute any program before using python and a task was given to me by my teacher
~ to write a python program to print my details and store in a third party variables.
~ the details include name, age, height, status. so please your help is highly needed, thanks

From ben+python at benfinney.id.au Tue Jul 25 04:10:00 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 25 Jul 2017 18:10:00 +1000
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <85pocp3so7.fsf@benfinney.id.au> <5976f46f$0$2878$c3e8da3$76491128@news.astraweb.com>
Message-ID: <85lgnd3pk7.fsf@benfinney.id.au>

Steven D'Aprano writes:

> On Tue, 25 Jul 2017 17:02:48 +1000, Ben Finney wrote:
>
> > I suppose my objection is rooted in the fact this behaviour is
> > implicit; my code has not issued a ‘del’ statement, and so I don't
> > expect one; yet it occurs implicitly. This violates the Zen of
> > Python.
>
> Technically, *all* garbage collection is implicit.

The surprising behaviour is not garbage collection; I hadn't noticed
anything different with garbage collection. The surprising behaviour is
the unasked-for removal of a name binding from the context, that was
bound earlier in the same context.

> So consider this as a de-facto "the except clause "as name" variable
> is treated *as if* it exists in its own scope.

It is not, though. “as if it exists in its own scope” would mean that it
should not affect the earlier binding in a different scope.

That's not what happens; an earlier binding is re-bound (explicitly, by
‘as target’) and then *implicitly* removed so the name is unbound.
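The rebind-then-unbind sequence described here can be checked directly. This is a small sketch assuming Python 3; the helper name `binding_survives` and its `raise_error` parameter are hypothetical, introduced only for the illustration. It also shows that the earlier binding is only lost when the handler actually runs.

```python
def binding_survives(raise_error):
    """Report whether the name 'exc' is still bound after the try statement."""
    exc = None  # bound before the try statement, in the same scope
    try:
        if raise_error:
            1 / 0
    except ZeroDivisionError as exc:  # rebinds the same local name
        pass  # on leaving this block, Python 3 deletes 'exc'
    return "exc" in locals()


print(binding_survives(False))  # handler never ran, so the binding is untouched
print(binding_survives(True))   # handler ran, so the name was unbound afterwards
```

The two calls differ only in whether the `except` block executes, which makes the implicit deletion the sole cause of the missing name.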
> I agree that the behaviour of except is a little surprising, but
> that's (according to the core devs) the lesser of two evils. The
> alternative is a memory leak when the traceback keeps data alive that
> you didn't expect.

I think those are not the only two options (the "except clause has its
own scope" behaviour is an option that could have been chosen, for
example).

At this point the behaviour and motivation are clear, having been
revealed; I disagree with the behaviour, and think the motivation could
have been met better.

-- 
 \     "If you always want the latest and greatest, then you have to |
  `\      buy a new iPod at least once a year." -- Steve Jobs, MSNBC |
_o__)                                          interview 2006-05-25 |
Ben Finney

From steve+comp.lang.python at pearwood.info Tue Jul 25 04:32:45 2017
From: steve+comp.lang.python at pearwood.info (Steven D'Aprano)
Date: 25 Jul 2017 08:32:45 GMT
Subject: I am new here and i need your help please
References:
Message-ID: <5977022d$0$2878$c3e8da3$76491128@news.astraweb.com>

On Tue, 25 Jul 2017 00:48:25 -0700, yasirrbadamasi wrote:

> I have never execute any program before using python and a task was
> given to me by my teacher ~ to write a python program to print my
> details and store in a third party variables.
> ~ the details include name, age, height, status. so please your help is
> highly needed, thanks

What part of the assignment don't you understand?

-- 
"You are deluded if you think software engineers who can't write operating
systems or applications without security holes, can write virtualization
layers without security holes."
-- Theo de Raadt

From steve+comp.lang.python at pearwood.info Tue Jul 25 04:33:06 2017
From: steve+comp.lang.python at pearwood.info (Steven D'Aprano)
Date: 25 Jul 2017 08:33:06 GMT
Subject: OT was Re: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>
Message-ID: <59770242$0$2878$c3e8da3$76491128@news.astraweb.com>

On Mon, 24 Jul 2017 23:47:17 -0700, Rustom Mody wrote:

[...]
> Bipartisan-US-Bill-Moves-to-Criminalize-BDS-Support-20170720-0001.html

Heh, at first I read that as a bill to criminalise BSD support :-)

-- 
"You are deluded if you think software engineers who can't write operating
systems or applications without security holes, can write virtualization
layers without security holes." -- Theo de Raadt

From larry at hastings.org Tue Jul 25 04:37:02 2017
From: larry at hastings.org (Larry Hastings)
Date: Tue, 25 Jul 2017 01:37:02 -0700
Subject: [RELEASED] Python 3.4.7rc1 and Python 3.5.4rc1 are now available
Message-ID:

On behalf of the Python development community and the Python 3.4 and
Python 3.5 release teams, I'm relieved to announce the availability of
Python 3.4.7rc1 and Python 3.5.4rc1.

Python 3.4 is now in "security fixes only" mode. This is the final stage
of support for Python 3.4. Python 3.4 now only receives security fixes,
not bug fixes, and Python 3.4 releases are source code only--no more
official binary installers will be produced.

Python 3.5.4 will be the final 3.5 release in "bug fix" mode. After
3.5.4 is released, Python 3.5 will also move into "security fixes mode".

Both these releases are "release candidates". They should not be
considered the final releases, although the final releases should
contain only minor differences. Python users are encouraged to test
with these releases and report any problems they encounter.
You can find Python 3.4.7rc1 here:

    https://www.python.org/downloads/release/python-347rc1/

And you can find Python 3.5.4rc1 here:

    https://www.python.org/downloads/release/python-354rc1/

Python 3.4.7 final and Python 3.5.4 final are both scheduled for release
on August 6th, 2017.

Happy Pythoning,

//arry/

From rosuav at gmail.com Tue Jul 25 04:43:15 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 25 Jul 2017 18:43:15 +1000
Subject: Python 3 removes name binding from outer scope
In-Reply-To: <85lgnd3pk7.fsf@benfinney.id.au>
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <85pocp3so7.fsf@benfinney.id.au> <5976f46f$0$2878$c3e8da3$76491128@news.astraweb.com> <85lgnd3pk7.fsf@benfinney.id.au>
Message-ID:

On Tue, Jul 25, 2017 at 6:10 PM, Ben Finney wrote:
> I think those are not the only two options (the "except clause has its
> own scope" behaviour is an option that could have been chosen, for
> example).
>
> At this point the behaviour and motivation are clear, having been
> revealed; I disagree with the behaviour, and think the motivation could
> have been met better.

Having a new scope introduced by a non-function would also be
surprising. List comprehensions (as of Py3) introduce a new scope, but
they do so by being wrapped in a function. Would you expect that an
except block is wrapped in a function too?

And whether or not it's a function, the fact would remain that except
blocks would differ from other name-binding blocks (with/as and for).
There's a reasonable argument for with/as to introduce a new scope,
but a for loop definitely shouldn't, so there's going to be a
difference there.

And if it's its own function, you'd need to use nonlocal/global inside
it to make any other changes. That would be a royal pain. Consider:

    try:
        raw_input
    except NameError:
        global raw_input # blech
        raw_input = input

I'm not actually sure what happens if you use a global declaration at
top level. Is it ignored?
Is it an error?

One potential solution would be to have *just the exception name* in a
sort of magic scope. But I'm not sure how that would go.

TBH, I rather suspect that the exact semantics here aren't too
critical. If there's a solid argument for some variation on the
current system, one that still prevents the refloop between the
exception and a function's variables, it'd be worth discussing as a
proposal. At very least, it's worth raising it on -ideas... worst
case, it gets shut down quickly.

ChrisA

From rosuav at gmail.com Tue Jul 25 04:43:59 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 25 Jul 2017 18:43:59 +1000
Subject: OT was Re: Python 3 removes name binding from outer scope
In-Reply-To: <59770242$0$2878$c3e8da3$76491128@news.astraweb.com>
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com> <59770242$0$2878$c3e8da3$76491128@news.astraweb.com>
Message-ID:

On Tue, Jul 25, 2017 at 6:33 PM, Steven D'Aprano wrote:
> On Mon, 24 Jul 2017 23:47:17 -0700, Rustom Mody wrote:
>
>
> [...]
>> Bipartisan-US-Bill-Moves-to-Criminalize-BDS-Support-20170720-0001.html
>
>
> Heh, at first I read that as a bill to criminalise BSD support :-)
>

I spluttered my drink on reading that. Good job Steven!

ChrisA

From __peter__ at web.de Tue Jul 25 04:59:43 2017
From: __peter__ at web.de (Peter Otten)
Date: Tue, 25 Jul 2017 10:59:43 +0200
Subject: how to group by function if one of the group has relationship with another one in the group?
References:
Message-ID:

Ho Yeung Lee wrote:

> from itertools import groupby
>
> testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> def isneighborlocation(lo1, lo2):
>     if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]:
>         return 1
>     elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]:
>         return 1
>     else:
>         return 0
>
> groupda = groupby(testing1, isneighborlocation)
> for key, group1 in groupda:
>     print key
>     for thing in group1:
>         print thing
>
> expect output 3 group
> group1 [(1,1)]
> group2 [(2,3),(2,4)]
> group3 [(3,5),(3,6),(4,6)]

groupby() calculates the key value from the current item only, so there's no
"natural" way to apply it to your problem.

Possible workarounds are to feed it pairs of neighbouring items (think
zip()) or a stateful key function. Below is an example of the latter:

$ cat sequential_group_class.py
from itertools import groupby

missing = object()

class PairKey:
    def __init__(self, continued):
        self.prev = missing
        self.continued = continued
        self.key = False

    def __call__(self, item):
        if self.prev is not missing and not self.continued(self.prev, item):
            self.key = not self.key
        self.prev = item
        return self.key

def isneighborlocation(lo1, lo2):
    x1, y1 = lo1
    x2, y2 = lo2
    dx = x1 - x2
    dy = y1 - y2
    return dx*dx + dy*dy <= 1

items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]

for key, group in groupby(items, key=PairKey(isneighborlocation)):
    print key, list(group)

$ python sequential_group_class.py
False [(1, 1)]
True [(2, 3), (2, 4)]
False [(3, 5), (3, 6), (4, 6)]

From tjol at tjol.eu Tue Jul 25 07:38:15 2017
From: tjol at tjol.eu (Thomas Jollans)
Date: Tue, 25 Jul 2017 13:38:15 +0200
Subject: Python 3 removes name binding from outer scope
In-Reply-To:
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com>
Message-ID: <577253d8-5749-65ce-335b-f2a51cd9969d@tjol.eu>

On 2017-07-25 09:28, Chris Angelico wrote:
> On Tue, Jul 25, 2017 at 4:47 PM, Rustom Mody
wrote:
>> +1
>> You can call it bug or bug-promoted-to-feature :D
>>
>> I call it surprising because I dont know of any other case in python where
>> a delete is user-detectable
>> ie python's delete of objects always works quietly behind the scenes whereas
>> this adds a leakiness to the memory-management abstraction
>
> You're conflating two things. There's nothing here that forces the
> destruction of an object; the name is simply unbound. You can confirm
> this from the disassembly in CPython:
>
> >>> import dis
> >>> def f():
> ...     try: 1/0
> ...     except Exception as e: pass
> ...
> >>> dis.dis(f)
>   2           0 SETUP_EXCEPT            12 (to 14)
>               2 LOAD_CONST               1 (1)
>               4 LOAD_CONST               2 (0)
>               6 BINARY_TRUE_DIVIDE
>               8 POP_TOP
>              10 POP_BLOCK
>              12 JUMP_FORWARD            34 (to 48)
>
>   3     >>   14 DUP_TOP
>              16 LOAD_GLOBAL              0 (Exception)
>              18 COMPARE_OP              10 (exception match)
>              20 POP_JUMP_IF_FALSE       46
>              22 POP_TOP
>              24 STORE_FAST               0 (e)
>              26 POP_TOP
>              28 SETUP_FINALLY            6 (to 36)
>              30 POP_BLOCK
>              32 POP_EXCEPT
>              34 LOAD_CONST               0 (None)
>         >>   36 LOAD_CONST               0 (None)
>              38 STORE_FAST               0 (e)
>              40 DELETE_FAST              0 (e)
>              42 END_FINALLY
>              44 JUMP_FORWARD             2 (to 48)
>         >>   46 END_FINALLY
>         >>   48 LOAD_CONST               0 (None)
>              50 RETURN_VALUE
> >>>
>
> It actually does the equivalent of:
>
>     finally:
>         e = None

I wonder why it would bother to load None... (as someone not very
familiar with Python at the bytecode level)

-- Thomas

>         del e
>
> In the normal case, this will leave the original exception loose and
> garbage-collectable, but if it's been bound to another name, it won't.
>
> ChrisA
>

From alister.ware at ntlworld.com Tue Jul 25 11:03:00 2017
From: alister.ware at ntlworld.com (alister)
Date: Tue, 25 Jul 2017 15:03:00 GMT
Subject: Recent Spam problem
References:
Message-ID:

On Tue, 25 Jul 2017 17:29:56 +1000, Chris Angelico wrote:

> On Tue, Jul 25, 2017 at 4:50 PM, wrote:
>> I see two solutions:
>>
>> 1.
We build new architecture or adapt current one so it's more like a
>> blockchain, have to calculate some hash before being able to post and
>> upload and such.
>>
>> or
>>
>> 2. We counter-attack by installing a special tool, so we all denial of
>> service attack the source of the message, I am not sure if the source
>> is genuine information, what you make of it:
>>
>> NNTP-Posting-Host: 39.52.70.224
>>
>>
>> Now the first solution would require a lot of work.
>>
>> The second solution would be easy to do.
>>
>> My question to you is:
>>
>> What solution do you pick of any ? =D
>
> There are bad people in the world. I know! Let's all go and drop nuclear
> bombs on them. That'll fix the problem!
>
> OR... you could try just filtering it all out, and not stooping to their
> level.
>
> ChrisA

i say nuke em/ otherwise my /dev/null is going to need expanding ;-)

-- 
"To YOU I'm an atheist; to God, I'm the Loyal Opposition."
-- Woody Allen

From devilsgrin94 at gmail.com Tue Jul 25 13:01:11 2017
From: devilsgrin94 at gmail.com (Rahul Sircar)
Date: Tue, 25 Jul 2017 10:01:11 -0700 (PDT)
Subject: python file downloader not working
Message-ID:

So I recently tried to write a script using the urllib2 module.
Here is the code:

import urllib2
file = 'metasploitable-linux-2.0.0.zip'
url='https://downloads.sourceforge.net/project/metasploitable/Metasploitable2/metasploitable-linux-2.0.0.zip'
response = urllib2.urlopen(url)
fh=open(file,'w')
fh.write(response.read())
fh.close()

I am getting this error in the output.
Traceback (most recent call last):
  File "urllib_read.py", line 6, in <module>
    fh.write(response.read())
  File "E:\Python27\lib\socket.py", line 355, in read
    data = self._sock.recv(rbufsize)
  File "E:\Python27\lib\httplib.py", line 597, in read
    s = self.fp.read(amt)
  File "E:\Python27\lib\socket.py", line 384, in read
    data = self._sock.recv(left)
  File "E:\Python27\lib\ssl.py", line 766, in recv
    return self.read(buflen)
  File "E:\Python27\lib\ssl.py", line 653, in read
    v = self._sslobj.read(len)
socket.error: [Errno 10053] An established connection was aborted by the software in your host machine.

Could someone please help me out?

From bintacomputers at gmail.com Tue Jul 25 13:18:42 2017
From: bintacomputers at gmail.com (Umar Yusuf)
Date: Tue, 25 Jul 2017 10:18:42 -0700 (PDT)
Subject: Python BeautifulSoup extract html table cells that contains images and text
Message-ID: <4602c354-34b5-4962-b1d1-59d110c6cd36@googlegroups.com>

Hi all,

I need help extracting the table from this url...?

import requests  # needed for requests.get() below
from bs4 import BeautifulSoup

url = "https://www.marinetraffic.com/en/ais/index/ports/all/per_page:50"

headers = {'User-agent': 'Mozilla/5.0'}
raw_html = requests.get(url, headers=headers)

raw_data = raw_html.text
soup_data = BeautifulSoup(raw_data, "lxml")

td = soup_data.findAll('tr')[1:]

country = []

for data in td:
    col = data.find_all('td')
    country.append(col)

From eryksun at gmail.com Tue Jul 25 14:36:54 2017
From: eryksun at gmail.com (eryk sun)
Date: Tue, 25 Jul 2017 18:36:54 +0000
Subject: Python 3 removes name binding from outer scope
In-Reply-To:
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <85pocp3so7.fsf@benfinney.id.au> <5976f46f$0$2878$c3e8da3$76491128@news.astraweb.com> <85lgnd3pk7.fsf@benfinney.id.au>
Message-ID:

On Tue, Jul 25, 2017 at 8:43 AM, Chris Angelico wrote:
>
> I'm not actually sure what happens if you use a global declaration at
> top level. Is it ignored? Is it an error?

It isn't ignored, but it shouldn't make a difference since normally at
module level locals and globals are the same. It makes a difference in
an exec() that uses separate locals and globals dicts. For example:

    >>> exec('print(x)', {'x':'G'}, {'x':'L'})
    L
    >>> exec('global x; print(x)', {'x':'G'}, {'x':'L'})
    G

From best_lay at yahoo.com Tue Jul 25 15:31:01 2017
From: best_lay at yahoo.com (Wildman)
Date: Tue, 25 Jul 2017 14:31:01 -0500
Subject: Recent Spam problem
References: <871sp5hx6g.fsf@nightsong.com>
Message-ID:

On Mon, 24 Jul 2017 23:01:43 -0700, Paul Rubin wrote:

> Rustom Mody writes:
>> Since spammers are unlikely to be choosy about whom they spam:
>> Tentative conclusion: Something about the USENET-ML gateway is more leaky
>> out here than elsewhere
>
> It could be a sort-of DOS attack by some disgruntled idiot. I wonder if
> the email address in those spam posts actually works. Then there's the
> weird Italian rants. No idea about those.

The posts are being made through Google Groups. Forwarding
the posts with headers to groups-abuse at google.com might help.
I have sent a couple but if everyone here did it maybe Google
will pay attention and do something. The same goes for our
Italian "friend".

-- 
GNU/Linux user #557453
The cow died so I don't need your bull!

From ian.g.kelly at gmail.com Tue Jul 25 15:52:46 2017
From: ian.g.kelly at gmail.com (Ian Kelly)
Date: Tue, 25 Jul 2017 13:52:46 -0600
Subject: Python 3 removes name binding from outer scope
In-Reply-To: <577253d8-5749-65ce-335b-f2a51cd9969d@tjol.eu>
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com> <577253d8-5749-65ce-335b-f2a51cd9969d@tjol.eu>
Message-ID:

On Tue, Jul 25, 2017 at 5:38 AM, Thomas Jollans wrote:
> On 2017-07-25 09:28, Chris Angelico wrote:
>> It actually does the equivalent of:
>>
>>     finally:
>>         e = None
>
> I wonder why it would bother to load None...
(as someone not very
> familiar with Python at the bytecode level)

If I may hazard a guess, it's because simply deleting the variable
will raise a NameError if the variable is not already bound, e.g. if
there was no exception and only the finally block is being executed,
or if the programmer already deleted it. Assigning None to ensure the
variable is bound is likely faster than testing it.

From rosuav at gmail.com Tue Jul 25 16:06:02 2017
From: rosuav at gmail.com (Chris Angelico)
Date: Wed, 26 Jul 2017 06:06:02 +1000
Subject: Python 3 removes name binding from outer scope
In-Reply-To:
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <85pocp3so7.fsf@benfinney.id.au> <5976f46f$0$2878$c3e8da3$76491128@news.astraweb.com> <85lgnd3pk7.fsf@benfinney.id.au>
Message-ID:

On Wed, Jul 26, 2017 at 4:36 AM, eryk sun wrote:
> On Tue, Jul 25, 2017 at 8:43 AM, Chris Angelico wrote:
>>
>> I'm not actually sure what happens if you use a global declaration at
>> top level. Is it ignored? Is it an error?
>
> It isn't ignored, but it shouldn't make a difference since normally at
> module level locals and globals are the same. It makes a difference in
> an exec() that uses separate locals and globals dicts. For example:
>
>     >>> exec('print(x)', {'x':'G'}, {'x':'L'})
>     L
>     >>> exec('global x; print(x)', {'x':'G'}, {'x':'L'})
>     G

Thanks. Of course, that doesn't change the fact that it'll look very
odd - but at least it won't cause a problem.

Now, if you wanted to write Py2/Py3 compatibility code inside a
function, you'd have issues, because you can't use nonlocal in Py2...
but that's a separate issue.

Hmm. Aside from messing around with exec, is there any way to have a
local and a global with the same name, and use the global? You could
do it with a nonlocal:

x = "G"
def f():
    x = "L"
    def g():
        global x
        print(x)
    g()

but is there any way to engineer this actual situation without exec?

ChrisA

From best_lay at yahoo.com Tue Jul 25 16:20:42 2017
From: best_lay at yahoo.com (Wildman)
Date: Tue, 25 Jul 2017 15:20:42 -0500
Subject: I am new here and i need your help please
References:
Message-ID:

On Tue, 25 Jul 2017 00:48:25 -0700, yasirrbadamasi wrote:

> I have never execute any program before using python and a task was given to me by my teacher
> ~ to write a python program to print my details and store in a third party variables.
> ~ the details include name, age, height, status. so please your help is highly needed, thanks

Read the material and write some code the best you can
and post it here. Someone will try to help you.
No one here is going to do your homework for you.

-- 
GNU/Linux user #557453
May the Source be with you.

From grant.b.edwards at gmail.com Tue Jul 25 17:44:22 2017
From: grant.b.edwards at gmail.com (Grant Edwards)
Date: Tue, 25 Jul 2017 21:44:22 +0000 (UTC)
Subject: Recent Spam problem
References: <871sp5hx6g.fsf@nightsong.com>
Message-ID:

On 2017-07-25, Wildman via Python-list wrote:

> The posts are being made through Google Groups. Forwarding
> the posts with headers to groups-abuse at google.com might help.

It never has in the past. I (and many others) have for years and years
been plonking all posts made through Google Groups. Trust me, you'll
not miss out on anything worthwhile. :)

> I have sent a couple but if everyone here did it maybe Google will
> pay attention and do something.

They won't.

Just configure your .score file (or bogofilter or spamassassin or
whatever) to throw out all posts that have a Message-ID: header field
that ends in 'googlegroups.com'.

That, grasshopper, is the path to serenity.

-- 
Grant Edwards               grant.b.edwards        Yow! I like your SNOOPY
                                  at               POSTER!!
                              gmail.com

From michael.stemper at gmail.com Tue Jul 25 18:45:06 2017
From: michael.stemper at gmail.com (Michael F.
Stemper)
Date: Tue, 25 Jul 2017 17:45:06 -0500
Subject: Recent Spam problem
In-Reply-To:
References:
Message-ID:

On 2017-07-25 10:03, alister wrote:
> On Tue, 25 Jul 2017 17:29:56 +1000, Chris Angelico wrote:
>> On Tue, Jul 25, 2017 at 4:50 PM, wrote:

>>> I see two solutions:
>>>
>>> 1. We build new architecture or adapt current one so it's more like a
>>> blockchain, have to calculate some hash before being able to post and
>>> upload and such.

>>> 2. We counter-attack by installing a special tool, so we all denial of
>>> service attack the source of the message, I am not sure if the source
>>> is genuine information, what you make of it:

>> There are bad people in the world. I know! Let's all go and drop nuclear
>> bombs on them. That'll fix the problem!
>>
>> OR... you could try just filtering it all out, and not stooping to their
>> level.
>
> i say nuke em/ otherwise my /dev/null is going to need expanding ;-)

I just got a new nulldev from Data General about a week back:

username at hostname$ ll /dev/null
crw-rw-rw- 1 root root 1, 3 Jul 16 15:22 /dev/null
username at hostname$

Its performance is awesome, and it comes with a powerful GUI for
configuring and customizing.

-- 
Michael F. Stemper
Indians scattered on dawn's highway bleeding;
Ghosts crowd the young child's fragile eggshell mind.

From steve+python at pearwood.info Tue Jul 25 20:20:55 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Wed, 26 Jul 2017 10:20:55 +1000
Subject: Python 3 removes name binding from outer scope
References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <85pocp3so7.fsf@benfinney.id.au> <5976f46f$0$2878$c3e8da3$76491128@news.astraweb.com> <85lgnd3pk7.fsf@benfinney.id.au>
Message-ID: <5977e069$0$1596$c3e8da3$5496439d@news.astraweb.com>

On Wed, 26 Jul 2017 06:06 am, Chris Angelico wrote:

> Hmm. Aside from messing around with exec, is there any way to have a
> local and a global with the same name, and use the global?

Use globals()['name'].

There's no way to do it with regular name lookups. This doesn't work:


spam = 'outside'

def func():
    spam = 'inside'
    assert spam == 'inside'
    global spam
    assert spam == 'outside'


The problem is that global is a declaration, not an executable statement, and
if it is *anywhere* inside a function, it is deemed to apply to the entire
function.

Apart from that, you could try hacking the byte-code. It should be a matter of
just replacing LOAD_FAST with LOAD_GLOBAL, I think.


-- 
Steve
"Cheer up," they said, "things could be worse." So I cheered up, and sure
enough, things got worse.

From nomail at com.invalid Wed Jul 26 02:05:37 2017
From: nomail at com.invalid (ast)
Date: Wed, 26 Jul 2017 08:05:37 +0200
Subject: how to group by function if one of the group has relationship with another one in the group?
In-Reply-To:
References:
Message-ID: <59783134$0$3605$426a74cc@news.free.fr>

"Ho Yeung Lee" wrote in message
news:ef0bd11a-bf55-42a2-b016-d93f3b831860 at googlegroups.com...
> from itertools import groupby
>
> testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> def isneighborlocation(lo1, lo2):
>     if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]:
>         return 1
>     elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]:
>         return 1
>     else:
>         return 0
>
> groupda = groupby(testing1, isneighborlocation)
> for key, group1 in groupda:
>     print key
>     for thing in group1:
>         print thing
>
> expect output 3 group
> group1 [(1,1)]
> group2 [(2,3),(2,4)]
> group3 [(3,5),(3,6),(4,6)]

It's not clear to me how you build the groups.

Why is (1,1) not in group2, since (1,1) is a neighbor to both (2,3) and
(2,4)?
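
The question about (1,1) can be checked directly. The sketch below is written for Python 3 and restates the original poster's predicate as a single boolean expression (assuming the 1/0 return values were meant as true/false). It then evaluates the predicate on each pair of consecutive items; `consecutive_pairs` is a name introduced here for illustration.

```python
def isneighborlocation(lo1, lo2):
    # The original poster's test, collapsed into one boolean expression.
    return (abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]
            or abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0])


testing1 = [(1, 1), (2, 3), (2, 4), (3, 5), (3, 6), (4, 6)]

# Compare every item with its successor in the list.
consecutive_pairs = [(a, b, isneighborlocation(a, b))
                     for a, b in zip(testing1, testing1[1:])]

for a, b, flag in consecutive_pairs:
    print(a, b, flag)
```

With this predicate every consecutive pair counts as a neighbour, including (1,1) and (2,3), so the predicate as posted cannot by itself produce the three expected groups.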
From dieter at handshake.de Wed Jul 26 02:40:48 2017
From: dieter at handshake.de (dieter)
Date: Wed, 26 Jul 2017 08:40:48 +0200
Subject: I am new here and i need your help please
References:
Message-ID: <87h8xz4s5r.fsf@handshake.de>

yasirrbadamasi at gmail.com writes:

> I have never execute any program before using python and a task was given to me by my teacher

I suggest starting by reading the Python tutorial:
"https://docs.python.org/3/tutorial/index.html".

From dieter at handshake.de Wed Jul 26 02:46:32 2017
From: dieter at handshake.de (dieter)
Date: Wed, 26 Jul 2017 08:46:32 +0200
Subject: python file downloader not working
References:
Message-ID: <87d18n4rw7.fsf@handshake.de>

Rahul Sircar writes:

> So I recently tried to write a script using urllib2 module.
> Here is the code below:
> import urllib2
> file = 'metasploitable-linux-2.0.0.zip'
> url='https://downloads.sourceforge.net/project/metasploitable/Metasploitable2/metasploitable-linux-2.0.0.zip'
> response = urllib2.urlopen(url)
> fh=open(file,'w')
> fh.write(response.read())
> fh.close()
>
> I am getting this error in the output.
> Traceback (most recent call last):
> ...
> v = self._sslobj.read(len)
> socket.error: [Errno 10053] An established connection was aborted by the software in your host machine.

This error tells you that the server (i.e. "downloads.sourceforge.net")
has closed the connection. This can happen occasionally.

Should the error persist, something may be wrong/special with
your network/firewall. You may try an alternative way to
download the file. On my "*nix" platform, I would use
the command "wget" for a download trial. You could also try a browser.
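
Separately from the connection error, a large archive is usually easier to handle when it is streamed to disk in chunks and written in binary mode. The sketch below uses Python 3's `urllib.request` rather than the `urllib2` module from the post. The helper names `copy_stream` and `download`, the chunk size and the timeout are assumptions chosen for illustration. It does not address the cause of Errno 10053, which this thread does not establish.

```python
import shutil
import urllib.request


def copy_stream(src, dst, chunk_size=64 * 1024):
    """Copy a readable binary stream to a writable one, chunk by chunk."""
    shutil.copyfileobj(src, dst, chunk_size)


def download(url, dest, timeout=30):
    """Stream the response body for `url` into the file `dest`."""
    # "wb" matters on Windows: text mode would rewrite newline bytes
    # and corrupt a zip archive.
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(dest, "wb") as fh:
        copy_stream(response, fh)
```

Usage would be along the lines of `download(url, 'metasploitable-linux-2.0.0.zip')` with the URL from the original post; streaming avoids holding the whole archive in memory, as `response.read()` does.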
From rustompmody at gmail.com Wed Jul 26 02:50:57 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Tue, 25 Jul 2017 23:50:57 -0700 (PDT)
Subject: I am new here and i need your help please
In-Reply-To:
References: <87h8xz4s5r.fsf@handshake.de>
Message-ID: <5a55559f-3993-483c-8a13-40675dfea66b@googlegroups.com>

On Wednesday, July 26, 2017 at 12:11:27 PM UTC+5:30, dieter wrote:
> yasirrbadamasi:
>
> > I have never execute any program before using python and a task was given to me by my teacher
>
> I suggest to start by reading the Python tutorial:
> "https://docs.python.org/3/tutorial/index.html".

I was about to say that...

Some additions:
You will also need to have python installed on your computer.
Have you done that? Do you need help with that?
Start here and if any issue report with details including your OS
(windows/linux/Mac):
https://www.python.org/downloads/

After that, for a beginner the list below may be more helpful than this one
[recommended]:
https://mail.python.org/mailman/listinfo/tutor

From nomail at com.invalid Wed Jul 26 02:58:03 2017
From: nomail at com.invalid (ast)
Date: Wed, 26 Jul 2017 08:58:03 +0200
Subject: Reading a random element from a set
Message-ID: <59783d7e$0$15349$426a74cc@news.free.fr>

Hello

random.choice on a set doesn't work because sets are
not indexable

so I found nothing better than taking an element and
putting it back

a = {5, 7, 8, 3, 0, 8, 1, 15, 16, 34, 765443}
elt = a.pop()
a.add(elt)

any better idea, in a single instruction ?
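
A minimal sketch of one way to draw a uniformly random element from a set, assuming Python 3: copy the set into a sequence and let `random.choice` index it. The helper name `random_element` is made up here. The copy costs O(n) per call, so for repeated draws from an unchanging set it is cheaper to build the sequence once.

```python
import random


def random_element(s, rng=random):
    """Return a uniformly chosen element of the non-empty set `s`."""
    if not s:
        raise ValueError("cannot choose from an empty set")
    # random.choice needs an indexable sequence, so copy the set first.
    return rng.choice(tuple(s))


a = {5, 7, 8, 3, 0, 8, 1, 15, 16, 34, 765443}
picked = random_element(a)
print(picked in a)
```

`random.sample(a, 1)[0]` also works on older versions, but passing a set to `random.sample` was deprecated in Python 3.9 and raises `TypeError` from 3.11, so the explicit copy is the more portable form.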
From steve+comp.lang.python at pearwood.info Wed Jul 26 03:16:59 2017
From: steve+comp.lang.python at pearwood.info (Steven D'Aprano)
Date: 26 Jul 2017 07:16:59 GMT
Subject: Reading a random element from a set
References: <59783d7e$0$15349$426a74cc@news.free.fr>
Message-ID: <597841eb$0$2878$c3e8da3$76491128@news.astraweb.com>

On Wed, 26 Jul 2017 08:58:03 +0200, ast wrote:

> Hello
>
> random.choice on a set doesn't work because sets are not indexable
>
> so I found nothing better than taking an element and puting it back
>
> a = {5, 7, 8, 3, 0, 8, 1, 15, 16, 34, 765443}
> elt = a.pop()
> a.add(elt)


That's not *random*, it is arbitrary but predictable. If you try it twice
in a row, you will probably get the same element each time. (Depends on
the elements, but very likely.)

>>> a = {5, 7, 8, 3, 0, 8, 1, 15, 16, 34, 765443}
>>> elt = a.pop()
>>> a.add(elt)
>>> elt
0
>>> a.pop()  # likely to be zero again
0


> any better idea, in a single instruction ?

If you know the set is not empty, you can do:

element = next(iter(a))

but that has the same problem as above: the choice won't be random, and
each time you do it you'll get the same element:

>>> next(iter(a))  # after popping zero and not adding it again
1
>>> next(iter(a))
1


If you need a random element, best way is:


>>> random.choice(list(a))
8
>>> random.choice(list(a))
15
>>> random.choice(list(a))
16


-- 
"You are deluded if you think software engineers who can't write operating
systems or applications without security holes, can write virtualization
layers without security holes." -- Theo de Raadt

From mail at timgolden.me.uk Wed Jul 26 03:28:50 2017
From: mail at timgolden.me.uk (Tim Golden)
Date: Wed, 26 Jul 2017 08:28:50 +0100
Subject: Recent Spam problem
In-Reply-To:
References:
Message-ID: <73bf4184-4b2c-0c48-3394-48ed7c83b513@timgolden.me.uk>

On 25/07/2017 06:13, Rustom Mody wrote:
> Of late there has been an explosion of spam
> Thought it was only a google-groups (USENET?)
issue and would be barred from the mailing list.
>
> But then find its there in the mailing list archives as well
> Typical example: https://mail.python.org/pipermail/python-list/2017-July/724085.html
>
> What gives??

Well, just for clarification: the spam measures on the list are at least
partly manual. In this case, the spammer started to use a different
address from the one we were trapping so it took until one of us (me, in
this case) spotted the incoming spam before we were able to block it.

I almost never look at the GG mirror (or Usenet) so it wasn't until the
post which started this thread that I realised just how much spam is
being thrown at the newsgroup.

In case it wasn't clear to anyone: GG is actually a gateway to
comp.lang.python which presents itself as a mailing list, while gmane
(while it's still running) is a gateway to the python-list which
presents itself as a newsgroup.

TJG

From nomail at com.invalid Wed Jul 26 03:49:22 2017
From: nomail at com.invalid (ast)
Date: Wed, 26 Jul 2017 09:49:22 +0200
Subject: Reading a random element from a set
In-Reply-To: <597841eb$0$2878$c3e8da3$76491128@news.astraweb.com>
References: <59783d7e$0$15349$426a74cc@news.free.fr> <597841eb$0$2878$c3e8da3$76491128@news.astraweb.com>
Message-ID: <59784986$0$15330$426a74cc@news.free.fr>

"Steven D'Aprano" wrote in message
news:597841eb$0$2878$c3e8da3$76491128 at news.astraweb.com...
> On Wed, 26 Jul 2017 08:58:03 +0200, ast wrote:
>
>> Hello
>>
>> random.choice on a set doesn't work because sets are not indexable
>>
>> so I found nothing better than taking an element and puting it back
>>
>> a = {5, 7, 8, 3, 0, 8, 1, 15, 16, 34, 765443}
>> elt = a.pop()
>> a.add(elt)
>
>
> That's not *random*, it is arbitrary but predictable. If you try it twice
> in a row, you will probably get the same element each time. (Depends on
> the elements, but very likely.)
>
> If you need a random element, best way is:
>
>
> >>> random.choice(list(a))
> 8
> >>> random.choice(list(a))
> 15
> >>> random.choice(list(a))
> 16
>

Thanks

random.sample(a, 1)[0]

works too

From __peter__ at web.de Wed Jul 26 03:54:56 2017
From: __peter__ at web.de (Peter Otten)
Date: Wed, 26 Jul 2017 09:54:56 +0200
Subject: Reading a random element from a set
References: <59783d7e$0$15349$426a74cc@news.free.fr>
Message-ID:

ast wrote:

> Hello
>
> random.choice on a set doesn't work because sets are
> not indexable
>
> so I found nothing better than taking an element and
> puting it back
>
> a = {5, 7, 8, 3, 0, 8, 1, 15, 16, 34, 765443}
> elt = a.pop()
> a.add(elt)
>
> any better idea, in a single instruction ?

>>> next(itertools.islice(a, random.randrange(len(a)), None))
0

While this gives you a random element I agree that it's not nice in every
other aspect.

From kunal123jamdade at gmail.com Wed Jul 26 07:52:47 2017
From: kunal123jamdade at gmail.com (Kunal Jamdade)
Date: Wed, 26 Jul 2017 17:22:47 +0530
Subject: Regular expression
Message-ID:

There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' .

I want to extract the last 4 characters. I tried different regex. but i am
not getting it right.

Can anyone suggest me how should i proceed.?

Regards,
Kunal

From johann.spies at gmail.com Wed Jul 26 08:00:01 2017
From: johann.spies at gmail.com (Johann Spies)
Date: Wed, 26 Jul 2017 14:00:01 +0200
Subject: Regular expression
In-Reply-To:
References:
Message-ID:

On 26 July 2017 at 13:52, Kunal Jamdade wrote:
> There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' .
>
> I want to extract the last 4 characters. I tried different regex. but i am
> not getting it right.
>
> Can anyone suggest me how should i proceed.?

What have you tried?

Why do you need regular expression?

>>> s = 'first-324-True-rms-kjhg-Meterc639.html'

>>> s[-4:]
'html'

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.
(Psalm 63:3)

From paul.james.barry at gmail.com Wed Jul 26 08:05:23 2017
From: paul.james.barry at gmail.com (Paul Barry)
Date: Wed, 26 Jul 2017 13:05:23 +0100
Subject: Regular expression
In-Reply-To:
References:
Message-ID:

Is this what you are after?

>>> data = 'first-324-True-rms-kjhg-Meterc639.html'
>>> extension = data.find('.html')
>>> extension
33
>>> data[extension-4:extension]
'c639'

On 26 July 2017 at 13:00, Johann Spies wrote:

> On 26 July 2017 at 13:52, Kunal Jamdade wrote:
> > There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' .
> >
> > I want to extract the last 4 characters. I tried different regex. but i am
> > not getting it right.
> >
> > Can anyone suggest me how should i proceed.?
>
> What have you tried?
>
> Why do you need regular expression?
>
> >>> s = 'first-324-True-rms-kjhg-Meterc639.html'
> >>> s[-4:]
> 'html'
>
> Regards
> Johann
> --
> Because experiencing your loyal love is better than life itself,
> my lips will praise you. (Psalm 63:3)
> --
> https://mail.python.org/mailman/listinfo/python-list
>

--
Paul Barry, t: @barrypj - w: http://paulbarry.itcarlow.ie - e: paul.barry at itcarlow.ie
Lecturer, Computer Networking: Institute of Technology, Carlow, Ireland.

From gbs.deadeye at gmail.com Wed Jul 26 08:09:45 2017
From: gbs.deadeye at gmail.com (Andre Müller)
Date: Wed, 26 Jul 2017 12:09:45 +0000
Subject: Regular expression
In-Reply-To:
References:
Message-ID:

fname = 'first-324-True-rms-kjhg-Meterc639.html'

# with string manipulation
stem, suffix = fname.rsplit('.', 1)
print(stem[-4:])

# oo-style with str manipulation
import pathlib
path = pathlib.Path(fname)
print(path.stem[-4:])

From __peter__ at web.de Wed Jul 26 08:13:23 2017
From: __peter__ at web.de (Peter Otten)
Date: Wed, 26 Jul 2017 14:13:23 +0200
Subject: Regular expression
References:
Message-ID:

Kunal Jamdade wrote:

> There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' .
>
> I want to extract the last 4 characters. I tried different regex. but i am
> not getting it right.
>
> Can anyone suggest me how should i proceed.?

You don't need a regular expression:

>>> import os
>>> name = 'first-324-True-rms-kjhg-Meterc639.html'
>>> bname, ext = os.path.splitext(name)
>>> bname
'first-324-True-rms-kjhg-Meterc639'
>>> ext
'.html'
>>> bname[-4:]
'c639'
>>> bname[:-4]
'first-324-True-rms-kjhg-Meter'
>>> bname[:-4] + ext
'first-324-True-rms-kjhg-Meter.html'

Canned into a function:

>>> def strip_last_four(filename):
...     path, name = os.path.split(filename)
...     bname, ext = os.path.splitext(name)
...     return os.path.join(path, bname[:-4] + ext)
...
>>> strip_last_four("/foo/bar/baz1234.html")
'/foo/bar/baz.html'
>>> strip_last_four("/foo/bar/1234.html")
'/foo/bar/.html'
>>> strip_last_four("/foo/bar/bar.html")
'/foo/bar/.html'

From jussi.piitulainen at helsinki.fi Wed Jul 26 08:21:29 2017
From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen)
Date: Wed, 26 Jul 2017 15:21:29 +0300
Subject: Regular expression
References:
Message-ID:

Kunal Jamdade writes:

> There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' .
>
> I want to extract the last 4 characters. I tried different regex. but
> i am not getting it right.
>
> Can anyone suggest me how should i proceed.?

os.path.splitext(name) # most likely; also: os.path.split, os.path.join
name.rsplit('.', 1) # might do
name[-4:] # "last 4 characters"
name.endswith('.html') # is this what you really want?

This is not a regex job.
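
The suggestions in this thread can be pulled together into one small sketch. It assumes the goal is the last four characters of the name before the extension ('c639'), not the extension itself ('html'); the function names and the `count` parameter are made up for illustration and are not from any poster.

```python
import os
from pathlib import PurePosixPath


def last_chars_of_stem(filename, count=4):
    """Return the last `count` characters of the file name, extension excluded."""
    # basename drops any directory part; splitext separates '.html' from the rest.
    stem, _ext = os.path.splitext(os.path.basename(filename))
    return stem[-count:]


def last_chars_of_stem_pathlib(filename, count=4):
    """Same idea using pathlib; PurePosixPath keeps '/' handling predictable."""
    return PurePosixPath(filename).stem[-count:]


if __name__ == "__main__":
    name = 'first-324-True-rms-kjhg-Meterc639.html'
    print(last_chars_of_stem(name))          # characters before the extension
    print(last_chars_of_stem_pathlib(name))  # same result via pathlib
    print(name[-4:])                         # the literal "last 4 characters"
```

If the stem is shorter than `count`, slicing simply returns the whole stem, which may or may not be what is wanted.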
From monica.snow1 at gmail.com Wed Jul 26 11:05:52 2017 From: monica.snow1 at gmail.com (monica.snow1 at gmail.com) Date: Wed, 26 Jul 2017 08:05:52 -0700 (PDT) Subject: Basic python understanding Message-ID: Hi I am in need some understanding on how to become more knowledgeable while interviewing a candidate that requires Python and other (see below) experience for a position with Mass Mutual as Developer, Systems Design Engineer, Web Engineer Director, Web Engineer Consultant, and Full Stack Developer. What would be some questions and answers so I gain a strong understanding of my candidate that has Python experience? I preform the initial screen about 30 mins then pass them along to the hiring manager. I want to be able to communicate on a more technical level and show appreciation for his/her skill set. Other requirements: Node, java, javascript, ruby, MVC (Model-view-controller) frameworks, object modeling, database systems, jave-Swing and/or GWT Much respect, Monica 941-212-9085 From __peter__ at web.de Wed Jul 26 12:36:12 2017 From: __peter__ at web.de (Peter Otten) Date: Wed, 26 Jul 2017 18:36:12 +0200 Subject: Basic python understanding References: Message-ID: monica.snow1 at gmail.com wrote: > Hi I am in need some understanding on how to become more knowledgeable > while interviewing a candidate that requires Python and other (see below) > experience for a position with Mass Mutual as Developer, Systems Design > Engineer, Web Engineer Director, Web Engineer Consultant, and Full Stack > Developer. > > What would be some questions and answers so I gain a strong understanding > of my candidate that has Python experience? A short but instructive video tutorial: https://www.youtube.com/watch?v=uio1J2PKzLI > I preform the initial screen about 30 mins then pass them along to the > hiring manager. I want to be able to communicate on a more technical > level and show appreciation for his/her skill set. 
> > Other requirements: Node, java, javascript, ruby, MVC > (Model-view-controller) frameworks, object modeling, database systems, > jave-Swing and/or GWT https://pdos.csail.mit.edu/archive/scigen/rooter.pdf covers that and more. From irmen.NOSPAM at xs4all.nl Wed Jul 26 13:36:49 2017 From: irmen.NOSPAM at xs4all.nl (Irmen de Jong) Date: Wed, 26 Jul 2017 19:36:49 +0200 Subject: zipapp should not include temporary files? Message-ID: <5978d330$0$834$e4fe514c@news.xs4all.nl> Hi, when creating an executable zip file using the zipapp module, it's a little sad to see that no effort is done to filter out obvious temporary files: the resulting zipfile contains any *.pyc/pyo files and other things such as .git, .tox, .tmp folders. The documentation says "zip is created from the contents of the directory" so strictly speaking it's not wrong that it is doing this. However I think it is inconvenient, because we either have to clean out the directory manually first before zipping it, or afterwards, remove stuff from the resulting zipfile. What do you think? Should the zipapp module perhaps be improved to automatically skip obvious temporary files or perhaps allow to provide a filter function? Irmen From kryptxy at protonmail.com Wed Jul 26 14:06:42 2017 From: kryptxy at protonmail.com (Kryptxy) Date: Wed, 26 Jul 2017 14:06:42 -0400 Subject: Will my project be accepted in pypi? Message-ID: Hello, I have built a command-line torrent fetching tool. The tool fetches torrents from thepiratebay proxy sites, and display results in console. Its written in python3, and is completely open-source. Project link - https://github.com/kryptxy/torrench (You may give it a try :p) Question: (a) Can I publish such tool on pypi? (b) Also, should I publish it on pypi? I ask (b) as the packages I used from pypi (pip) were more like utility packages. They helped me in building my project. My project, on the other end is a full-fledged tool. So should I publish it on pypi? 
I ask (a) because the tool scraps TPB proxies, which aren't legal per se. So if I package and publish this tool, will it be accepted? If someone could assist me with this? Thank you. Regards, Rijul Gulati From jladasky at itu.edu Wed Jul 26 15:42:38 2017 From: jladasky at itu.edu (jladasky at itu.edu) Date: Wed, 26 Jul 2017 12:42:38 -0700 (PDT) Subject: Basic python understanding In-Reply-To: References: Message-ID: <8107abf1-69a0-484a-8c5d-ece5213cf7c7@googlegroups.com> On Wednesday, July 26, 2017 at 8:06:19 AM UTC-7, Monica Snow wrote: > Hi I am in need some understanding on how to become more knowledgeable while interviewing a candidate that requires Python and other (see below) experience... I just want to jump in to say thank you, Ms. Snow, for making an effort that far too few people in recruiting and HR seem to make. I hope that your preparations lead to good candidates for the job. From amorawski at magna-power.com Wed Jul 26 16:02:23 2017 From: amorawski at magna-power.com (Adam M) Date: Wed, 26 Jul 2017 13:02:23 -0700 (PDT) Subject: Basic python understanding In-Reply-To: References: Message-ID: On Wednesday, July 26, 2017 at 11:06:19 AM UTC-4, Monica Snow wrote: > Hi I am in need some understanding on how to become more knowledgeable while interviewing a candidate that requires Python and other (see below) experience for a position with Mass Mutual as Developer, Systems Design Engineer, Web Engineer Director, Web Engineer Consultant, and Full Stack Developer. > > What would be some questions and answers so I gain a strong understanding of my candidate that has Python experience? > > I preform the initial screen about 30 mins then pass them along to the hiring manager. I want to be able to communicate on a more technical level and show appreciation for his/her skill set. 
> > Other requirements: Node, java, javascript, ruby, MVC (Model-view-controller) frameworks, object modeling, database systems, jave-Swing and/or GWT > > Much respect, > > Monica > 941-212-9085 You can try these websites: https://www.toptal.com/python/interview-questions https://devskiller.com/screen-python-developers-skills-find-best-guide-recruitment/ Regards Adam M. From python at mrabarnett.plus.com Wed Jul 26 16:44:59 2017 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 26 Jul 2017 21:44:59 +0100 Subject: Basic python understanding In-Reply-To: References: Message-ID: <8ffbcae2-733c-f4f5-6b13-003a29806877@mrabarnett.plus.com> On 2017-07-26 20:04, Stefan Ram wrote: > monica.snow1 at gmail.com writes: >>Hi I am in need some understanding on how to become more >>knowledgeable while interviewing a candidate that requires >>Python > > The only noun preceding "that" is "candidate". So, are you > using "that" to refer to the candidate? > The word "that" is being used here as a relative pronoun. It's not wrong, although when referring to people it would be more common to use the word "who". (It's not strictly true that it's the _candidate_ who requires Python; it's the _company_ that has the requirement for the candidate to know Python.) [snip] From p.f.moore at gmail.com Wed Jul 26 18:03:07 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 26 Jul 2017 15:03:07 -0700 (PDT) Subject: zipapp should not include temporary files? In-Reply-To: <5978d330$0$834$e4fe514c@news.xs4all.nl> References: <5978d330$0$834$e4fe514c@news.xs4all.nl> Message-ID: <9c536053-b7a8-4829-92f3-6099e84a726d@googlegroups.com> On Wednesday, 26 July 2017 18:37:15 UTC+1, Irmen de Jong wrote: > when creating an executable zip file using the zipapp module, it's a little sad to see > that no effort is done to filter out obvious temporary files: the resulting zipfile > contains any *.pyc/pyo files and other things such as .git, .tox, .tmp folders. 
> > The documentation says "zip is created from the contents of the directory" so strictly > speaking it's not wrong that it is doing this. However I think it is inconvenient, > because we either have to clean out the directory manually first before zipping it, or > afterwards, remove stuff from the resulting zipfile. > > What do you think? Should the zipapp module perhaps be improved to automatically skip > obvious temporary files or perhaps allow to provide a filter function? Well, the expected usage of zipapp is to prepare a clean distribution of your application and then zip it up. You're not really expected to just zip a working environment. But I guess I can see that you might do some testing after preparing the staging directory for zipping, and that might leave unwanted files around. Allowing for a filter function seems like a reasonable suggestion. I'm a little less comfortable with guessing what's "obviously" temporary, as it's too easy to get such a judgement wrong. For example, why do you consider .pyc files as "temporary"? Someone might want to precompile the contents of a zipapp. If you want to create a feature request for a filter function on bugs.python.org and assign it to me, I'll take a look at it. It would only be available from the Python API, though, and only if the source is a directory (not if it's an existing zipfile, as those are simply copied as binary data). I don't think it's worth trying to design a command line API for this - parsing a "what to exclude" spec would significantly increase the complexity of the module for limited benefit. Paul. 
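
For readers wondering what such a filter function could look like, here is a minimal sketch. It assumes a Python version in which `zipapp.create_archive` accepts a `filter` argument (3.7 and later, so newer than this thread); the callback is given each path relative to the source directory and returns True to keep it. The skip lists and the names `keep_in_archive` and `build_app` are illustrative guesses, not a recommendation, and as noted above some people do want `.pyc` files in their archive.

```python
import pathlib
import zipapp

# Hypothetical skip lists; adjust to taste.
SKIP_DIRS = {".git", ".tox", ".tmp", "__pycache__"}
SKIP_SUFFIXES = {".pyc", ".pyo"}


def keep_in_archive(path: pathlib.Path) -> bool:
    """Return True if `path` (relative to the source dir) should be archived."""
    # Check every component, so files below a skipped directory are dropped too.
    if any(part in SKIP_DIRS for part in path.parts):
        return False
    return path.suffix not in SKIP_SUFFIXES


def build_app(source_dir, target):
    """Create an executable archive from source_dir, leaving out unwanted files."""
    # source_dir must contain a __main__.py, since no `main` argument is given.
    zipapp.create_archive(source_dir, target=target, filter=keep_in_archive)
```

Usage would be something like `build_app("myapp", "myapp.pyz")`. The filter only applies when the source is a directory.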
From yasirrbadamasi at gmail.com Wed Jul 26 18:05:11 2017 From: yasirrbadamasi at gmail.com (yasirrbadamasi at gmail.com) Date: Wed, 26 Jul 2017 15:05:11 -0700 (PDT) Subject: I am new here and i need your help please In-Reply-To: <5a55559f-3993-483c-8a13-40675dfea66b@googlegroups.com> References: <87h8xz4s5r.fsf@handshake.de> <5a55559f-3993-483c-8a13-40675dfea66b@googlegroups.com> Message-ID: <1448de37-73c2-4c11-b1b9-6c820e402913@googlegroups.com> Thank you for your responses, i really appreciate From rantingrickjohnson at gmail.com Wed Jul 26 18:42:20 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Wed, 26 Jul 2017 15:42:20 -0700 (PDT) Subject: Basic python understanding In-Reply-To: References: <8ffbcae2-733c-f4f5-6b13-003a29806877@mrabarnett.plus.com> Message-ID: <7fa043d7-ebf7-4220-94e9-239489880c25@googlegroups.com> On Wednesday, July 26, 2017 at 3:45:35 PM UTC-5, MRAB wrote: > On 2017-07-26 20:04, Stefan Ram wrote: > > monica.snow1 at gmail.com writes: > > > Hi I am in need some understanding on how to become more > > > knowledgeable while interviewing a candidate that > > > requires Python > > > > The only noun preceding "that" is "candidate". So, are you > > using "that" to refer to the candidate? > > > The word "that" is being used here as a relative pronoun. > It's not wrong, although when referring to people it would > be more common to use the word "who". (It's not strictly > true that it's the _candidate_ who requires Python; it's > the _company_ that has the requirement for the candidate to > know Python.) I don't think it is appropriate for us to criticize someone's sentence structure who is: (1) not a prolific contributor here, (2) not a spammer or troll, or (3) is politely asking for help. Any one of these excuses are acceptable _if_ the OP does not make a habit of composing these clumsy inquiries. 
Obviously, the sentence could have been structured more wisely, but perhaps the OP was in a hurry; or nervous; or perhaps her first language is not English. Of course, there is another scenario which can cause an otherwise competent communicator to sound as though they have a weak grasp of the english language, and that scenario is when too much mental effort is focused on the mechanical process of typing a message. For instance: "I try to ask good question, but i spend so much time thinking about fingers, that forsake sentence structure, i do" So if we are to criticize, we should at least offer benevolent solutions. Eh? "I am in need some understanding on how to become more knowledgeable while interviewing a candidate that requires Python" Yes. There are some flaws here. One of the most glaring is the impracticality of an interviewer "becoming more knowledgable while interviewing"[1]. The other, as Stefan pointed out, is the use of "that" as a pronoun. And while i dare not hazard a guess as to the impetus of the "impracticality element", i believe the word "that" was not meant as a pronoun (aka: the interviewee), but was a reference to the "interview process" itself. In any event, if Monica would be so kind as to allow me to paraphrase her intent[2], i would like to offer this slightly improved version, of which, i am rather fond. Hi everyone. I'm currently employed in the HR department of my firm, and, when i'm not making coffee for the execs, one of my job duties require that i screen applicants. Specifically, I have been asked to weed-out unqualified applicants before they move onto the "official interview" with my superiors, who, unsurprisingly enough, have absolutely no patience for unqualified bozos. But i digress... Only problem is, my knowledge in the specific prerequisites of: Python, Node, Java, Javascript, Ruby, MVC (Model-view- controller) Frameworks, -- ?_? 
-- Object Modeling, Database Systems, Java-Swing and/or GWT -- *DEEP BREATH* -- is severly lacking. So my question is: Are there any resources that you can recommend for which i can become qualified enough to accept or deny these prospects? Well, Monica, the answer to your question *IS* a question: How "qualified" do you want to be? If you think you can watch one or two five minute YouTube vids and adsorb enough knowledge to become qualified to make these difficult technical judgments, then, i would say you, and more disturbingly your superiors, grossly underestimate the depth of knowledge required here. If i were to hazard guess, i would say that by entrusting a non-technical person to screen the applications of highly technical persons, your superiors either (1) really don't want to hire anyone, or (2) want to pressure you into quitting. PS: I do apologize if my frank style of speaking is offensive to you. Please understand that i mean no offense. And although i doubt that a neophyte can master these subjects in a timely manner, i am in no way suggesting that you could not master them if you truly wanted to. [1] Which is like OJT on steroids! [2] And don't you worry Monica[3], i'm a _professional_ impersonator. O;-) [3] You gorgeous devil, you. From rosuav at gmail.com Wed Jul 26 19:12:33 2017 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 27 Jul 2017 09:12:33 +1000 Subject: Will my project be accepted in pypi? In-Reply-To: References: Message-ID: On Thu, Jul 27, 2017 at 4:06 AM, Kryptxy via Python-list wrote: > Hello, > I have built a command-line torrent fetching tool. The tool fetches torrents from thepiratebay proxy sites, and display results in console. Its written in python3, and is completely open-source. > > Project link - https://github.com/kryptxy/torrench > (You may give it a try :p) > > Question: (a) Can I publish such tool on pypi? > (b) Also, should I publish it on pypi? 
> > I ask (b) as the packages I used from pypi (pip) were more like utility packages. They helped me in building my project. My project, on the other end is a full-fledged tool. So should I publish it on pypi? > Yes, absolutely. There's nothing wrong with tools being on PyPI; for instance, youtube-dl is available there. > I ask (a) because the tool scraps TPB proxies, which aren't legal per se. So if I package and publish this tool, will it be accepted? Personally, I'd go ahead and publish it. Torrenting is not illegal any more than photocopying is; it's what you do with it that can be illegal. (Unless the proxies themselves are violating TPB's TOS or something - is that what you mean?) As long as your code isn't ITSELF doing anything illegal, it should be fine. ChrisA From jeremiah.dodds at gmail.com Wed Jul 26 20:50:23 2017 From: jeremiah.dodds at gmail.com (Jeremiah Dodds) Date: Wed, 26 Jul 2017 20:50:23 -0400 Subject: Basic python understanding In-Reply-To: (monica's message of "Wed, 26 Jul 2017 08:05:52 -0700 (PDT)") References: Message-ID: <87h8xyg0ts.fsf@gmail.com> monica.snow1 at gmail.com writes: > What would be some questions and answers so I gain a strong > understanding of my candidate that has Python experience? In addition to the resources others have pointed you at, it's worth mentioning that it can be very hard to gauge experience past the basics if you don't have it yourself. If you currently have python developers at your company, it's worth considering having one of them sit in on the interview or review the candidate with you -- you'll be able to get a more accurate read on the experience level of the candidate, and I'd imagine it'd be a help at getting familiar with the types of things that you can be looking for that indicate a good hire. 
From soyeomul at doraji.xyz Wed Jul 26 21:03:29 2017
From: soyeomul at doraji.xyz (Byung-Hee HWANG (황병희, 黃炳熙))
Date: Thu, 27 Jul 2017 10:03:29 +0900
Subject: python in chromebook
Message-ID:

my computer is chromebook. how can i install python in chromebook?
barely i did meet develop mode of chromebook. also i'm new to
python.

INDEED, i want to make python code on my chromebook.

thanks in advance!!!

From soyeomul at yw.doraji.xyz Wed Jul 26 22:04:06 2017
From: soyeomul at yw.doraji.xyz (Byung-Hee HWANG (황병희, 黃炳熙))
Date: Thu, 27 Jul 2017 11:04:06 +0900
Subject: python install in chromebook
Message-ID:

firstly i did fail to send message via mailing list so i try again with
usenet here comp.lang.python.

i want to install python in chromebook because i have chromebook and i
want to make code of python. somebody could help me, i believe.

thanks in advance...

From robertvstepp at gmail.com Wed Jul 26 22:35:22 2017
From: robertvstepp at gmail.com (boB Stepp)
Date: Wed, 26 Jul 2017 21:35:22 -0500
Subject: python in chromebook
In-Reply-To:
References:
Message-ID:

On Wed, Jul 26, 2017 at 8:03 PM, Byung-Hee HWANG (황병희, 黃炳熙) wrote:
> my computer is chromebook. how can i install python in chromebook?
> barely i did meet develop mode of chromebook. also i'm new to
> python.
>
> INDEED, i want to make python code on my chromebook.
>

Googling for "python on chromebook" tends to bring up Python 2-slanted
info; searching for "python 3 on chromebook" for Python 3-slanted
results should give you useful info. A quick scan suggests there are
two approaches: (1) Going into developer mode on your Chromebook and
installing Python normally, or (2) going to the app store and
installing a Python shell that runs in your browser. Myself, I would
prefer (1), but I have never had a Chromebook, so that may not fit in
with what you wish to do.

HTH!
--
boB

From soyeomul at doraji.xyz Thu Jul 27 00:10:06 2017
From: soyeomul at doraji.xyz (Byung-Hee HWANG (황병희, 黃炳熙))
Date: Thu, 27 Jul 2017 13:10:06 +0900
Subject: python in chromebook
References:
Message-ID:

boB Stepp wrote:

> Googling for "python on chromebook" tends to bring up Python 2-slanted
> info; searching for "python 3 on chromebook" for Python 3-slanted
> results should give you useful info. A quick scan suggests there are
> two approaches: (1) Going into developer mode on your Chromebook and
> installing Python normally, or (2) going to the app store and
> installing a Python shell that runs in your browser. Myself, I would
> prefer (1), but I have never had a Chromebook, so that may not fit in
> with what you wish to do.

OK, i will try it, thanks!!!

From breamoreboy at gmail.com Thu Jul 27 01:12:50 2017
From: breamoreboy at gmail.com (breamoreboy at gmail.com)
Date: Wed, 26 Jul 2017 22:12:50 -0700 (PDT)
Subject: Recent Spam problem
In-Reply-To:
References: <73bf4184-4b2c-0c48-3394-48ed7c83b513@timgolden.me.uk>
Message-ID:

On Wednesday, July 26, 2017 at 8:29:07 AM UTC+1, Tim Golden wrote:
> On 25/07/2017 06:13, Rustom Mody wrote:
> > Of late there has been an explosion of spam
> > Thought it was only a google-groups (USENET?) issue and would be barred from the mailing list.
> >
> > But then find its there in the mailing list archives as well
> > Typical example: https://mail.python.org/pipermail/python-list/2017-July/724085.html
> >
> > What gives??
>
> I almost never look at the GG mirror (or Usenet) so it wasn't until the
> post which started this thread that I realised just how much spam is
> being thrown at the newsgroup.
>

Hence why I asked a couple of weeks back why we don't just bin the existing
group and start afresh with a new, properly moderated group. I suggest a
really innovative name like python-users :)

Kindest regards.

Mark Lawrence.
From no.email at nospam.invalid Thu Jul 27 01:38:25 2017 From: no.email at nospam.invalid (Paul Rubin) Date: Wed, 26 Jul 2017 22:38:25 -0700 Subject: OT was Re: Python 3 removes name binding from outer scope References: <85379l5m3h.fsf@benfinney.id.au> <5976AE89.1090701@stoneleaf.us> <85tw213u05.fsf@benfinney.id.au> <19afd3e8-bccf-4439-817b-31428e7ef59c@googlegroups.com> <59770242$0$2878$c3e8da3$76491128@news.astraweb.com> Message-ID: <874ltyh226.fsf@nightsong.com> Chris Angelico writes: >>> Bipartisan-US-Bill-Moves-to-Criminalize-BDS-Support-20170720-0001.html >> Heh, at first I read that as a bill to criminalise BSD support :-) > I spluttered my drink on reading that. Good job Steven! https://en.wikipedia.org/wiki/BDS_C From greg.ewing at canterbury.ac.nz Thu Jul 27 02:31:06 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Thu, 27 Jul 2017 18:31:06 +1200 Subject: Basic python understanding In-Reply-To: References: <87h8xyg0ts.fsf@gmail.com> Message-ID: I'd like to add that what you should really be looking for is not a Python programmer as such, but simply a good, competent programmer. Any decent programmer will be able to quickly pick up what they need to know about Python on the job. If they can't, then they're not good enough, and you shouldn't hire them. The same goes for any of the other technology buzzwords on your list. Jeremiah Dodds wrote: > it's worth > mentioning that it can be very hard to gauge experience past the basics > if you don't have it yourself. More than that, I'd say it's impossible. So I second the recommendation to involve one of your existing experienced programmers in the interviewing. It will be difficult even for them to judge the candidate's competence, but at least they'll have a chance. 
-- Greg From greg.ewing at canterbury.ac.nz Thu Jul 27 02:52:33 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Thu, 27 Jul 2017 18:52:33 +1200 Subject: Recent Spam problem In-Reply-To: References: <73bf4184-4b2c-0c48-3394-48ed7c83b513@timgolden.me.uk> Message-ID: breamoreboy at gmail.com wrote: > Hence why I asked a couple of weeks back why we don't just bin the existing > group and start afresh with a new, properly moderated group. Someone would need to volunteer to be the moderator. Also, moderation is something of a consenting-adults thing on usenet. It's not hard for a sufficently knowledgeable person to bypass the moderation. (I'm told there's a security-related usenet group that takes advantage of this. It's a moderated group with no moderator; if you know enough to fake the approval, you're considered entitled to post. :-) -- Greg From sonnichs at gmail.com Thu Jul 27 07:40:40 2017 From: sonnichs at gmail.com (FS) Date: Thu, 27 Jul 2017 04:40:40 -0700 (PDT) Subject: Installing matplotlib on python3 Message-ID: <1813406c-e2bc-4efd-b8ba-eeec5a3666bb@googlegroups.com> I just installed matplotlib on debian and I tried to import it on python3. It cannot be found however it can be found on python 2.x. No surprise: A 'find -name matplotliib' reveals: /usr/share/matplotlib /usr/lib/python2.7/dist-packages/matplotlib I am not sure how the apt-get elected to place matplotlib in the python2.7 directory but I want to "properly" install it so it can import under python3. >There are probably commands from python3 to point its import to the 2.7 directory, but I expect that is just a workaround and I am uneasy about whether I have somehow installed a 2.7 compatible version only of matplotlib. >Possibly there is some recommended way to re-install matplotlib for python3 sending it to the appropriate directories. Any advice here? 
thanks fritz From jussi.piitulainen at helsinki.fi Thu Jul 27 07:48:06 2017 From: jussi.piitulainen at helsinki.fi (Jussi Piitulainen) Date: Thu, 27 Jul 2017 14:48:06 +0300 Subject: Installing matplotlib on python3 References: <1813406c-e2bc-4efd-b8ba-eeec5a3666bb@googlegroups.com> Message-ID: FS writes: > I just installed matplotlib on debian and I tried to import it on > python3. It cannot be found however it can be found on python 2.x. No > surprise: > A 'find -name matplotliib' reveals: > /usr/share/matplotlib > /usr/lib/python2.7/dist-packages/matplotlib > > I am not sure how the apt-get elected to place matplotlib in the > python2.7 directory but I want to "properly" install it so it can > import under python3. > >>There are probably commands from python3 to point its import to the >> 2.7 directory, but I expect that is just a workaround and I am >> uneasy about whether I have somehow installed a 2.7 compatible >> version only of matplotlib. >>Possibly there is some recommended way to re-install matplotlib for >>python3 sending it to the appropriate directories. > > Any advice here? 

You may find it's a different package, one named python3-* instead of
python-*:

$ apt-cache search matplotlib
idl - Interactive Data Language IDL
python-matplotlib - Python based plotting system in a style similar to Matlab
python-matplotlib-data - Python based plotting system (data package)
python-matplotlib-dbg - Python based plotting system (debug extension)
python-matplotlib-doc - Python based plotting system (documentation package)
python-mpltoolkits.basemap - matplotlib toolkit to plot on map projections
python-mpltoolkits.basemap-data - matplotlib toolkit to plot on map projections (data package)
python-mpltoolkits.basemap-doc - matplotlib toolkit to plot on map projections (documentation)
python-mpmath - library for arbitrary-precision floating-point arithmetic
python-mpmath-doc - library for arbitrary-precision floating-point arithmetic - Documentation
python-scitools - Python library for scientific computing
python-wxmpl - Painless matplotlib embedding in wxPython
python3-matplotlib - Python based plotting system in a style similar to Matlab (Python 3)
python3-matplotlib-dbg - Python based plotting system (debug extension, Python 3)
python3-mpmath - library for arbitrary-precision floating-point arithmetic (Python3)
$

From alister.ware at ntlworld.com Thu Jul 27 07:48:58 2017
From: alister.ware at ntlworld.com (alister)
Date: Thu, 27 Jul 2017 11:48:58 GMT
Subject: Installing matplotlib on python3
References: <1813406c-e2bc-4efd-b8ba-eeec5a3666bb@googlegroups.com>
Message-ID:

On Thu, 27 Jul 2017 04:40:40 -0700, FS wrote:

> I just installed matplotlib on debian and I tried to import it on
> python3. It cannot be found however it can be found on python 2.x. No
> surprise:
> A 'find -name matplotliib' reveals:
> /usr/share/matplotlib /usr/lib/python2.7/dist-packages/matplotlib
>
> I am not sure how the apt-get elected to place matplotlib in the
> python2.7 directory but I want to "properly" install it so it can import
> under python3.
> >>There are probably commands from python3 to point its import to the 2.7 >>directory, but I expect that is just a workaround and I am uneasy about >>whether I have somehow installed a 2.7 compatible version only of >>matplotlib. >>Possibly there is some recommended way to re-install matplotlib for >>python3 sending it to the appropriate directories. > > Any advice here? > thanks fritz yes please do not top post -- My face is new, my license is expired, and I'm under a doctor's care!!!! From darcy at VybeNetworks.com Thu Jul 27 08:24:16 2017 From: darcy at VybeNetworks.com (D'Arcy Cain) Date: Thu, 27 Jul 2017 08:24:16 -0400 Subject: Basic python understanding In-Reply-To: References: <87h8xyg0ts.fsf@gmail.com> Message-ID: <0eebb819-9ab3-0a21-bac0-5498c598506e@VybeNetworks.com> On 07/27/2017 02:31 AM, Gregory Ewing wrote: > I'd like to add that what you should really be looking for is > not a Python programmer as such, but simply a good, competent > programmer. > > Any decent programmer will be able to quickly pick up what > they need to know about Python on the job. If they can't, > then they're not good enough, and you shouldn't hire them. I'll second that. I once had to build a team of Python developers for a major project. The pool of actual Python programmers was small so we just advertised for programmers. In the interviews we used a test that used C to determine their problem solving skills. We also looked for new grads so that they didn't have to un-learn a bunch of stuff. We wound up with an amazing team that managed to build the project in record time. Lesson: Look for programmers, not Python (or Perl or C or C++ or Java or...) programmers. -- D'Arcy J.M. Cain Vybe Networks Inc. 
http://www.VybeNetworks.com/ IM:darcy at Vex.Net VoIP: sip:darcy at VybeNetworks.com From rhodri at kynesim.co.uk Thu Jul 27 09:34:16 2017 From: rhodri at kynesim.co.uk (Rhodri James) Date: Thu, 27 Jul 2017 14:34:16 +0100 Subject: Basic python understanding In-Reply-To: <0eebb819-9ab3-0a21-bac0-5498c598506e@VybeNetworks.com> References: <87h8xyg0ts.fsf@gmail.com> <0eebb819-9ab3-0a21-bac0-5498c598506e@VybeNetworks.com> Message-ID: <8ffe843a-6b00-c011-87fa-58d213a7325e@kynesim.co.uk> On 27/07/17 13:24, D'Arcy Cain wrote: > On 07/27/2017 02:31 AM, Gregory Ewing wrote: >> I'd like to add that what you should really be looking for is >> not a Python programmer as such, but simply a good, competent >> programmer.
>> >> Any decent programmer will be able to quickly pick up what >> they need to know about Python on the job. If they can't, >> then they're not good enough, and you shouldn't hire them. > > I'll second that. I once had to build a team of Python developers for a > major project. The pool of actual Python programmers was small so we > just advertised for programmers. In the interviews we used a test that > used C to determine their problem solving skills. We also looked for > new grads so that they didn't have to un-learn a bunch of stuff. We > wound up with an amazing team that managed to build the project in > record time. > > Lesson: Look for programmers, not Python (or Perl or C or C++ or Java > or...) programmers. This isn't universally true, I'm afraid. A friend of mine who is a very good C/assembler programmer simply cannot get his head around Python's mindset. If you want bullet-proof Flash programming code, he's your man. If you want Python-based unit tests for it, don't ask him. -- Rhodri James *-* Kynesim Ltd From sonnichs at gmail.com Thu Jul 27 09:44:57 2017 From: sonnichs at gmail.com (FS) Date: Thu, 27 Jul 2017 06:44:57 -0700 (PDT) Subject: Installing matplotlib on python3 In-Reply-To: <1813406c-e2bc-4efd-b8ba-eeec5a3666bb@googlegroups.com> References: <1813406c-e2bc-4efd-b8ba-eeec5a3666bb@googlegroups.com> Message-ID: <6e7c7e04-78c2-4813-8261-af73acc01e85@googlegroups.com> Thank you Jussi. I didn't realize there was a separate version--I have it installed now cheers fritz From grant.b.edwards at gmail.com Thu Jul 27 09:59:21 2017 From: grant.b.edwards at gmail.com (Grant Edwards) Date: Thu, 27 Jul 2017 13:59:21 +0000 (UTC) Subject: Basic python understanding References: <87h8xyg0ts.fsf@gmail.com> Message-ID: On 2017-07-27, Gregory Ewing wrote: > I'd like to add that what you should really be looking for is not a > Python programmer as such, but simply a good, competent programmer. 
> > Any decent programmer will be able to quickly pick up what they need > to know about Python on the job. In a matter of a week or two if they're at all competent. Even if they don't know Python, if they know _variety_ of other languages (scheme/lisp, Smalltalk, Java, C, FORTRAN, assembly) they'll be able to pick up Python quickly. If they have only ever used a single language, that may be a warning sign. -- Grant Edwards grant.b.edwards Yow! The SAME WAVE keeps at coming in and COLLAPSING gmail.com like a rayon MUU-MUU ... From darcy at VybeNetworks.com Thu Jul 27 10:21:36 2017 From: darcy at VybeNetworks.com (D'Arcy Cain) Date: Thu, 27 Jul 2017 10:21:36 -0400 Subject: Basic python understanding In-Reply-To: <8ffe843a-6b00-c011-87fa-58d213a7325e@kynesim.co.uk> References: <87h8xyg0ts.fsf@gmail.com> <0eebb819-9ab3-0a21-bac0-5498c598506e@VybeNetworks.com> <8ffe843a-6b00-c011-87fa-58d213a7325e@kynesim.co.uk> Message-ID: On 07/27/2017 09:34 AM, Rhodri James wrote: > On 27/07/17 13:24, D'Arcy Cain wrote: >> Lesson: Look for programmers, not Python (or Perl or C or C++ or Java >> or...) programmers. > > This isn't universally true, I'm afraid. A friend of mine who is a very > good C/assembler programmer simply cannot get his head around Python's > mindset. If you want bullet-proof Flash programming code, he's your > man. If you want Python-based unit tests for it, don't ask him. As I said, look for programmers, not programmers. -- D'Arcy J.M. Cain Vybe Networks Inc. http://www.VybeNetworks.com/ IM:darcy at Vex.Net VoIP: sip:darcy at VybeNetworks.com From darcy at VybeNetworks.com Thu Jul 27 10:22:53 2017 From: darcy at VybeNetworks.com (D'Arcy Cain) Date: Thu, 27 Jul 2017 10:22:53 -0400 Subject: Basic python understanding In-Reply-To: References: <87h8xyg0ts.fsf@gmail.com> Message-ID: On 07/27/2017 09:59 AM, Grant Edwards wrote: > If they have only ever used a single language, that may be a warning > sign. Or if they list every language that they have ever smelled. 
-- D'Arcy J.M. Cain Vybe Networks Inc. http://www.VybeNetworks.com/ IM:darcy at Vex.Net VoIP: sip:darcy at VybeNetworks.com

From ganesh1pal at gmail.com Thu Jul 27 10:33:14 2017
From: ganesh1pal at gmail.com (Ganesh Pal)
Date: Thu, 27 Jul 2017 20:03:14 +0530
Subject: unpacking elements in python - any tips u want to share ?
Message-ID:

Hello Python friends,

I need some input on efficient ways to unpack elements in Python. I know this is a very basic question; I am just curious to know whether there are better ways to achieve it.

For our initial discussion let's start with a list. I have a list with, say, 7 elements. If I need to unpack the first 3 elements of the list and pass them as arguments to a new function, here is my elementary code:

>>> var1 = ""
>>> var2 = " "
>>> var3 = " "
>>> var4 = ""
>>> var5 = ""
>>> var6 = ""
>>> var7 = ""
>>> my_list = []
>>> my_list.append(1)
>>> my_list.append(0xffe)
>>> my_list.append(2)
>>> my_list.append('4th element')
>>> my_list.append('5th element')
>>> my_list.append(2)
>>> my_list.append(0xffe)
>>> my_list
[1, 4094, 2, '4th element', '5th element', 2, 4094]
>>> if len(my_list) == 7:
...     var1, var2, var3, var4, var5, var6, var7 = my_list
...     print var1, var2, var3, var4, var5, var6, var7
...     var8 = get_eighth_element(var1, int(var2), int(var3))
...     my_list.append(var8)
...     print my_list
...
1 4094 2 4th element 5th element 2 4094
1 4094 2 4th element 5th element 2 4094 01

In the case of a list, I can use slices too, to unpack the elements I am interested in. Example: say I need to compare the second and second-last elements in the list; here is the simple code using slices:

>>> my_list
1 4094 2 4th element 5th element 2 4094 01
>>> my_list[-2]
4094
>>> my_list[1]
4094
>>> var1 = my_list[-2]
>>> var2 = my_list[1]
>>> if len(my_list) == 8:
...     if my_list[-2] == my_list[1]:
...         print "Test Passed"
...     else:
...         print "Test Failed"
...
Test Passed

What other ways do I have (if at all)
to unpack the elements in python Regards, Ganesh From lists at zopyx.com Thu Jul 27 14:55:40 2017 From: lists at zopyx.com (Andreas Jung) Date: Thu, 27 Jul 2017 20:55:40 +0200 Subject: Installation Python 3.6.x on Windows using command line installer (without GUI) Message-ID: I need to installed Python 3.6.x on Windows as part of an automated process without user-interaction. Recently Python releases provided MSI files for installation using the "msiexec" utility however there are no more MSI release files available for Python 3.6.X. Are there any alternatives? -aj -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 618 bytes Desc: OpenPGP digital signature URL: From best_lay at yahoo.com Thu Jul 27 21:21:03 2017 From: best_lay at yahoo.com (Wildman) Date: Thu, 27 Jul 2017 20:21:03 -0500 Subject: Recent Spam problem References: <871sp5hx6g.fsf@nightsong.com> Message-ID: On Tue, 25 Jul 2017 21:44:22 +0000, Grant Edwards wrote: > On 2017-07-25, Wildman via Python-list wrote: > >> The posts are being made through Google Groups. Forwarding >> the posts with headers to groups-abuse at google.com might help. > > I never has in the past. I (and many others) have for years and years > been plonking all posts made through Google Groups. Trust me, you'll > not miss out on anything worthwile. :) > >> I have sent a couple but if everyone here did it maybe Google will >> pay attention and do something. > > They won't. > > Just configure your .score file (or bogofilter or spamassassin or > whatever) to throw out all posts that have a Message-ID: header field > that ends in 'googlegroups.com'. That, grashopper, is the path to > serenity. In the past I never used a 'kill file' so I didn't consider it. However, I took your advice and created the score file and I will say the path to serenity is sweet. Thank you. -- GNU/Linux user #557453 "SERENITY NOW! SERENITY NOW!" 
-Frank Costanza

From steve+python at pearwood.info Thu Jul 27 22:15:20 2017
From: steve+python at pearwood.info (Steve D'Aprano)
Date: Fri, 28 Jul 2017 12:15:20 +1000
Subject: Falsey Enums
Message-ID: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com>

I has some Enums:

from enum import Enum
class X(Enum):
    Falsey = 0
    Truthy = 1
    Fakey = 2

and I want bool(X.Falsey) to be False, and the others to be True. What should I do?

--
Steve
"Cheer up," they said, "things could be worse." So I cheered up, and sure enough, things got worse.

From rustompmody at gmail.com Thu Jul 27 22:35:06 2017
From: rustompmody at gmail.com (Rustom Mody)
Date: Thu, 27 Jul 2017 19:35:06 -0700 (PDT)
Subject: Falsey Enums
In-Reply-To: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com>
References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <3a755c9f-863d-43ac-b873-cdca8ae8477c@googlegroups.com>

Isn't dunder-bool what you want? (dunder-nonzero in python2)
Dunno if special caveats for Enums

PS sorry for phone-post -- I've broken my leg

From dan at tombstonezero.net Thu Jul 27 22:42:26 2017
From: dan at tombstonezero.net (Dan Sommers)
Date: Fri, 28 Jul 2017 02:42:26 +0000 (UTC)
Subject: Falsey Enums
References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On Fri, 28 Jul 2017 12:15:20 +1000, Steve D'Aprano wrote:

> I has some Enums:
>
> from enum import Enum
> class X(Enum):
>     Falsey = 0
>     Truthy = 1
>     Fakey = 2
>
>
> and I want bool(X.Falsey) to be False, and the others to be True. What should I
> do?

Add the following to your enum:

    def __bool__(self):
        return False if self == X.Falsey else True

But something tells me that you would know that there exists such a simple solution. What am I missing?
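
For reference, the `__bool__` suggestion above can be put together as a complete, runnable sketch. It assumes Python 3.4 or later (where `enum` is in the standard library) and reuses the `X` class from the question; the identity comparison is just one way of writing the test, not the only one.

```python
from enum import Enum


class X(Enum):
    Falsey = 0
    Truthy = 1
    Fakey = 2

    def __bool__(self):
        # Only the Falsey member is treated as false; every other
        # member keeps the default "truthy" behaviour.
        return self is not X.Falsey


# Show the truth value of each member in definition order.
print([bool(member) for member in X])  # [False, True, True]
```

On Python 2 with a backported enum, the method would have to be called `__nonzero__` instead.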
From no.email at nospam.invalid Fri Jul 28 03:06:49 2017 From: no.email at nospam.invalid (Paul Rubin) Date: Fri, 28 Jul 2017 00:06:49 -0700 Subject: Falsey Enums References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> Message-ID: <87shhhf3au.fsf@nightsong.com> Dan Sommers writes: > def __bool__(self): > return False if self == X.Falsey else True return self != X.Falsey From ethan at stoneleaf.us Fri Jul 28 03:52:23 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 28 Jul 2017 00:52:23 -0700 Subject: Falsey Enums In-Reply-To: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> Message-ID: <597AED37.1080900@stoneleaf.us> On 07/27/2017 07:15 PM, Steve D'Aprano wrote: > I has some Enums: > > from enum import Enum > class X(Enum): > Falsey = 0 > Truthy = 1 > Fakey = 2 > > > and I want bool(X.Falsey) to be False, and the others to be True. What should I > do? class X(Enum): Falsey = 0 Truthy = 1 Fakey = 2 def __bool__(self): return bool(self.value) -- ~Ethan~ From ben+python at benfinney.id.au Fri Jul 28 04:13:57 2017 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 28 Jul 2017 18:13:57 +1000 Subject: Falsey Enums References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> <597AED37.1080900@stoneleaf.us> Message-ID: <85vamd2d2y.fsf@benfinney.id.au> Ethan Furman writes: > class X(Enum): > Falsey = 0 > Truthy = 1 > Fakey = 2 > def __bool__(self): > return bool(self.value) I am surprised this is not already the behaviour of an Enum class, without overriding the ?__bool__? method. What would be a good reason not to have this behaviour by default for ?Enum.__bool__?? (i.e. if this were reported as a bug on the ?enum.Enum? implementation, what would be good reasons not to fix it?) -- \ ?As scarce as truth is, the supply has always been in excess of | `\ the demand.? 
?Josh Billings | _o__) | Ben Finney From rustompmody at gmail.com Fri Jul 28 05:00:03 2017 From: rustompmody at gmail.com (Rustom Mody) Date: Fri, 28 Jul 2017 02:00:03 -0700 (PDT) Subject: Falsey Enums In-Reply-To: References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> <597AED37.1080900@stoneleaf.us> <85vamd2d2y.fsf@benfinney.id.au> Message-ID: <6d96d768-7259-42d0-b26e-7bb462a6fe7b@googlegroups.com> On Friday, July 28, 2017 at 1:45:46 PM UTC+5:30, Ben Finney wrote: > Ethan Furman writes: > > > class X(Enum): > > Falsey = 0 > > Truthy = 1 > > Fakey = 2 > > def __bool__(self): > > return bool(self.value) > > I am surprised this is not already the behaviour of an Enum class, > without overriding the ?__bool__? method. > > What would be a good reason not to have this behaviour by default for > ?Enum.__bool__?? (i.e. if this were reported as a bug on the ?enum.Enum? > implementation, what would be good reasons not to fix it?) Enums are for abstracting away from ints (typically small) to more meaningful names. In python's terms that means whether X.Truthy should mean 0 ? the value ? or "Truthy" ? the name ? is intentionally left ambiguous/undecided. Observe: >>> print (X.Truthy) X.Truthy # So Truthy is well Truthy >>> X.Truthy # No! Truthy is 1 # In other words >>> repr(X.Truthy) '' >>> str(X.Truthy) 'X.Truthy' >>> From steve+python at pearwood.info Fri Jul 28 06:28:29 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Fri, 28 Jul 2017 20:28:29 +1000 Subject: Falsey Enums References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> <597AED37.1080900@stoneleaf.us> Message-ID: <597b11cf$0$1591$c3e8da3$5496439d@news.astraweb.com> On Fri, 28 Jul 2017 05:52 pm, Ethan Furman wrote: > class X(Enum): > Falsey = 0 > Truthy = 1 > Fakey = 2 > def __bool__(self): > return bool(self.value) Thanks Ethan. Like Ben, I'm surprised that's not the default behaviour. -- Steve ?Cheer up,? they said, ?things could be worse.? 
So I cheered up, and sure enough, things got worse. From rosuav at gmail.com Fri Jul 28 07:14:41 2017 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 28 Jul 2017 21:14:41 +1000 Subject: Falsey Enums In-Reply-To: <597b11cf$0$1591$c3e8da3$5496439d@news.astraweb.com> References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> <597AED37.1080900@stoneleaf.us> <597b11cf$0$1591$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Fri, Jul 28, 2017 at 8:28 PM, Steve D'Aprano wrote: > On Fri, 28 Jul 2017 05:52 pm, Ethan Furman wrote: > >> class X(Enum): >> Falsey = 0 >> Truthy = 1 >> Fakey = 2 >> def __bool__(self): >> return bool(self.value) > > Thanks Ethan. > > Like Ben, I'm surprised that's not the default behaviour. Because members of an Enum are considered to be "things". If you want them to behave more like integers, instead subclass IntEnum: >>> class Y(IntEnum): ... Falsey = 0 ... Truthy = 1 ... File_Not_Found = 2 ... >>> Y.Falsey >>> bool(Y.Falsey) False >>> bool(Y.Truthy) True Among other differences, this means that zero is considered falsey, and that the enumerated variables compare equal to the corresponding integers. ChrisA From nomail at com.invalid Fri Jul 28 08:45:41 2017 From: nomail at com.invalid (ast) Date: Fri, 28 Jul 2017 14:45:41 +0200 Subject: Getting a dictionnary from a module's variables Message-ID: <597b31fb$0$4823$426a34cc@news.free.fr> Hello I have a file conf.py which only contains some variables definition like that: a = 7 b = 9 c = 3 In my main program I would like to get a dictionnary dico = {'a' :7,'b':9, 'c':3} I tried: import conf dico = vars(conf) but there is among a huge amount of stuff to remove dir(config) provides a list, some processing is needed to remove some __identifiers__ and get a dict Is there a simple way to do that, without processing ? 
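
A minimal sketch of one way to answer the question above: filter the module namespace with a dict comprehension. Because `conf.py` is not available here, the example builds a stand-in module object with `types.ModuleType`; that stand-in is purely an assumption for illustration, and with a real file one would simply `import conf`.

```python
import types

# Stand-in for "import conf": a module object holding the same three
# variables as the conf.py described in the question (illustration only).
conf = types.ModuleType("conf")
exec("a = 7\nb = 9\nc = 3", conf.__dict__)

# Keep only the names that do not look like module housekeeping
# (__name__, __doc__, __builtins__, ...).
dico = {name: value
        for name, value in vars(conf).items()
        if not name.startswith("__")}

print(dico)  # {'a': 7, 'b': 9, 'c': 3}
```

Some filtering step seems hard to avoid, since `vars()` returns the whole module namespace, including the double-underscore attributes that the import system adds.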
From nomail at com.invalid Fri Jul 28 09:10:58 2017
From: nomail at com.invalid (ast)
Date: Fri, 28 Jul 2017 15:10:58 +0200
Subject: Getting a dictionnary from a module's variables
In-Reply-To: <597b31fb$0$4823$426a34cc@news.free.fr>
References: <597b31fb$0$4823$426a34cc@news.free.fr>
Message-ID: <597b37eb$0$3624$426a34cc@news.free.fr>

"ast" wrote in message news:597b31fb$0$4823$426a34cc at news.free.fr...

I'll answer my own question:

import conf
dico = {k: v for k, v in vars(conf).items() if not (k.startswith('__') or k.endswith('__'))}

That's not so difficult.

From skip.montanaro at gmail.com Fri Jul 28 09:19:27 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Fri, 28 Jul 2017 08:19:27 -0500
Subject: Recent Spam problem
In-Reply-To:
References:
Message-ID:

Yes, it's more "leaky," though that's not quite the term I'd use. Instead, I'd say there are fewer checks. On the mailing list side of things, you have all the Postfix bells and whistles, which stop a ton of crap, much of it before the message is officially entered into the mail.python.org system. Behind that is a SpamBayes instance to pick up any loose ends.

The Usenet gateway feeds into the system behind everything except the SpamBayes instance. It gets only sporadic attention from me. If I'm not paying attention, stuff which starts to "leak" through doesn't get trained as spam, so the filter cannot learn to stop later versions of the same crap.

One thing which never got produced was an easy way for a list moderator to say, "Hey, this got through and it's spam." Sorting through "unsure" messages and retraining automatically using some Mailman/SpamBayes conduit would be a nice addition to the overall system. If you wanted to write software, that's where I'd focus my efforts.
Skip From ethan at stoneleaf.us Fri Jul 28 10:24:24 2017 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 28 Jul 2017 07:24:24 -0700 Subject: Falsey Enums In-Reply-To: <85vamd2d2y.fsf@benfinney.id.au> References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> <597AED37.1080900@stoneleaf.us> <85vamd2d2y.fsf@benfinney.id.au> Message-ID: <597B4918.6070303@stoneleaf.us> On 07/28/2017 01:13 AM, Ben Finney wrote: > Ethan Furman writes: >> class X(Enum): >> Falsey = 0 >> Truthy = 1 >> Fakey = 2 >> def __bool__(self): >> return bool(self.value) > > I am surprised this is not already the behaviour of an Enum class, > without overriding the ?__bool__? method. > > What would be a good reason not to have this behaviour by default for > ?Enum.__bool__?? (i.e. if this were reported as a bug on the ?enum.Enum? > implementation, what would be good reasons not to fix it?) It matches the docs. ;) https://docs.python.org/3/library/enum.html#boolean-value-of-enum-classes-and-members Enum members that are mixed with non-Enum types (such as int, str, etc.) are evaluated according to the mixed-in type?s rules; otherwise, all members evaluate as True. To make your own Enum?s boolean evaluation depend on the member?s value add the following to your class: Enum classes always evaluate as True. The rationale is in PEP 435, in the functional API section. The reason for defaulting to 1 as the starting number and not 0 is that 0 is False in a boolean sense, but enum members all evaluate to True . Which still begs the question of: why? According to memory, and as Rustom guessed, an Enum member is a thing, and all things in Python default to being True. If `bool(thing) == False` is a desirable characteristic then extra steps must be taken. Either: - subclass a type that already has that characteristic, such as int or float, or - add your own __bool__ method (__nonzero__ in Python 2) Looked at another way: The value attribute is just that -- an attribute. 
Depending on the use-case that attribute can be irrelevant (which is why we have auto(), etc.), so by default Enum does not use the .value attribute in figuring out if a member is something vs nothing (or truthy vs falsey). -- ~Ethan~ From jobmattcon at gmail.com Fri Jul 28 11:03:44 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Fri, 28 Jul 2017 08:03:44 -0700 (PDT) Subject: how to group by function if one of the group has relationship with another one in the group? In-Reply-To: References: Message-ID: <43044148-8054-4932-afad-d4ace64267ab@googlegroups.com> actually i used in this application if same color is neighbor like connected then group them i use for segmentation of words in screen capture https://stackoverflow.com/questions/45294829/how-to-group-by-function-if-any-one-of-the-group-members-has-neighbor-relationsh i asked here too, but i do not know how to use partial and do not know what center is. On Tuesday, July 25, 2017 at 5:00:25 PM UTC+8, Peter Otten wrote: > Ho Yeung Lee wrote: > > > from itertools import groupby > > > > testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)] > > def isneighborlocation(lo1, lo2): > > if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]: > > return 1 > > elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]: > > return 1 > > else: > > return 0 > > > > groupda = groupby(testing1, isneighborlocation) > > for key, group1 in groupda: > > print key > > for thing in group1: > > print thing > > > > expect output 3 group > > group1 [(1,1)] > > group2 [(2,3),(2,4] > > group3 [(3,5),(3,6),(4,6)] > > groupby() calculates the key value from the current item only, so there's no > "natural" way to apply it to your problem. > > Possible workarounds are to feed it pairs of neighbouring items (think > zip()) or a stateful key function. 
Below is an example of the latter: > > $ cat sequential_group_class.py > from itertools import groupby > > missing = object() > > class PairKey: > def __init__(self, continued): > self.prev = missing > self.continued = continued > self.key = False > > def __call__(self, item): > if self.prev is not missing and not self.continued(self.prev, item): > self.key = not self.key > self.prev = item > return self.key > > def isneighborlocation(lo1, lo2): > x1, y1 = lo1 > x2, y2 = lo2 > dx = x1 - x2 > dy = y1 - y2 > return dx*dx + dy*dy <= 1 > > items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)] > > for key, group in groupby(items, key=PairKey(isneighborlocation)): > print key, list(group) > > $ python sequential_group_class.py > False [(1, 1)] > True [(2, 3), (2, 4)] > False [(3, 5), (3, 6), (4, 6)] From jobmattcon at gmail.com Fri Jul 28 11:04:57 2017 From: jobmattcon at gmail.com (Ho Yeung Lee) Date: Fri, 28 Jul 2017 08:04:57 -0700 (PDT) Subject: how to group by function if one of the group has relationship with another one in the group? In-Reply-To: <59783134$0$3605$426a74cc@news.free.fr> References: <59783134$0$3605$426a74cc@news.free.fr> Message-ID: actually i used in this application if same color is neighbor like connected then group them i use for segmentation of words in screen capture https://stackoverflow.com/questions/45294829/how-to-group-by-function-if-any-one-of-the-group-members-has-neighbor-relationsh i asked here too, but i do not know how to use partial and do not know what center is. On Wednesday, July 26, 2017 at 2:06:08 PM UTC+8, ast wrote: > "Ho Yeung Lee" a ?crit dans le message de > news:ef0bd11a-bf55-42a2-b016-d93f3b831860 at googlegroups.com... 
> > from itertools import groupby > > > > testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)] > > def isneighborlocation(lo1, lo2): > > if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]: > > return 1 > > elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]: > > return 1 > > else: > > return 0 > > > > groupda = groupby(testing1, isneighborlocation) > > for key, group1 in groupda: > > print key > > for thing in group1: > > print thing > > > > expect output 3 group > > group1 [(1,1)] > > group2 [(2,3),(2,4] > > group3 [(3,5),(3,6),(4,6)] > > Its not clear to me how you build the groups > > Why (1,1) is not in group2 since (1,1) is > a neighbor to both (2,3) and (2,4) ? From irmen.NOSPAM at xs4all.nl Fri Jul 28 12:36:05 2017 From: irmen.NOSPAM at xs4all.nl (Irmen de Jong) Date: Fri, 28 Jul 2017 18:36:05 +0200 Subject: zipapp should not include temporary files? In-Reply-To: <9c536053-b7a8-4829-92f3-6099e84a726d@googlegroups.com> References: <5978d330$0$834$e4fe514c@news.xs4all.nl> <9c536053-b7a8-4829-92f3-6099e84a726d@googlegroups.com> Message-ID: <597b67f3$0$800$e4fe514c@news.xs4all.nl> On 27/07/2017 00:03, Paul Moore wrote: > On Wednesday, 26 July 2017 18:37:15 UTC+1, Irmen de Jong wrote: >> What do you think? Should the zipapp module perhaps be improved to automatically skip >> obvious temporary files or perhaps allow to provide a filter function? > If you want to create a feature request for a filter function on bugs.python.org and assign it to me, I'll take a look at it. \ I will do this, thanks in advance. Irmen From irmen.NOSPAM at xs4all.nl Fri Jul 28 12:37:06 2017 From: irmen.NOSPAM at xs4all.nl (Irmen de Jong) Date: Fri, 28 Jul 2017 18:37:06 +0200 Subject: Installation Python 3.6.x on Windows using command line installer (without GUI) In-Reply-To: References: Message-ID: <597b6831$0$800$e4fe514c@news.xs4all.nl> On 27/07/2017 20:55, Andreas Jung wrote: > > I need to installed Python 3.6.x on Windows as part of an automated process without user-interaction. 
Recently Python releases provided MSI files for installation using the "msiexec" utility however there are no more MSI release files available for Python 3.6.X. Are there any alternatives? > > -aj > https://docs.python.org/3/using/windows.html#installing-without-ui Irmen From irmen.NOSPAM at xs4all.nl Fri Jul 28 12:52:42 2017 From: irmen.NOSPAM at xs4all.nl (Irmen de Jong) Date: Fri, 28 Jul 2017 18:52:42 +0200 Subject: zipapp should not include temporary files? In-Reply-To: <597b67f3$0$800$e4fe514c@news.xs4all.nl> References: <5978d330$0$834$e4fe514c@news.xs4all.nl> <9c536053-b7a8-4829-92f3-6099e84a726d@googlegroups.com> <597b67f3$0$800$e4fe514c@news.xs4all.nl> Message-ID: <597b6bd9$0$752$e4fe514c@news.xs4all.nl> On 28/07/2017 18:36, Irmen de Jong wrote: > On 27/07/2017 00:03, Paul Moore wrote: >> If you want to create a feature request for a filter function on bugs.python.org and assign it to me, I'll take a look at it. \ > > > I will do this, thanks in advance. Should have included a link perhaps. Here it is: http://bugs.python.org/issue31072 Irmen From greg.ewing at canterbury.ac.nz Fri Jul 28 19:04:02 2017 From: greg.ewing at canterbury.ac.nz (Gregory Ewing) Date: Sat, 29 Jul 2017 11:04:02 +1200 Subject: Getting a dictionnary from a module's variables In-Reply-To: <597b31fb$0$4823$426a34cc@news.free.fr> References: <597b31fb$0$4823$426a34cc@news.free.fr> Message-ID: ast wrote: > dir(config) provides a list, some processing is needed to > remove some __identifiers__ and get a dict > > Is there a simple way to do that, without processing ? Here's an alternative that leverages the import machinery. 
d = {}
exec("from config import *", d)
del d['__builtins__']

-- Greg

From pavol.lisy at gmail.com Sat Jul 29 03:37:20 2017
From: pavol.lisy at gmail.com (Pavol Lisy)
Date: Sat, 29 Jul 2017 09:37:20 +0200
Subject: Falsey Enums
In-Reply-To: <597b11cf$0$1591$c3e8da3$5496439d@news.astraweb.com>
References: <597a9e39$0$1621$c3e8da3$5496439d@news.astraweb.com> <597AED37.1080900@stoneleaf.us> <597b11cf$0$1591$c3e8da3$5496439d@news.astraweb.com>
Message-ID:

On 7/28/17, Steve D'Aprano wrote:
> On Fri, 28 Jul 2017 05:52 pm, Ethan Furman wrote:
>
>> class X(Enum):
>>     Falsey = 0
>>     Truthy = 1
>>     Fakey = 2
>>     def __bool__(self):
>>         return bool(self.value)
>
> Thanks Ethan.

BTW bool at enum seems to be expensive:

%timeit 7 if x else 0
850 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit 7 if x.value else 0
479 ns ± 4.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit 7 if x!=X.Falsey else 0
213 ns ± 10.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

> Like Ben, I'm surprised that's not the default behaviour.

Me too. I was trying to find some example which is not completely semantically wrong and where backward compatibility is important. Maybe something like this? ->

class Color(Enum):
    black = 0  # this could equal to some external library constant value for black
    white = 0xff

color = user_choice()
if not color:  # color is None
    color = random.choice(list(Color))

From kryptxy at protonmail.com Sat Jul 29 04:34:20 2017
From: kryptxy at protonmail.com (Kryptxy)
Date: Sat, 29 Jul 2017 04:34:20 -0400
Subject: Need some advice please
Message-ID: <1lL1zq18y0kwNzVcvCHkN6esH-R3DP59hLVInGHeEPDsgGW9ZLQ-qbWoQSpKQxLV8XM7Df6cfCtiHohEG7Fm1TO_aIanqnVNHPwoz4LBIqw=@protonmail.com>

Hello,

I have developed a python program (tool) that fetches torrents from thepiratebay proxy sites and displays results in console/terminal window.
Here: github.com/kryptxy/torrench Now, since thepiratebay contains illegal content, I am restricted from packaging this tool for other distros (Eg. AUR). As of now, the website (https://proxybay.one/) (which consists of tpb proxy site list) is hard-coded in find_url.py file. Now to bypass any legalities, I am thinking of providing a configuration file, that consists of the website's url, along with an enabling switch. The configuration file will NOT be hosted on github (means it will not be provided with package). Instead, the user will have to download the config file. If config file is configured, tool will work. Else it would prompt to configure the file. Would it get me around legal issues, that is making this tool completely legal? I could really use sone advice here. Thank you. From ttopolewski at gmail.com Sat Jul 29 05:12:04 2017 From: ttopolewski at gmail.com (ttopolewski at gmail.com) Date: Sat, 29 Jul 2017 02:12:04 -0700 (PDT) Subject: Default logging as part of the language Message-ID: Hello, I'm wondering what do You think about some default logging that can become a part of the Python language and part of the development workflow. I would see it as develop in debug mode untill debug is removed intentionally. It would include: - standard logging location for new development - some incentive to remove logging For example new keyword defd - something that would: - act as regular def, but log own identification and input/output __repr__ to /var/log/{processname.processid} - send WARN to stdout if filename of function/method defd definition time is old(like 2y.o?) to make people to replace defd as regular def PS. I don't have example implementation and I have not tried such approach so I can't say by experience if it makes any sense. But still I think it can be at least something to consider. 
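
The logging half of the `defd` proposal above can probably be approximated today without a new keyword, using a function decorator. The sketch below is only an illustration under stated assumptions: the decorator name `logged` and the logger name `"defd"` are hypothetical, output goes through the standard `logging` module rather than to `/var/log/{processname.processid}`, and the "warn when the definition is old" part is not attempted.

```python
import functools
import logging

# Hypothetical logger name; handlers and destinations are left to the caller.
logger = logging.getLogger("defd")


def logged(func):
    """Hypothetical stand-in for the proposed `defd` keyword."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Log the function's identity and the repr of its inputs ...
        logger.debug("call %s args=%r kwargs=%r",
                     func.__qualname__, args, kwargs)
        result = func(*args, **kwargs)
        # ... and the repr of its output.
        logger.debug("return %s -> %r", func.__qualname__, result)
        return result
    return wrapper


@logged
def add(x, y):
    return x + y


logging.basicConfig(level=logging.DEBUG)
print(add(2, 3))  # 5, with two DEBUG records emitted via logging
```

Removing the logging later is then a matter of deleting the `@logged` line, which is roughly the "replace defd with def" step described in the proposal.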
From steve+python at pearwood.info Sat Jul 29 05:31:57 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sat, 29 Jul 2017 19:31:57 +1000 Subject: Need some advice please References: <1lL1zq18y0kwNzVcvCHkN6esH-R3DP59hLVInGHeEPDsgGW9ZLQ-qbWoQSpKQxLV8XM7Df6cfCtiHohEG7Fm1TO_aIanqnVNHPwoz4LBIqw=@protonmail.com> Message-ID: <597c560e$0$1618$c3e8da3$5496439d@news.astraweb.com> On Sat, 29 Jul 2017 06:34 pm, Kryptxy wrote: > Would it get me around legal issues, that is making this tool completely > legal? Do you think we are lawyers? We're not. Even if we were, we're not lawyers who are expert on the legal system of every country in the world. What country's laws are you asking about? -- Steve ?Cheer up,? they said, ?things could be worse.? So I cheered up, and sure enough, things got worse. From ned at nedbatchelder.com Sat Jul 29 07:28:41 2017 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 29 Jul 2017 07:28:41 -0400 Subject: Default logging as part of the language In-Reply-To: References: Message-ID: <6284b30d-452a-e457-7246-2b9fde1d06eb@nedbatchelder.com> On 7/29/17 5:12 AM, ttopolewski at gmail.com wrote: > Hello, > I'm wondering what do You think about some default logging that can become a part of the Python language and part of the development workflow. > > I would see it as develop in debug mode untill debug is removed intentionally. > It would include: > - standard logging location for new development > - some incentive to remove logging > > > For example new keyword defd - something that would: > - act as regular def, but log own identification and input/output __repr__ to /var/log/{processname.processid} > - send WARN to stdout if filename of function/method defd definition time is old(like 2y.o?) to make people to replace defd as regular def > > > PS. I don't have example implementation and I have not tried such approach so I can't say by experience if it makes any sense. But still I think it can be at least something to consider. 
> This sounds like a great thing to provide as a library. There's no need for a new keyword. A function decorator could do what you are describing. --Ned. From piet-l at vanoostrum.org Sat Jul 29 10:32:28 2017 From: piet-l at vanoostrum.org (Piet van Oostrum) Date: Sat, 29 Jul 2017 16:32:28 +0200 Subject: Python BeautifulSoup extract html table cells that contains images and text References: <4602c354-34b5-4962-b1d1-59d110c6cd36@googlegroups.com> Message-ID: Umar Yusuf writes: > Hi all, > > I need help extracting the table from this url...? > > from bs4 import BeautifulSoup > url = "https://www.marinetraffic.com/en/ais/index/ports/all/per_page:50" > > headers = {'User-agent': 'Mozilla/5.0'} > raw_html = requests.get(url, headers=headers) > > raw_data = raw_html.text > soup_data = BeautifulSoup(raw_data, "lxml") > > td = soup_data.findAll('tr')[1:] > > country = [] > > for data in td: > col = data.find_all('td') > country.append(col) So what data do you want to extract? -- Piet van Oostrum WWW: http://piet.vanoostrum.org/ PGP key: [8DAE142BE17999C4] From piet-l at vanoostrum.org Sat Jul 29 11:02:03 2017 From: piet-l at vanoostrum.org (Piet van Oostrum) Date: Sat, 29 Jul 2017 17:02:03 +0200 Subject: how to group by function if one of the group has relationship with another one in the group? 
References: Message-ID: Peter Otten <__peter__ at web.de> writes: > Ho Yeung Lee wrote: > >> from itertools import groupby >> >> testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)] >> def isneighborlocation(lo1, lo2): >> if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]: >> return 1 >> elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]: >> return 1 >> else: >> return 0 >> >> groupda = groupby(testing1, isneighborlocation) >> for key, group1 in groupda: >> print key >> for thing in group1: >> print thing >> >> expect output 3 group >> group1 [(1,1)] >> group2 [(2,3),(2,4] >> group3 [(3,5),(3,6),(4,6)] > > groupby() calculates the key value from the current item only, so there's no > "natural" way to apply it to your problem. > > Possible workarounds are to feed it pairs of neighbouring items (think > zip()) or a stateful key function. Below is an example of the latter: > > $ cat sequential_group_class.py > from itertools import groupby > > missing = object() > > class PairKey: > def __init__(self, continued): > self.prev = missing > self.continued = continued > self.key = False > > def __call__(self, item): > if self.prev is not missing and not self.continued(self.prev, item): > self.key = not self.key > self.prev = item > return self.key > > def isneighborlocation(lo1, lo2): > x1, y1 = lo1 > x2, y2 = lo2 > dx = x1 - x2 > dy = y1 - y2 > return dx*dx + dy*dy <= 1 > > items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)] > > for key, group in groupby(items, key=PairKey(isneighborlocation)): > print key, list(group) > > $ python sequential_group_class.py > False [(1, 1)] > True [(2, 3), (2, 4)] > False [(3, 5), (3, 6), (4, 6)] That only works if (a) the elements in the list are already clustered by group (i.e. all elements of a group are adjacent), and (b) within a group the order is such that adjacent elements are direct neighbours, i.e. their distance is at most 1. So 'groupby' is not a natural solution for this problem. 
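One order-independent alternative is to treat the items as nodes of a graph and collect connected components. The following is a minimal sketch under stated assumptions: it reuses the distance test from the quoted example, the helper name `group_connected` is made up for illustration, and it compares each new item against all earlier ones, so it is only meant for small lists.

```python
def isneighborlocation(lo1, lo2):
    # Same test as in the quoted example: squared distance at most 1.
    (x1, y1), (x2, y2) = lo1, lo2
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= 1


def group_connected(items, related):
    """Group items that are linked, directly or through a chain, by related()."""
    groups = []
    for item in items:
        # Every existing group that holds at least one neighbour of item.
        touching = [g for g in groups if any(related(item, o) for o in g)]
        # Groups that stay as they are (compared by identity, not equality).
        rest = [g for g in groups if not any(g is t for t in touching)]
        # The new item may bridge several groups, so merge them all.
        merged = [member for g in touching for member in g] + [item]
        groups = rest + [merged]
    return groups


items = [(1, 1), (2, 3), (2, 4), (3, 5), (3, 6), (4, 6)]
result = group_connected(items, isneighborlocation)
# result == [[(1, 1)], [(2, 3), (2, 4)], [(3, 5), (3, 6), (4, 6)]]
```

Unlike the groupby() approach, this does not depend on the input order: shuffling `items` yields the same three groups, though the order of groups and of members within a group may differ.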
-- Piet van Oostrum WWW: http://piet.vanoostrum.org/ PGP key: [8DAE142BE17999C4] From 90ldst31n at gmail.com Sat Jul 29 11:27:47 2017 From: 90ldst31n at gmail.com (Goldstein) Date: Sat, 29 Jul 2017 18:27:47 +0300 Subject: YAML in std lib? Message-ID: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> Hello. I'm new to this mailing list and, in fact, I've registered for one simple question. Why is YAML not yet included in the standard Python library? It's the most pythonic markup language, I think, and it's pretty popular. From rantingrickjohnson at gmail.com Sat Jul 29 12:59:42 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sat, 29 Jul 2017 09:59:42 -0700 (PDT) Subject: Need some advice please In-Reply-To: <597c560e$0$1618$c3e8da3$5496439d@news.astraweb.com> References: <1lL1zq18y0kwNzVcvCHkN6esH-R3DP59hLVInGHeEPDsgGW9ZLQ-qbWoQSpKQxLV8XM7Df6cfCtiHohEG7Fm1TO_aIanqnVNHPwoz4LBIqw=@protonmail.com> <597c560e$0$1618$c3e8da3$5496439d@news.astraweb.com> Message-ID: On Saturday, July 29, 2017 at 4:59:26 AM UTC-5, Steve D'Aprano wrote: > On Sat, 29 Jul 2017 06:34 pm, Kryptxy wrote: > > > Would it get me around legal issues, that is making this > > tool completely legal? > > Do you think we are lawyers? We're not. Even if we were, > we're not lawyers who are expert on the legal system of > every country in the world. What country's laws are you > asking about? There are only five things that members of Python-list (aka: comp.lang.python) excel at: (1) Bikeshedding. (2) Doing your homework for you. (3) Googling for you. (4) Arguing endlessly over minutiae. (5) Becoming highly agitated at even the _slightest_ perception of political incorrectness. (Yes. It seems that for the "tolerant" among us, some things are just too intolerable to be tolerated. But I digress...) So if you have any questions, statements or trolls that fall within these specific categories, then you will typically get a response. Perhaps not the response you expected. Or even a polite response. 
But a response you will get, nonetheless. Thank you And have a nice day. From python at mrabarnett.plus.com Sat Jul 29 14:38:32 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 29 Jul 2017 19:38:32 +0100 Subject: Need some advice please In-Reply-To: References: <1lL1zq18y0kwNzVcvCHkN6esH-R3DP59hLVInGHeEPDsgGW9ZLQ-qbWoQSpKQxLV8XM7Df6cfCtiHohEG7Fm1TO_aIanqnVNHPwoz4LBIqw=@protonmail.com> <597c560e$0$1618$c3e8da3$5496439d@news.astraweb.com> Message-ID: <79c0cb53-312f-2218-c85b-839fcaaf6892@mrabarnett.plus.com> On 2017-07-29 17:59, Rick Johnson wrote: > On Saturday, July 29, 2017 at 4:59:26 AM UTC-5, Steve D'Aprano wrote: >> On Sat, 29 Jul 2017 06:34 pm, Kryptxy wrote: >> >> > Would it get me around legal issues, that is making this >> > tool completely legal? >> >> Do you think we are lawyers? We're not. Even if we were, >> we're not lawyers who are expert on the legal system of >> every country in the world. What country's laws are you >> asking about? > > There are only five things that members of Python-list > (aka: comp.lang.python) excel at: > > (1) Bikeshedding. > > (2) Doing your homework for you. > > (3) Googling for you. > > (4) Arguing endlessly over minutiae. > > (5) Becoming highly agitated at even the _slightest_ > preception of political incorrectness. (Yes. It seems that > for the "tolerant" among us, some things are just too > intolerable to be tolerated. But i digress...) > What is the difference between (1) and (4)? [snip] From devinderaujla at gmail.com Sat Jul 29 15:16:10 2017 From: devinderaujla at gmail.com (new_to_c0ding) Date: Sat, 29 Jul 2017 12:16:10 -0700 (PDT) Subject: Need help to understand not the answer Message-ID: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> Hello all, I have been scratching my head since morning but could not understand this quiz question. I would appreciate if someone could help me understand what is it asking me to do. I dont need the answer but just the right direction to look at. 
### Do not change the Location or Campus classes. ### ### Location class is the same as in lecture. ### class Location(object): def __init__(self, x, y): self.x = x self.y = y def move(self, deltaX, deltaY): return Location(self.x + deltaX, self.y + deltaY) def getX(self): return self.x def getY(self): return self.y def dist_from(self, other): xDist = self.x - other.x yDist = self.y - other.y return (xDist**2 + yDist**2)**0.5 def __eq__(self, other): return (self.x == other.x and self.y == other.y) def __str__(self): return '<' + str(self.x) + ',' + str(self.y) + '>' class Campus(object): def __init__(self, center_loc): self.center_loc = center_loc def __str__(self): return str(self.center_loc) class MITCampus(Campus): """ A MITCampus is a Campus that contains tents """ def __init__(self, center_loc, tent_loc = Location(0,0)): """ Assumes center_loc and tent_loc are Location objects Initializes a new Campus centered at location center_loc with a tent at location tent_loc """ # Your code here def add_tent(self, new_tent_loc): """ Assumes new_tent_loc is a Location Adds new_tent_loc to the campus only if the tent is at least 0.5 distance away from all other tents already there. Campus is unchanged otherwise. Returns True if it could add the tent, False otherwise. """ # Your code here def remove_tent(self, tent_loc): """ Assumes tent_loc is a Location Removes tent_loc from the campus. Raises a ValueError if there is not a tent at tent_loc. Does not return anything """ # Your code here def get_tents(self): """ Returns a list of all tents on the campus. The list should contain the string representation of the Location of a tent. The list should be sorted by the x coordinate of the location. 
""" # Your code here -=-=-=-=-=-=-= For example, if c = MITCampus(Location(1,2)) then executing the following sequence of commands: c.add_tent(Location(2,3)) should return True c.add_tent(Location(1,2)) should return True c.add_tent(Location(0,0)) should return False c.add_tent(Location(2,3)) should return False c.get_tents() should return ['<0,0>', '<1,2>', '<2,3>'] -=-=-=-=-=-=- Now as per instructions, class MITCampus(Campus) has (self, center_loc, tent_loc = Location(0,0)) and it is mentioned that center_loc and tent_loc are Location objects but when I code them as Locations, I get error from the tester: Traceback (most recent call last): File "submission.py", line 61, in __init__ self.cloc=Location(center_loc) TypeError: __init__() missing 1 required positional argument: 'y' -=-=-=-== Please help From piet-l at vanoostrum.org Sat Jul 29 15:44:42 2017 From: piet-l at vanoostrum.org (Piet van Oostrum) Date: Sat, 29 Jul 2017 21:44:42 +0200 Subject: Need help to understand not the answer References: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> Message-ID: new_to_c0ding writes: > Now as per instructions, class MITCampus(Campus) has (self, center_loc, tent_loc = Location(0,0)) and it is mentioned that center_loc and tent_loc are Location objects but when I code them as Locations, I get error from the tester: > Traceback (most recent call last): > File "submission.py", line 61, in __init__ > self.cloc=Location(center_loc) > TypeError: __init__() missing 1 required positional argument: 'y' Location must be called with 2 parameters: a x and a y coordinate, not with another location as parameter. 
-- Piet van Oostrum WWW: http://piet.vanoostrum.org/ PGP key: [8DAE142BE17999C4] From python at mrabarnett.plus.com Sat Jul 29 15:49:27 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 29 Jul 2017 20:49:27 +0100 Subject: Need help to understand not the answer In-Reply-To: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> References: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> Message-ID: <38ae4397-5450-99ab-8f0b-19430d5ade20@mrabarnett.plus.com> On 2017-07-29 20:16, new_to_c0ding wrote: > Hello all, > I have been scratching my head since morning but could not understand this quiz question. I would appreciate if someone could help me understand what is it asking me to do. I dont need the answer but just the right direction to look at. > > ### Do not change the Location or Campus classes. ### > ### Location class is the same as in lecture. ### > class Location(object): > def __init__(self, x, y): > self.x = x > self.y = y > def move(self, deltaX, deltaY): > return Location(self.x + deltaX, self.y + deltaY) > def getX(self): > return self.x > def getY(self): > return self.y > def dist_from(self, other): > xDist = self.x - other.x > yDist = self.y - other.y > return (xDist**2 + yDist**2)**0.5 > def __eq__(self, other): > return (self.x == other.x and self.y == other.y) > def __str__(self): > return '<' + str(self.x) + ',' + str(self.y) + '>' > > class Campus(object): > def __init__(self, center_loc): > self.center_loc = center_loc > def __str__(self): > return str(self.center_loc) > class MITCampus(Campus): > """ A MITCampus is a Campus that contains tents """ > def __init__(self, center_loc, tent_loc = Location(0,0)): > """ Assumes center_loc and tent_loc are Location objects > Initializes a new Campus centered at location center_loc > with a tent at location tent_loc """ > # Your code here > > def add_tent(self, new_tent_loc): > """ Assumes new_tent_loc is a Location > Adds new_tent_loc to the campus only if the tent is at least 0.5 
distance > away from all other tents already there. Campus is unchanged otherwise. > Returns True if it could add the tent, False otherwise. """ > # Your code here > > def remove_tent(self, tent_loc): > """ Assumes tent_loc is a Location > Removes tent_loc from the campus. > Raises a ValueError if there is not a tent at tent_loc. > Does not return anything """ > # Your code here > > def get_tents(self): > """ Returns a list of all tents on the campus. The list should contain > the string representation of the Location of a tent. The list should > be sorted by the x coordinate of the location. """ > # Your code here > > > > -=-=-=-=-=-=-= > > For example, if c = MITCampus(Location(1,2)) then executing the following sequence of commands: > > c.add_tent(Location(2,3)) should return True > c.add_tent(Location(1,2)) should return True > c.add_tent(Location(0,0)) should return False > c.add_tent(Location(2,3)) should return False > c.get_tents() should return ['<0,0>', '<1,2>', '<2,3>'] > > -=-=-=-=-=-=- > > Now as per instructions, class MITCampus(Campus) has (self, center_loc, tent_loc = Location(0,0)) and it is mentioned that center_loc and tent_loc are Location objects but when I code them as Locations, I get error from the tester: > Traceback (most recent call last): > File "submission.py", line 61, in __init__ > self.cloc=Location(center_loc) > TypeError: __init__() missing 1 required positional argument: 'y' > > -=-=-=-== > > Please help > Location.__init__ expects 3 arguments: self, x, y self is already provided, so that leaves 2 arguments: x, y You're giving it only 1 argument: center_loc What is center_loc? Is it a tuple? 
If it is, then you could do: self.cloc=Location(center_loc[0], center_loc[1]) or: self.cloc=Location(*center_loc) From devinderaujla at gmail.com Sat Jul 29 16:06:40 2017 From: devinderaujla at gmail.com (devinderaujla at gmail.com) Date: Sat, 29 Jul 2017 13:06:40 -0700 (PDT) Subject: Need help to understand not the answer In-Reply-To: References: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> <38ae4397-5450-99ab-8f0b-19430d5ade20@mrabarnett.plus.com> Message-ID: <83dcde55-f4ae-42fb-a93b-d7ec307c874e@googlegroups.com> On Saturday, July 29, 2017 at 3:49:55 PM UTC-4, MRAB wrote: > On 2017-07-29 20:16, new_to_c0ding wrote: > > Hello all, > > I have been scratching my head since morning but could not understand this quiz question. I would appreciate if someone could help me understand what is it asking me to do. I dont need the answer but just the right direction to look at. > > > > ### Do not change the Location or Campus classes. ### > > ### Location class is the same as in lecture. 
### > > class Location(object): > > def __init__(self, x, y): > > self.x = x > > self.y = y > > def move(self, deltaX, deltaY): > > return Location(self.x + deltaX, self.y + deltaY) > > def getX(self): > > return self.x > > def getY(self): > > return self.y > > def dist_from(self, other): > > xDist = self.x - other.x > > yDist = self.y - other.y > > return (xDist**2 + yDist**2)**0.5 > > def __eq__(self, other): > > return (self.x == other.x and self.y == other.y) > > def __str__(self): > > return '<' + str(self.x) + ',' + str(self.y) + '>' > > > > class Campus(object): > > def __init__(self, center_loc): > > self.center_loc = center_loc > > def __str__(self): > > return str(self.center_loc) > > class MITCampus(Campus): > > """ A MITCampus is a Campus that contains tents """ > > def __init__(self, center_loc, tent_loc = Location(0,0)): > > """ Assumes center_loc and tent_loc are Location objects > > Initializes a new Campus centered at location center_loc > > with a tent at location tent_loc """ > > # Your code here > > > > def add_tent(self, new_tent_loc): > > """ Assumes new_tent_loc is a Location > > Adds new_tent_loc to the campus only if the tent is at least 0.5 distance > > away from all other tents already there. Campus is unchanged otherwise. > > Returns True if it could add the tent, False otherwise. """ > > # Your code here > > > > def remove_tent(self, tent_loc): > > """ Assumes tent_loc is a Location > > Removes tent_loc from the campus. > > Raises a ValueError if there is not a tent at tent_loc. > > Does not return anything """ > > # Your code here > > > > def get_tents(self): > > """ Returns a list of all tents on the campus. The list should contain > > the string representation of the Location of a tent. The list should > > be sorted by the x coordinate of the location. 
""" > > # Your code here > > > > > > > > -=-=-=-=-=-=-= > > > > For example, if c = MITCampus(Location(1,2)) then executing the following sequence of commands: > > > > c.add_tent(Location(2,3)) should return True > > c.add_tent(Location(1,2)) should return True > > c.add_tent(Location(0,0)) should return False > > c.add_tent(Location(2,3)) should return False > > c.get_tents() should return ['<0,0>', '<1,2>', '<2,3>'] > > > > -=-=-=-=-=-=- > > > > Now as per instructions, class MITCampus(Campus) has (self, center_loc, tent_loc = Location(0,0)) and it is mentioned that center_loc and tent_loc are Location objects but when I code them as Locations, I get error from the tester: > > Traceback (most recent call last): > > File "submission.py", line 61, in __init__ > > self.cloc=Location(center_loc) > > TypeError: __init__() missing 1 required positional argument: 'y' > > > > -=-=-=-== > > > > Please help > > > Location.__init__ expects 3 arguments: self, x, y > > self is already provided, so that leaves 2 arguments: x, y > > You're giving it only 1 argument: center_loc > > What is center_loc? Is it a tuple? > > If it is, then you could do: > > self.cloc=Location(center_loc[0], center_loc[1]) > > or: > > self.cloc=Location(*center_loc) Hi, thanks for replying. As per the description it is a Location object. And that result is from the tester so it should have provided two values if it was expecting it to be a location object. 
From cheshirephoenix37 at gmail.com Sat Jul 29 23:08:45 2017 From: cheshirephoenix37 at gmail.com (cheshirephoenix37 at gmail.com) Date: Sat, 29 Jul 2017 20:08:45 -0700 (PDT) Subject: Direct Download Movies - No Download Limits - Download DivX DVD Movies In-Reply-To: <0fd824d6-62cb-4f5b-bc91-274e477ae0f5@w19g2000pre.googlegroups.com> References: <0fd824d6-62cb-4f5b-bc91-274e477ae0f5@w19g2000pre.googlegroups.com> Message-ID: <3f53a5e5-3a99-4c0f-ade1-586d148a6aa4@googlegroups.com> On Saturday, December 5, 2009 at 8:52:52 PM UTC-8, hussain dandan wrote: > Movie Download Reviews offers Free Online Movie Download,Hollywood > Movie Download,Free Full Movie Download,Download Latest Hollywood > Movies,Free Movie > > http://hollywood-moives.blogspot.com/ > http://hollywood-moives.tk cant login there's no where to login too plus the movies wont download fast From rosuav at gmail.com Sat Jul 29 23:21:30 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 30 Jul 2017 13:21:30 +1000 Subject: Direct Download Movies - No Download Limits - Download DivX DVD Movies In-Reply-To: <3f53a5e5-3a99-4c0f-ade1-586d148a6aa4@googlegroups.com> References: <0fd824d6-62cb-4f5b-bc91-274e477ae0f5@w19g2000pre.googlegroups.com> <3f53a5e5-3a99-4c0f-ade1-586d148a6aa4@googlegroups.com> Message-ID: On Sun, Jul 30, 2017 at 1:08 PM, wrote: > On Saturday, December 5, 2009 at 8:52:52 PM UTC-8, hussain dandan wrote: >> Movie Download Reviews offers Free Online Movie Download,Hollywood >> Movie Download,Free Full Movie Download,Download Latest Hollywood >> Movies,Free Movie >> >> [links deleted] > > cant login there's no where to login too plus the movies wont download fast Please don't reply to spam, especially not with the original links intact. Most of us don't see such blatant junk, as it gets filtered out; but then someone replies, and we see it. These kinds of web sites are generally illegal AND are often not giving you what they claim to be. Don't touch them. 
If you MUST download pirated movies, at least use a reputable source. (And no, I'm not going to name a reputable source, because that would be just as bad.) Even better, just get a relatively inexpensive online streaming services, and watch stuff legitimately. Or don't bother, given how low quality a lot of movies are these days... ChrisA From tjreedy at udel.edu Sun Jul 30 02:01:00 2017 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 30 Jul 2017 02:01:00 -0400 Subject: YAML in std lib? In-Reply-To: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> References: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> Message-ID: On 7/29/2017 11:27 AM, Goldstein wrote: > Hello. > I'm new in this mailing list and, in fact, I've registered for one simple question. > Why YAML is not yet included in the standard Python library? > It's the most pythonic markup language, I think, and it's pretty popular. You can get yaml package(s) on pypi.python.org and probably install with pip. -- Terry Jan Reedy From steve+python at pearwood.info Sun Jul 30 02:20:42 2017 From: steve+python at pearwood.info (Steve D'Aprano) Date: Sun, 30 Jul 2017 16:20:42 +1000 Subject: YAML in std lib? References: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> Message-ID: <597d7abd$0$1591$c3e8da3$5496439d@news.astraweb.com> On Sun, 30 Jul 2017 01:27 am, Goldstein wrote: > Hello. > I'm new in this mailing list and, in fact, I've registered for one simple > question. Why YAML is not yet included in the standard Python library? > It's the most pythonic markup language, I think, and it's pretty popular. There are many reasons why a particular library may not be included in the standard library: 1. Perhaps nobody has thought of it. 2. The author of the library refused to allow it, or demanded conditions which the Python developers either cannot or will not meet. 3. The library is under rapid development with a release cycle faster than Python's standard library. 4. There may be technical reasons (e.g. 
code quality, external dependencies) why it isn't added. 5. There may be no consensus among the core developers that this library is important enough to include. 6. Or no agreement about which library to use (if there are more than one). 7. The library is for too small a niche to bother. 8. There may be problems with the legal status of the library, including legality of the software, copyright, patents, etc. 9. Or it may be only available under a proprietary, closed-source licence that is incompatible with Python's open source licence. 10. There may be nobody willing to maintain the library once it is accepted. I'm not sure which ones apply to YAML. If I were to guess, my guess would be either 5 or 6. -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. From alister.ware at ntlworld.com Sun Jul 30 05:57:23 2017 From: alister.ware at ntlworld.com (alister) Date: Sun, 30 Jul 2017 09:57:23 GMT Subject: Direct Download Movies - No Download Limits - Download DivX DVD Movies Message-ID: <74ifB.641614$lu5.51171@fx42.am4> On Sun, 30 Jul 2017 13:21:30 +1000, Chris Angelico wrote: > On Sun, Jul 30, 2017 at 1:08 PM, wrote: >> On Saturday, December 5, 2009 at 8:52:52 PM UTC-8, hussain dandan >> wrote: >>> Movie Download Reviews offers Free Online Movie Download,Hollywood >>> Movie Download,Free Full Movie Download,Download Latest Hollywood >>> Movies,Free Movie >>> >>> [links deleted] >> >> cant login there's no where to login too plus the movies wont download >> fast > > Please don't reply to spam, especially not with the original links > intact. Most of us don't see such blatant junk, as it gets filtered out; > but then someone replies, and we see it. > > These kinds of web sites are generally illegal AND are often not giving > you what they claim to be. Don't touch them. If you MUST download > pirated movies, at least use a reputable source. 
(And no, I'm not going > to name a reputable source, because that would be just as bad.) Even > better, just get a relatively inexpensive online streaming services, and > watch stuff legitimately. > > Or don't bother, given how low quality a lot of movies are these days... > > ChrisA Reputable Pirate, now there is an oxymoron ;-) -- It is the quality rather than the quantity that matters. -- Lucius Annaeus Seneca From mohmmedmohmmedalagmyabdalrhman at gmail.com Sun Jul 30 06:00:31 2017 From: mohmmedmohmmedalagmyabdalrhman at gmail.com (mohmmedmohmmedalagmyabdalrhman at gmail.com) Date: Sun, 30 Jul 2017 03:00:31 -0700 (PDT) Subject: =?UTF-8?B?2YXYs9in2KjZgtipINiq2LnZitmK2YYgMTgwMCDZhdi52YTZhSDZgdmKINis2YXZiti5IA==?= =?UTF-8?B?2KfZhNmF2K3Yp9mB2LjYp9iqIDIwMTcg2KrYudix2YEg2LnZhNmKINin2YTYqtmB2KfYtdmK2YQg2Yg=?= =?UTF-8?B?2KfZhNi02LHZiNi3?= Message-ID: <711af556-a413-4955-8efc-f16cd415ced4@googlegroups.com> ?????? ????? 1800 ???? ?? ???? ????????? 2017 ???? ??? ???????? ??????? http://q.gs/Dpc17 From mohmmedmohmmedalagmyabdalrhman at gmail.com Sun Jul 30 06:01:42 2017 From: mohmmedmohmmedalagmyabdalrhman at gmail.com (mohmmedmohmmedalagmyabdalrhman at gmail.com) Date: Sun, 30 Jul 2017 03:01:42 -0700 (PDT) Subject: =?UTF-8?B?2YXYs9in2KjZgtipINin2YTYqtix2KjZitipINmI2KfZhNiq2LnZhNmK2YUg2YTYqti5?= =?UTF-8?B?2YrZitmGIDE4MDAg2YXYudmE2YUg2KzYr9mK2K8g2LnZhNmJINmF2LPYqtmI2Ykg2KzZhdmK2Lkg2Kc=?= =?UTF-8?B?2YTZhdit2KfZgdi42KfYqg==?= Message-ID: <099f37c3-2ea0-48d5-a46e-3fb786226f8c@googlegroups.com> ?????? ??????? ???????? ?????? 1800 ???? ???? ??? ????? ???? ????????? 
http://q.gs/Dpc1H From rosuav at gmail.com Sun Jul 30 07:27:13 2017 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 30 Jul 2017 21:27:13 +1000 Subject: Direct Download Movies - No Download Limits - Download DivX DVD Movies In-Reply-To: <74ifB.641614$lu5.51171@fx42.am4> References: <74ifB.641614$lu5.51171@fx42.am4> Message-ID: On Sun, Jul 30, 2017 at 7:57 PM, alister via Python-list wrote: > On Sun, 30 Jul 2017 13:21:30 +1000, Chris Angelico wrote: >> These kinds of web sites are generally illegal AND are often not giving >> you what they claim to be. Don't touch them. If you MUST download >> pirated movies, at least use a reputable source. (And no, I'm not going >> to name a reputable source, because that would be just as bad.) Even >> better, just get a relatively inexpensive online streaming services, and >> watch stuff legitimately. >> >> Or don't bother, given how low quality a lot of movies are these days... >> >> ChrisA > > Reputable Pirate, now there is an oxymoron ;-) > > Not at all. Guybrush Threepwood? is most definitely a reputable pirate. ChrisA From rantingrickjohnson at gmail.com Sun Jul 30 10:42:23 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sun, 30 Jul 2017 07:42:23 -0700 (PDT) Subject: Need some advice please In-Reply-To: References: <1lL1zq18y0kwNzVcvCHkN6esH-R3DP59hLVInGHeEPDsgGW9ZLQ-qbWoQSpKQxLV8XM7Df6cfCtiHohEG7Fm1TO_aIanqnVNHPwoz4LBIqw=@protonmail.com> <597c560e$0$1618$c3e8da3$5496439d@news.astraweb.com> <79c0cb53-312f-2218-c85b-839fcaaf6892@mrabarnett.plus.com> Message-ID: <9d12541a-51b2-4ef4-96dc-949e2c77facb@googlegroups.com> On Saturday, July 29, 2017 at 10:24:20 PM UTC-5, MRAB wrote: > What is the difference between (1) and (4)? Case in point. 
;-) From python at mrabarnett.plus.com Sun Jul 30 12:48:46 2017 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 30 Jul 2017 17:48:46 +0100 Subject: Issues with Python In-Reply-To: References: Message-ID: <91f63cb2-7417-fb51-4ef4-1c7fc3a68b60@mrabarnett.plus.com> On 2017-07-30 22:31, Ode Idoko via Python-list wrote: > Hi, I am new to Python and though I have been able to download the 3.6 version on my laptop , I still have issues with the syntax. While writing a program to execute, it will display syntax error with different shades of color usually green or yellow. > What can I do about this? How do I know the error and effect it? Can't it be programmed like we have in excel that will tell you error and prompt you if you wish to accept the right formula format? > Please I need more information on this. > Thanks. > Ode > If there's a syntax error, it'll tell you what the error is. If it's merely showing parts of a line in different colours, that's "syntax colouring". In IDLE, for example, comments are shown in red and string literals are shown in green. From irmen.NOSPAM at xs4all.nl Sun Jul 30 12:54:21 2017 From: irmen.NOSPAM at xs4all.nl (Irmen de Jong) Date: Sun, 30 Jul 2017 18:54:21 +0200 Subject: Issues with Python In-Reply-To: References: Message-ID: <597e0f3d$0$784$e4fe514c@news.xs4all.nl> On 30/07/2017 23:31, Ode Idoko wrote: > Hi, I am new to Python and though I have been able to download the 3.6 version on my laptop , I still have issues with the syntax. While writing a program to execute, it will display syntax error with different shades of color usually green or yellow. > What can I do about this? How do I know the error and effect it? Can't it be programmed like we have in excel that will tell you error and prompt you if you wish to accept the right formula format? > Please I need more information on this. > Thanks. > Ode > > Sent from my iPhone > You'll have to learn the language if you want to do anything meaningful with it. 
Python isn't anything like excel formula's, it is a full general multipurpose programming language. If you make a mistake, it usually is almost impossible to deduce what you meant to write instead. (because otherwise we wouldn't have to program our computers anymore and instead let them figure out automatically what we wanted to do, right? Just joking) Because you are comparing it to excel formulas: are you sure you've chosen the right tool for whatever the task is that you wanted to solve? If so: may I suggest first working through the Python tutorial. https://docs.python.org/3/tutorial/index.html Irmen From rantingrickjohnson at gmail.com Sun Jul 30 13:58:26 2017 From: rantingrickjohnson at gmail.com (Rick Johnson) Date: Sun, 30 Jul 2017 10:58:26 -0700 (PDT) Subject: Need help to understand not the answer In-Reply-To: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> References: <8f37be2b-2692-4203-a24b-6919aeee42c3@googlegroups.com> Message-ID: <5161c9d0-76ca-43fc-a56d-405754a9c3a5@googlegroups.com> On Saturday, July 29, 2017 at 2:16:36 PM UTC-5, new_to_c0ding wrote: > Hello all, I have been scratching my head since morning but > could not understand this quiz question. I would appreciate > if someone could help me understand what is it asking me to > do. I dont need the answer but just the right direction to > look at. Hello. First of all, as i look over this "script template" that your instructor has supposedly provided, i am disgusted by the abysmal formatting. I'm not sure if what you provided here is an exact replica, or something that you have modified, but in either case, a little bit of proper formatting can go a loooooong way towards readability. For instance, When writing in natural languages (such as English), we utilize common structural elements and rules so that our text will be presented in manner that is recognizable to most readers. A few of these "high level structural components" include _spaces_, _sentences_ and _paragraphs_. 
And when writing code, we also utilize a "common structure". And one of the most important components of this "common structure" is the use of vertical whitespace. By properly utilizing vertical whitespace, we can separate the "paragraphs" of our code (aka: classes and functions) so that reading the code will be more intuitive. Of course, Structuring code is far more complicated than simply creating "visual buffers zones" around classes and functions, and many hours have been spent debating what is proper, and what is not. But being that in the "realms of the OOP paradigm" classes and methods are the most fundamental elements, it should come as no surprise that mastering the formatting of these elements is a vital first lesson. Now, Even though there is something of an "ideological war" raging as to exactly how much vertical whitespace should be used, and _where_ it should be used, most programmers will agree that the following example is acceptable. A common style is to place *ONE* vertical whitespace between each method in a class,and two vertical whitespaces between the classes themselves. Observe the following... 
## BEGIN: READABLE CODE EXAMPLE ##

class Location(object):
    """DOCSTRING MISSING!!!"""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def move(self, deltaX, deltaY):
        return Location(self.x + deltaX, self.y + deltaY)

    def getX(self):
        return self.x

    def getY(self):
        return self.y

    def dist_from(self, other):
        xDist = self.x - other.x
        yDist = self.y - other.y
        return (xDist**2 + yDist**2)**0.5

    def __eq__(self, other):
        return (self.x == other.x and self.y == other.y)

    def __str__(self):
        return '<' + str(self.x) + ',' + str(self.y) + '>'


class Campus(object):
    """DOCSTRING MISSING!!!"""

    def __init__(self, center_loc):
        self.center_loc = center_loc

    def __str__(self):
        return str(self.center_loc)


class MITCampus(Campus):
    """ A MITCampus is a Campus that contains tents """

    def __init__(self, center_loc, tent_loc=Location(0,0)):
        """
        Assumes center_loc and tent_loc are Location objects
        Initializes a new Campus centered at location center_loc
        with a tent at location tent_loc
        """
        # Your code here

    def add_tent(self, new_tent_loc):
        """
        Assumes new_tent_loc is a[n *INSTANCE* of] Location
        Adds new_tent_loc to the campus only if the tent is at least 0.5
        distance away from all other tents already there. Campus is
        unchanged otherwise.
        Returns True if it could add the tent, False otherwise.
        """
        # Your code here

    def remove_tent(self, tent_loc):
        """
        Assumes tent_loc is a[n *INSTANCE* of] Location
        Removes tent_loc from the campus.
        Raises a ValueError if there is not a tent at tent_loc.
        Does not return anything
        """
        # Your code here

    def get_tents(self):
        """
        Returns a list of all tents on the campus. The list should
        contain the string representation of the Location of a tent.
        The list should be sorted by the x coordinate of the location.
        """
        # Your code here

## END: READABLE CODE EXAMPLE ##

But what is most important to remember here is _not_ so much the _number_ of spaces used, but that the number is greater than _zero_, and that the spacing is _consistent_.
For instance: if one feels that two blank lines between methods is more desirable then that is okay, but one should maintain the two-line buffer between *ALL* methods in the script *AND* furthermore, expand the buffer between classes to four blank lines -- because consistency is the key!

But this code is not only lacking an intuitive format, it is also lacking an intelligent design. For instance: this object model is just begging for a "Tent object" with the Location being an attribute of each individual "Tent". Because it makes absolutely no sense for the MITCampus interface to present an "add_tent" method that expects a Location object as argument. This is illogical! Because locations are *NOT* Tents! Observe the following.

## BEGIN: SCHOOLING OF THE IDIOT PROFESSOR ##

class Tent(object):
    def __init__(self, location):
        if not isinstance(location, Location):
            raise Exception('Expected an instance of Location!')
        self.location = location

    def upsticks(self, x, y):
        # Location.move() returns a *new* Location rather than
        # mutating in place, so the result must be stored.
        self.location = self.location.move(x, y)

## END: SCHOOLING OF THE IDIOT PROFESSOR (for now...) ##

Now we have a proper representation of the "players" in this "game". And should it become necessary, at some point in the future, the "Tent" class can easily be extended to include such natural attributes as "name" and/or a "number of occupants" (which, depending on the needs of the model, could be represented using a simple integer value (aka: tally), or a whole new Occupant class could be created to represent each individual occupant of the Tent).

I'm sorry to be the one to inform you, but judging from the evidence presented here, your professor is not qualified to teach OOP principles. If one is to master the OOP paradigm, one must always construct an object model utilizing a design that, first and foremost, favors an ease of extensibility.
However, your professor has neglected to follow this most fundamental of design considerations, thereby rendering this lesson nothing more than another fine example of the blind leading the blind. Which, in today's abysmal academic atmosphere, is far more common than we'd care to admit.

But do not allow your unfortunate academic circumstances to discourage you. For although we have little control over who will be our instructors, or our superiors, we will always maintain control over our *OWN* capacity to learn. And if we are determined to become great students of knowledge (which i hope you are), then we must adopt an aggressive attitude to seek out knowledge in diverse places, if necessary, utilizing our sheer "force of will" alone. So if our instructor is incompetent (as they all too often are), then we will circumvent that instructor, violently kicking down the doors of hidden knowledge, if necessary, with the focused resolve that we will not be denied the knowledge which we, as human beings, deserve.

The point i'm trying to make here (with my dissection of your professor's incompetence) is that if you want to become competent at anything (not just writing code), you must realize that no one can _teach_ you anything. All an instructor can ever hope to do is to _lead_ you in a general direction, and _hope_ that you will glean the valuable insights along the way.

There are vast online sources where one can study a particular field, and unlike the formal classroom setting where you are limited by the competence and dogma of a particular instructor, online resources allow you to learn at a rate that is comfortable to you, but more importantly, you will be exposed to many diverse opinions and methods of solving problems that are just not available in the orthodox setting of the "despotic classroom".
And although, to gain your degree, you may have no choice but to "play along" with this unqualified professor, your knowledge need not be limited by the incompetence of the aforementioned idiot.

Contrary to popular belief, i don't participate in online communities to help others; no, helping others is secondary, and sometimes merely a consequence, of me developing my own analytical skills. Participating in online communities is sort of like a classroom, except that the "environmental motivation to excel" is not a result of dictates, but of "competition between the members". And it should be understood that it is the _competitive_force_ which injects energy into the social equation, allowing the cream to rise naturally to the top. This is how our Universe operates, and the fruits of success are all around us. Because any system that forsakes the unlimited energy source which competition provides can only ever hope to achieve mediocrity.

The "ministries of education" are where the intellectual potential and the innovative spirit of a young mind goes to die. Please don't allow yourself to become a victim of this malevolent indoctrination.

From jobmattcon at gmail.com Sun Jul 30 15:24:29 2017
From: jobmattcon at gmail.com (Ho Yeung Lee)
Date: Sun, 30 Jul 2017 12:24:29 -0700 (PDT)
Subject: how to group by function if one of the group has relationship with another one in the group?
In-Reply-To:
References:
Message-ID: <1d6f7a4f-86f7-4827-8eb4-b74fa6c8ba40@googlegroups.com>

Which function should be used for this problem?
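
One possible answer to the question above, offered as a sketch rather than a definitive solution: what is being asked for looks like grouping by connected components, where two items end up in the same group if they are related directly or through a chain of related items. The helper name `group_connected` is made up for illustration, the predicate is the distance-based one from Peter Otten's reply quoted in this thread, and the pairwise scan assumes the list is small enough that quadratic cost does not matter.

```python
def is_neighbor(a, b):
    """True if two (x, y) points are at most distance 1 apart."""
    (x1, y1), (x2, y2) = a, b
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= 1


def group_connected(items, related):
    """Group items so that related items (directly or via a chain) share a group."""
    groups = []
    for item in items:
        linked, rest = [], []
        for group in groups:
            # A group is linked if the new item is related to any member.
            if any(related(item, member) for member in group):
                linked.append(group)
            else:
                rest.append(group)
        # Merge every linked group together with the new item.
        merged = [member for group in linked for member in group] + [item]
        groups = rest + [merged]
    return groups


points = [(1, 1), (2, 3), (2, 4), (3, 5), (3, 6), (4, 6)]
for number, group in enumerate(group_connected(points, is_neighbor), 1):
    print("group%d" % number, group)
```

Unlike groupby(), this does not require the input to be pre-clustered or ordered so that neighbours are adjacent; the price is that the order of the groups can change whenever two existing groups are merged.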
On Saturday, July 29, 2017 at 11:02:30 PM UTC+8, Piet van Oostrum wrote:
> Peter Otten <__peter__ at web.de> writes:
>
> > Ho Yeung Lee wrote:
> >
> >> from itertools import groupby
> >>
> >> testing1 = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> >> def isneighborlocation(lo1, lo2):
> >>     if abs(lo1[0] - lo2[0]) == 1 or lo1[1] == lo2[1]:
> >>         return 1
> >>     elif abs(lo1[1] - lo2[1]) == 1 or lo1[0] == lo2[0]:
> >>         return 1
> >>     else:
> >>         return 0
> >>
> >> groupda = groupby(testing1, isneighborlocation)
> >> for key, group1 in groupda:
> >>     print key
> >>     for thing in group1:
> >>         print thing
> >>
> >> expect output 3 group
> >> group1 [(1,1)]
> >> group2 [(2,3),(2,4)]
> >> group3 [(3,5),(3,6),(4,6)]
> >
> > groupby() calculates the key value from the current item only, so there's no
> > "natural" way to apply it to your problem.
> >
> > Possible workarounds are to feed it pairs of neighbouring items (think
> > zip()) or a stateful key function. Below is an example of the latter:
> >
> > $ cat sequential_group_class.py
> > from itertools import groupby
> >
> > missing = object()
> >
> > class PairKey:
> >     def __init__(self, continued):
> >         self.prev = missing
> >         self.continued = continued
> >         self.key = False
> >
> >     def __call__(self, item):
> >         if self.prev is not missing and not self.continued(self.prev, item):
> >             self.key = not self.key
> >         self.prev = item
> >         return self.key
> >
> > def isneighborlocation(lo1, lo2):
> >     x1, y1 = lo1
> >     x2, y2 = lo2
> >     dx = x1 - x2
> >     dy = y1 - y2
> >     return dx*dx + dy*dy <= 1
> >
> > items = [(1,1),(2,3),(2,4),(3,5),(3,6),(4,6)]
> >
> > for key, group in groupby(items, key=PairKey(isneighborlocation)):
> >     print key, list(group)
> >
> > $ python sequential_group_class.py
> > False [(1, 1)]
> > True [(2, 3), (2, 4)]
> > False [(3, 5), (3, 6), (4, 6)]
>
> That only works if
> (a) The elements in the list are already clustered on group (i.e. all
> elements of a group are adjacent)
> (b) In a group the order is such that adjacent elements are direct
> neighbours, i.e. their distance is at most 1.
>
> So 'groupby' is not a natural solution for this problem.
> --
> Piet van Oostrum
> WWW: http://piet.vanoostrum.org/
> PGP key: [8DAE142BE17999C4]

From idokolord at yahoo.com Sun Jul 30 17:31:01 2017
From: idokolord at yahoo.com (Ode Idoko)
Date: Sun, 30 Jul 2017 14:31:01 -0700
Subject: Issues with Python
Message-ID:

Hi, I am new to Python and though I have been able to download the 3.6 version on my laptop, I still have issues with the syntax. While writing a program to execute, it will display syntax error with different shades of color usually green or yellow.
What can I do about this? How do I know the error and effect it? Can't it be programmed like we have in excel that will tell you error and prompt you if you wish to accept the right formula format?
Please I need more information on this.
Thanks.
Ode

Sent from my iPhone

From rantingrickjohnson at gmail.com Sun Jul 30 18:19:47 2017
From: rantingrickjohnson at gmail.com (Rick Johnson)
Date: Sun, 30 Jul 2017 15:19:47 -0700 (PDT)
Subject: Issues with Python
In-Reply-To:
References:
Message-ID: <744a1821-f5a3-4990-9f94-fc2261705217@googlegroups.com>

> Hi, I am new to Python and though I have been able to
> download the 3.6 version on my laptop , I still have
> issues with the syntax. While writing a program to
> execute, it will display syntax error with different
> shades of color usually green or yellow.

Which IDE (aka: fancy text editor) are you using to write this code?

> What can I do about this?

About what?

(1) Do you want to change the colors of the syntax highlighting?

(2) Do you want to know how to prevent or fix syntax errors?

(3) Something else entirely?
Unfortunately last weekend i had to pawn my crystal ball to pay off the vig (it seems my financier is not the type to just "fuggetaboutit"), so i'm afraid you'll have to be a _little_ more specific when asking questions.

But seriously. :-)

Syntax highlighting is used by some editors to help a programmer differentiate between certain elements of code. And this highlighting can be very helpful to the beginner.

> How do I know the error and effect it?

Oh, you'll know when you make a mistake, because Python will throw an error message. For instance, if you type "aaa" (without the quotes) at the Python command prompt, you'll receive a message that looks similar to this:

    Traceback (most recent call last):
      File "", line 1, in <module>
        aaa
    NameError: name 'aaa' is not defined

This one is easy to diagnose. But in order to understand Python exception messages you need to read some tutorials. If you think you can start writing code (even Python code) without some sort of "guide", then you're going to have a real difficult time. Even folks with prior programming experience need a guide. I imagine it would be like a delivery driver moving from Paris to London and trying to navigate around the streets of Merry ol' London using a map of Paris. It's just not practical.

> Can't it be programmed like we have in excel that will tell
> you error and prompt you if you wish to accept the right
> formula format?

By utilizing Python, or any other programming language that is available (and boy, there are quite a few of them!), you can program your computer to do just about anything you want. Maybe you could even write a real life HAL 9000! But if you're expecting that your experience with Microsoft Excel formulas will translate over into Python, well, then, you're in for a tough time. Excel is a "single purpose software" whereas Python is a "general purpose programming language". Using Python, you could write a spreadsheet program like Excel.
Although I wouldn't suggest it, as there are already tons of them freely available, and Python is not the best language for something like that, and Microsoft would not be too happy about it. But you could do it if you were so inclined.

My advice is that you visit the Python.org website and look through the list of tutorials for absolute beginners. After completing a few of these tutorials, you should be off and running in no time. Here is the link:

https://wiki.python.org/moin/BeginnersGuide/NonProgrammers

After clicking the link, skip down to the section titled "Tutorials and Websites", and start with "One Day of IDLE Toying". IDLE is a "fancy text editor" that ships with Python, and this tutorial is very gentle on beginners. After you have finished "One Day of IDLE Toying", my next suggestion would be "A Byte of Python", which will introduce you to some basics. From there, you can go through all the other tutorials.

When you have finished all the beginner tutorials, then take a stab at the advanced section. Here is a link:

https://wiki.python.org/moin/BeginnersGuide/Programmers

And don't forget, Google is a programmer's best friend. Who surprisingly, hardly ever complains. Which unfortunately, is something i cannot say for the women i have known. But i digress! O;-)

From ben+python at benfinney.id.au Sun Jul 30 22:20:14 2017
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 31 Jul 2017 12:20:14 +1000
Subject: YAML in std lib?
References: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> <597d7abd$0$1591$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <85mv7l2vq9.fsf@benfinney.id.au>

Steve D'Aprano writes:

> On Sun, 30 Jul 2017 01:27 am, Goldstein wrote:
>
> > I'm new in this mailing list and, in fact, I've registered for one
> > simple question. Why YAML is not yet included in the standard Python
> > library? It's the most pythonic markup language, I think, and it's
> > pretty popular.
>
> There are many reasons why a particular library may not be included in the
> standard library:

Those are all reasons worth considering. Most of them, though, are reasons why a proposal to include something in the Python standard library might be *rejected* after being proposed for discussion.

Including something in the standard library needs to be discussed in the context of a specific proposal. There is a procedure for that: the Python Enhancement Proposal.

So, an important reason to consider: Perhaps no-one has developed and championed a specific Python Enhancement Proposal for including that in the standard library.

I don't know of any PEP yet which specifies exactly what to add to the standard library for YAML (and how to ensure it continues to be maintained in the standard library).

There may have been vague requests for "can we have PyYAML in the standard library?", but those are void in this context, because that's not what actually gets the discussion looking at specifics.

My advice to the original poster: Have a search for past proposals and see what discussion ensued. If there was no such proposal, try writing one and championing it.

--
 \     "You say I took the name in vain / I don't even know the name / |
  `\   But if I did, well, really, what's it to you?" -- Leonard Cohen, |
_o__)  _Hallelujah_                                                     |
Ben Finney

From soyeomul at doraji.xyz Sun Jul 30 22:39:28 2017
From: soyeomul at doraji.xyz (Byung-Hee HWANG (황병희, 黃炳熙))
Date: Mon, 31 Jul 2017 11:39:28 +0900
Subject: Issues with Python
References:
Message-ID:

Ode Idoko via Python-list writes:

> ... it will display syntax error with ...

Maybe you did copy & paste, I just guess ... if not, ignore it.

From j.clarke.873638 at gmail.com Sun Jul 30 23:00:58 2017
From: j.clarke.873638 at gmail.com (J. Clarke)
Date: Sun, 30 Jul 2017 23:00:58 -0400
Subject: Where is python and idle?
References: <306caf31-2a37-f70e-37d8-07c7e98385e0@brianlcase.com> Message-ID: In article , brian at brianlcase.com says... > > Thank you, That is where it is. Would not have found it without your > help. Now, to find IDLE. > > rgrds, > > Brian > > > On 7/21/2017 10:19 AM, Nathan Ernst wrote: > > Check your user folder. For me, on my PC, python is installed > > at C:\Users\nernst\AppData\Local\Programs\Python > > > > Regards, > > Nate > > > > On Fri, Jul 21, 2017 at 9:24 AM, Brian Case > > wrote: > > > > I am running windows 10 version 1703 as administrator on a Dell > > Inspiron 15 laptop. > > > > I downloaded and installed python 3.6.2 from > > https://www.python.org/downloads/ > > for windows. > > > > https://www.programiz.com/python-programming > > instructs me to > > open IDLE once that install completed. > > > > But I find NEITHER Python nor IDLE anywhere on my machine. > > > > I reran the install which gave me options for REPAIR. I ran it, > > which completed successfully and provided the python-list email > > address. > > > > I still cannot find an executable or a folder for anything > > beginning with Python. > > > > Where should I look besides folders C:\Program Files and > > C:\Program Files (x86)? > > > > Regards, > > > > Brian Case > > -- > > https://mail.python.org/mailman/listinfo/python-list > > > > > > Did you try "Hey, Cortana, launch idle"? Works for me. From sbassi at genesdigitales.com Sun Jul 30 23:46:20 2017 From: sbassi at genesdigitales.com (Sebastian Bassi) Date: Sun, 30 Jul 2017 20:46:20 -0700 Subject: Python for Bioinformatics: New book announcement Message-ID: I am glad to announce the second edition of Python for Bioinformatics. In today's data driven biology, programming knowledge is essential in turning ideas into testable hypothesis. Based on my extensive experience, Python for Bioinformatics, Second Edition helps biologists get to grips with the basics of software development. 
Requiring no prior knowledge of programming-related concepts, the book focuses on the easy-to-use, yet powerful, Python computer language.

This new edition is updated throughout to Python 3 and is designed not just to help scientists master the basics, but to do more in less time and in a reproducible way. New developments added in this edition include NoSQL databases, the Anaconda Python distribution, graphical libraries like Bokeh, and the use of GitHub for collaborative development.

Most of the code can be executed online using a collection of Jupyter Notebooks hosted at https://notebooks.azure.com/library/py3.us. All source code is available at GitHub (https://github.com/Serulab/Py4Bio).

The intended audience of this book is bioinformatics students and graduates who are not software developers but need to learn how to program. Software developers can also take advantage of the book, since there is also advanced and reference material.

Table of contents

Section I: Programming
Chapter 1 - Introduction
Chapter 2 - First Steps with Python
Chapter 3 - Basic Programming: Data Types
Chapter 4 - Programming: Flow Control
Chapter 5 - Handling Files
Chapter 6 - Code Modularizing
Chapter 7 - Error Handling
Chapter 8 - Introduction to Object Orienting Programming (OOP)
Chapter 9 - Introduction to Biopython

Section II: Advanced Topics
Chapter 10 - Web Applications
Chapter 11 - XML
Chapter 12 - Python and Databases
Chapter 13 - Regular Expressions
Chapter 14 - Graphics in Python

Section III: Python Recipes with Commented Source Code
Chapter 15 - Sequence Manipulation in Batch
Chapter 16 - Web Application for Filtering Vector Contamination
Chapter 17 - Searching for PCR Primers Using Primer3
Chapter 18 - Calculating Melting Temperature from a Set of Primers
Chapter 19 - Filtering Out Specific Fields from a GenBank File
Chapter 20 - Inferring Splicing Sites
Chapter 21 - Web Server for Multiple Alignment
Chapter 22 - Drawing Marker Positions Using Data Stored in a Database
Chapter 23 - DNA Mutations with Restrictions (On-Line only)

Section IV: Appendices
Appendix A - INTRODUCTION TO VERSION CONTROL
Appendix B - PYTHONANYWHERE
Appendix C - REFERENCE

Where to buy?
In Amazon (http://amzn.to/2vWazcL) or at the publisher web site (https://goo.gl/t2uoyN), save 20% with the promo code AZR94 (valid only in the publisher site and up to December 2017).
For more information, including the book mailing list, visit http://py3.us/

From p.f.moore at gmail.com Mon Jul 31 09:50:24 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 31 Jul 2017 06:50:24 -0700 (PDT)
Subject: YAML in std lib?
In-Reply-To: <597d7abd$0$1591$c3e8da3$5496439d@news.astraweb.com>
References: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> <597d7abd$0$1591$c3e8da3$5496439d@news.astraweb.com>
Message-ID: <039bb5e7-d631-4970-9700-0d2768ef0d44@googlegroups.com>

On Sunday, 30 July 2017 07:21:00 UTC+1, Steve D'Aprano wrote:
> 1. Perhaps nobody has thought of it.
>
> 2. The author of the library refused to allow it, or demanded
> conditions which the Python developers either cannot or will
> not meet.
>
> 3. The library is under rapid development with a release cycle
> faster than Python's standard library.
>
> 4. There may be technical reasons (e.g. code quality, external
> dependencies) why it isn't added.
>
> 5. There may be no consensus among the core developers that this
> library is important enough to include.
>
> 6. Or no agreement about which library to use (if there are
> more than one).
>
> 7. The library is for too small a niche to bother.
>
> 8. There may be problems with the legal status of the library,
> including legality of the software, copyright, patents, etc.
>
> 9. Or it may be only available under a proprietary, closed-source
> licence that is incompatible with Python's open source licence.
>
> 10. There may be nobody willing to maintain the library once it
> is accepted.
> > > I'm not sure which ones apply to YAML. If I were to guess, my guess would be > either 5 or 6. I'd add 4 and 10 as possible issues. Getting a 3rd party library included into the stdlib also needs to pass the test of "why isn't it sufficient to depend on a library on PyPI?" As I understand it, YAML as a markup language is not as popular as it once was. There have been concerns expressed about its complexity, and the spec seems to have stagnated (the last update noted yaml.org was in 2011, and many libraries still seem to stick to YAML 1.1, when 1.2 has been out for years). In general, I get the impression that YAML is a great idea if you stick to "the bits that I like" - but unfortunately, different people like different bits :-) So the first issue is that there's not enough momentum behind YAML as a markup standard to warrant it being in the stdlib. In addition, the main YAML library on PyPI is PyYAML, and my understanding is that it had some reliability issues (crashes on certain malformed input?) that took a long time to get fixed. So there's code quality and maintenance commitment issues to be addressed (it's entirely possible that those issues are now resolved, but that needs to be confirmed). Furthermore, it uses a C library (libyaml) as an accelerator, which makes the process of bringing it into the stdlib even more complex. Long story short - there's not enough benefit over "pip install pyyaml" to justify it, even if someone (either the library author, or someone with the author's support) had spent the time putting together a concrete proposal. Paul From p.f.moore at gmail.com Mon Jul 31 09:53:52 2017 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 31 Jul 2017 06:53:52 -0700 (PDT) Subject: zipapp should not include temporary files? 
In-Reply-To: <597b6bd9$0$752$e4fe514c@news.xs4all.nl> References: <5978d330$0$834$e4fe514c@news.xs4all.nl> <9c536053-b7a8-4829-92f3-6099e84a726d@googlegroups.com> <597b67f3$0$800$e4fe514c@news.xs4all.nl> <597b6bd9$0$752$e4fe514c@news.xs4all.nl> Message-ID: On Friday, 28 July 2017 17:53:08 UTC+1, Irmen de Jong wrote: > On 28/07/2017 18:36, Irmen de Jong wrote: > > On 27/07/2017 00:03, Paul Moore wrote: > >> If you want to create a feature request for a filter function on bugs.python.org and assign it to me, I'll take a look at it. \ > > > > > > I will do this, thanks in advance. > > Should have included a link perhaps. Here it is: http://bugs.python.org/issue31072 Thanks, I've picked it up and commented there. I'll try to get some time to look at an implementation, the main constraint at the moment is that it's holiday season and I'm enjoying myself with things other than coding :-) Paul From skip.montanaro at gmail.com Mon Jul 31 10:33:45 2017 From: skip.montanaro at gmail.com (Skip Montanaro) Date: Mon, 31 Jul 2017 09:33:45 -0500 Subject: YAML in std lib? In-Reply-To: <039bb5e7-d631-4970-9700-0d2768ef0d44@googlegroups.com> References: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> <597d7abd$0$1591$c3e8da3$5496439d@news.astraweb.com> <039bb5e7-d631-4970-9700-0d2768ef0d44@googlegroups.com> Message-ID: In addition to the reasons given by Steve, Paul, and others, the barrier to entry to the standard library has grown as the domains to which Python is applied have increased. To justify the maintenance effort, when considering a module or package for inclusion in the standard library, you want it to be as broadly useful as possible. Is YAML widely enough used in domains as varied as web application development, bioinformatics, and machine learning to justify its inclusion in the standard library? Maybe not. In addition, I suspect more and more people are using virtual environments of one sort or another. 
When constructing such environments it's pretty trivial to tailor them to contain just those modules and packages appropriate to a particular task. I use Conda environments almost exclusively these days. It frees me from the glacial pace of updates to the default Python installation at work (stuck on 2.7.2). Skip From lele at metapensiero.it Mon Jul 31 11:48:47 2017 From: lele at metapensiero.it (Lele Gaifax) Date: Mon, 31 Jul 2017 17:48:47 +0200 Subject: YAML in std lib? References: <61f95827-1fc9-4fad-a962-3df282bf317f@gmail.com> <597d7abd$0$1591$c3e8da3$5496439d@news.astraweb.com> <039bb5e7-d631-4970-9700-0d2768ef0d44@googlegroups.com> Message-ID: <87y3r4li8w.fsf@metapensiero.it> Paul Moore writes: > As I understand it, YAML as a markup language is not as popular as it once > was. Given all the hype around Docker these days, I'm not convinced that's true :) > In addition, the main YAML library on PyPI is PyYAML There is fork, https://pypi.python.org/pypi/ruamel.yaml, that seems actively maintained: it supports YAML 1.2 for example. But I agree with you, I do not see a good enough reason to include it in the standard library. ciao, lele. -- nickname: Lele Gaifax | Quando vivr? di quello che ho pensato ieri real: Emanuele Gaifas | comincer? ad aver paura di chi mi copia. lele at metapensiero.it | -- Fortunato Depero, 1929. From guru at digitalfantasy.it Mon Jul 31 14:12:01 2017 From: guru at digitalfantasy.it (Daniele Forghieri) Date: Mon, 31 Jul 2017 20:12:01 +0200 Subject: Issues with Python In-Reply-To: References: Message-ID: Il 30/07/2017 23:31, Ode Idoko via Python-list ha scritto: > Hi, I am new to Python and though I have been able to download the 3.6 version on my laptop , I still have issues with the syntax. While writing a program to execute, it will display syntax error with different shades of color usually green or yellow. > What can I do about this? How do I know the error and effect it? 
Can't it be programmed like we have in excel that will tell you error and prompt you if you wish to accept the right formula format? > Please I need more information on this. > Thanks. > Ode > > Sent from my iPhone If you use eclipse with the pydev extensions or, directly, liclipse you have (some) error and warnings, for example if you use a variable not defined or if you try to use a variable with the same name of a reserved symbol. It's not completed as you run the program but it's really helpful. Daniele Forghieri From hola903 at aol.com Mon Jul 31 17:11:52 2017 From: hola903 at aol.com (Sonja Williams) Date: Mon, 31 Jul 2017 17:11:52 -0400 Subject: Question Message-ID: <15d9a7da631-5a5a-37954@webprd-m01.mail.aol.com> Good Day, I have decided to learn more about programming so I picked up the book Beginning Programming by Matt Telles. After following the directions verbatim and going to the Python site to download the latest version 3, which is what the book recommended, I keep getting the following error message when running my first script. I am using Windows 7 - 64 bit " python is not recognized as an internal or external command, operable program, or batch file". I am at the end of chapter 3 attempting to run the following script. python ch3_1.py which should return What is your name? What am I doing wrong? Sonja Williams From rosuav at gmail.com Mon Jul 31 17:35:42 2017 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 1 Aug 2017 07:35:42 +1000 Subject: Question In-Reply-To: <15d9a7da631-5a5a-37954@webprd-m01.mail.aol.com> References: <15d9a7da631-5a5a-37954@webprd-m01.mail.aol.com> Message-ID: On Tue, Aug 1, 2017 at 7:11 AM, Sonja Williams via Python-list wrote: > > > > > Good Day, > > I have decided to learn more about programming so I picked up the book Beginning Programming by Matt Telles. 
After following the directions verbatim and going to the Python site to download the latest version 3, which is what the book recommended, I keep getting the following error message when running my first script. I am using Windows 7 - 64 bit > > " python is not recognized as an internal or external command, operable program, or batch file". > You may have run into a difficulty of Windows setups. Try instead typing "py ch3_1.py" - the Windows installer should have created a little bouncer program. If that works, use that for the rest of the book - any time you're told to run "python", run "py" instead, with the same parameters. ChrisA From oliver.schoenborn at gmail.com Mon Jul 31 18:09:42 2017 From: oliver.schoenborn at gmail.com (oliver) Date: Mon, 31 Jul 2017 22:09:42 +0000 Subject: Question In-Reply-To: <15d9a7da631-5a5a-37954@webprd-m01.mail.aol.com> References: <15d9a7da631-5a5a-37954@webprd-m01.mail.aol.com> Message-ID: On Mon, Jul 31, 2017, 17:33 Sonja Williams via Python-list, < python-list at python.org> wrote: > > > > > Good Day, > > I have decided to learn more about programming so I picked up the book > Beginning Programming by Matt Telles. After following the directions > verbatim and going to the Python site to download the latest version 3, > which is what the book recommended, I keep getting the following error > message when running my first script. I am using Windows 7 - 64 bit > > " python is not recognized as an internal or external command, operable > program, or batch file". > > > I am at the end of chapter 3 attempting to run the following script. > > python ch3_1.py which should return What is your name? > > > What am I doing wrong? > The window in which you are typing the command to run the script is called a "shell". The error you are getting means the shell does not have a built-in command (like cd or dir or help) called "python", nor can it find "python.bat" or "python.exe" in any of the folders known to it. 
To check which folders are known to the shell, type "path" in it; the semicolon-separated list of folders that this prints will likely be missing the Python folder.

I presume you installed Python 3.6? There is an option to add the Python folder to the path as part of the installation, so you might want to uninstall Python, then re-run the installer, this time paying close attention to this configuration option. For example, in the Python 3.5 installer the option is called "Add Python 3.5 to PATH"; see the first screenshot at https://docs.python.org/3/using/windows.html. With this option in effect, the output from the "path" command will include the Python folder. So when you type "python ch3_1.py" the shell will find python.exe in the Python folder and run it, and all will work!

Oliver

> Sonja Williams
> --
> https://mail.python.org/mailman/listinfo/python-list
>
--
Oliver
My StackOverflow contributions
My CodeProject articles
My Github projects
My SourceForge.net projects

From tom at tomforb.es Mon Jul 31 19:31:25 2017
From: tom at tomforb.es (tom at tomforb.es)
Date: Mon, 31 Jul 2017 16:31:25 -0700 (PDT)
Subject: @lru_cache on functions with no arguments
Message-ID:

As part of the Python 3 cleanup in Django there are a fair few uses of @functools.lru_cache on functions that take no arguments. A lru_cache isn't strictly needed here, but it's convenient to just cache the result. Some examples are here: https://github.com/django/django/pull/8825/files

I did some profiling and I found that using `@lru_cache(maxsize=None)` on such functions is twice as fast as a standard `@lru_cache()`, apparently because with a `maxsize` the lru_cache code requires a lock acquisition and a fair bit more state to track.

Am I right in thinking that using `maxsize=None` is best for functions that accept no arguments? Should we even be using a `lru_cache` in such situations, or write our own simple cache decorator instead?
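For reference, the two options raised in that question could be sketched roughly as follows. This is only an illustration under stated assumptions: `cache_once`, `get_settings_unbounded`, `get_settings_once` and the `calls` counter are hypothetical names made up for this sketch, not part of Django or functools, and the hand-written decorator is deliberately minimal (no locking, no `cache_clear()`).

```python
import functools


def cache_once(func):
    """Hypothetical helper: remember the result of a zero-argument function.

    Not thread-safe: two threads racing on the first call may both run func.
    """
    sentinel = object()   # marks "not computed yet", so None is a valid result
    result = sentinel

    @functools.wraps(func)
    def wrapper():
        nonlocal result
        if result is sentinel:
            result = func()
        return result

    return wrapper


# Counts how often each function body really runs (illustration only).
calls = {"unbounded": 0, "once": 0}


@functools.lru_cache(maxsize=None)   # unbounded cache, no LRU bookkeeping
def get_settings_unbounded():
    calls["unbounded"] += 1
    return {"debug": False}


@cache_once
def get_settings_once():
    calls["once"] += 1
    return {"debug": False}


for _ in range(3):
    get_settings_unbounded()
    get_settings_once()
print(calls)
```

Both variants run the function body only once; which one is faster in practice would have to be measured (for example with `timeit`) on the Python version in question.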
From codewizard at gmail.com Mon Jul 31 20:00:22 2017
From: codewizard at gmail.com (codewizard at gmail.com)
Date: Mon, 31 Jul 2017 17:00:22 -0700 (PDT)
Subject: @lru_cache on functions with no arguments
In-Reply-To:
References:
Message-ID: <0cd629cc-f1d9-40b7-82a0-6079e9c64dac@googlegroups.com>

On Monday, July 31, 2017 at 7:31:52 PM UTC-4, t... at tomforb.es wrote:
> As part of the Python 3 cleanup in Django there are a fair few uses of @functools.lru_cache on functions that take no arguments. A lru_cache isn't strictly needed here, but it's convenient to just cache the result. Some examples are here: https://github.com/django/django/pull/8825/files
>
> I did some profiling and I found that using `@lru_cache(maxsize=None)` on such functions is twice as fast as a standard `@lru_cache()`, apparently because with a `maxsize` the lru_cache code requires a lock acquisition and a fair bit more state to track.
>
> Am I right in thinking that using `maxsize=None` is best for functions that accept no arguments? Should we even be using a `lru_cache` in such situations, or write our own simple cache decorator instead?

If the performance savings are real, another choice would be to improve the implementation of lru_cache to special-case no-argument functions to avoid locks, etc.

From tjreedy at udel.edu Mon Jul 31 21:32:10 2017
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 31 Jul 2017 21:32:10 -0400
Subject: @lru_cache on functions with no arguments
In-Reply-To:
References:
Message-ID:

On 7/31/2017 7:31 PM, tom at tomforb.es wrote:
> As part of the Python 3 cleanup in Django there are a fair few uses of @functools.lru_cache on functions that take no arguments.

This makes no sense to me. If the function is being called for side-effects, then it should not be cached. If the function is being called for a result, different for each call, calculated from a changing environment, then it should not be cached. (Input from disk is an example.)
If the function returns a random number, or a non-constant value from an oracle (such as a person), it should not be cached. If the function returns a constant (possibly calculated once), then the constant should just be bound to a name (which is a form of caching) rather than using the overkill of an lru cache. What possibility am I missing?

--
Terry Jan Reedy