A while ago there was a discussion of the value of APIs like str.swapcase. It was suggested that, even though the method was acknowledged to be useless, the effort of deprecating and removing it would be greater than the value gained by removing it.
Earlier this year I was at a PyPy sprint helping to work on Python 2.7 compatibility. The bytearray type has much of the string interface, including swapcase… So effort went into implementing this method with the correct semantics for PyPy. Doubtless the same has been true for IronPython, and will also be true for Jython.
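For reference, a minimal sketch of what the method does on both types. The variable names are only illustrative; the point is that an alternative implementation has to reproduce this behaviour for str and for bytearray alike.

```python
# swapcase flips the case of every cased character in a str.
text = "Hello, World"
swapped_text = text.swapcase()

# bytearray mirrors the method; only ASCII letters are affected.
data = bytearray(b"Hello, World")
swapped_data = data.swapcase()

print(swapped_text)   # hELLO, wORLD
print(swapped_data)   # bytearray(b'hELLO, wORLD')
```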
Whilst it is too late for Python 2.x, it *is* (in my opinion) worth removing unused and unneeded APIs. Even if the effort to remove them is more than any effort saved on the part of users, it helps other implementations down the road, which no longer need to provide these APIs.
All the best,
May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
In Haskell I ran into a situation where dynamically loaded modules
were failing with "invalid ELF header" errors. This was caused by the
module names actually referring to linker scripts rather than ELF
binaries. I patched the GHC runtime system to deal with these scripts.
I noticed that this same patch has been ported to Ruby and Node.js, so I
suggested to the libc developers that they might wish to incorporate the
patch into their library, making it available to all languages. They
rejected this suggestion, so I am making the suggestion to the Python
devs in case it is of interest to you.
Basically, when a linker script is loaded by dlopen, an "invalid ELF
header" error occurs. The patch checks to see if the file is a linker
script. If so, it finds the name of the real ELF binary with a regular
expression and tries to dlopen it. If successful, processing proceeds.
Otherwise, the original "invalid ELF header" error message is returned.
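The steps above could be sketched in Python roughly as follows. This is not a port of the GHC patch: the function names (script_path_from_error, real_library_from_script, load_library) are hypothetical, the two regular expressions are simplified stand-ins rather than the ones used in GHC, and the sketch assumes that ctypes surfaces the dlopen error text in the OSError message and that the linker script names the real library inside a GROUP ( ... ) or INPUT ( ... ) command.

```python
import ctypes
import re

# Assumption: the dlopen message looks like "<path>.so: invalid ELF header".
_ERROR_RE = re.compile(r"([^\s()]+\.so[^\s:()]*):\s*invalid ELF header")
# Assumption: the script lists the real library first in GROUP(...) or INPUT(...).
_SCRIPT_RE = re.compile(r"(?:GROUP|INPUT)\s*\(\s*([^\s()]+)")


def script_path_from_error(message):
    """Return the file named in an 'invalid ELF header' error, or None."""
    match = _ERROR_RE.search(message)
    return match.group(1) if match else None


def real_library_from_script(text):
    """Return the first library named in a linker script, or None."""
    match = _SCRIPT_RE.search(text)
    return match.group(1) if match else None


def load_library(name):
    """Load a shared library, following a linker script if necessary."""
    try:
        return ctypes.CDLL(name)
    except OSError as exc:
        original = exc  # keep the first error so it can be re-raised

    real = None
    path = script_path_from_error(str(original))
    if path is not None:
        try:
            with open(path, errors="replace") as script:
                real = real_library_from_script(script.read())
        except OSError:
            pass  # unreadable file: fall through to the original error

    if real is not None:
        try:
            return ctypes.CDLL(real)
        except OSError:
            pass  # retry failed: report the original error instead

    raise original
```

With this shape, load_library("libc.so") on a system where libc.so is a linker script would retry with the name found in the script, and any failure along the way reports the original error.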
If you want to add this code to Python, you can look at my original
patch (http://hackage.haskell.org/trac/ghc/ticket/2615) or the Ruby
version (https://github.com/ffi/ffi/pull/117) or the Node.js version
(https://github.com/rbranson/node-ffi/pull/5) to help port it.
Note that the version in GHC 7.2.1 has been enhanced to also handle
another possible error when the linker script is too short, so you might
want to add this enhancement as well (see
https://github.com/ghc/blob/master/rts/Linker.c line 1191 for the
revised regular expression):
"(([^ \t()])+\\.so([^ \t:()])*):([ \t])*(invalid ELF header|file too
At this point, I don't have the free time to write the Python patch
myself, so I apologize in advance for not providing it to you.
Howard B. Golden
Northridge, California, USA
A single buildbot instance on the OpenIndiana buildbot is eating
1.4GB of RAM and 3.8GB of swap, and growing.
The build eventually hangs or dies with an "out of memory" error.
This is 100% reproducible. Every time I force a build through the buildbot
control page, I see this: it takes a huge amount of memory and either dies
with an "out of memory" error or hangs.
I am allocating 4GB to the buildbots.
I think this is not normal. Am I the only one seeing such memory
usage? I haven't changed anything in my buildbots for months...
Jesus Cea Avion
jcea(a)jcea.es - http://www.jcea.es/
jabber / xmpp:firstname.lastname@example.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz
I have implemented an initial version of PEP 393 -- "Flexible String
Representation" as part of my Google Summer of Code project. My patch
is hosted as a repository on bitbucket and I created a related
issue on the bug tracker. I posted documentation for the current
state of the development in the wiki.
Current tests show a potential reduction of memory by about 20% and
CPU by 50% for a join micro benchmark. Starting a new interpreter
still causes 3244 calls to create compatibility Py_UNICODE
representations; 263 strings are created using the old API, while 62719
are created using the new API. More measurements are on the wiki page.
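As a rough illustration of where the memory saving comes from, and assuming an interpreter that uses the flexible representation (CPython 3.3 or later), the per-character storage cost can be estimated with sys.getsizeof. The helper name bytes_per_char is hypothetical, and this is not one of the measurements from the wiki.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate the storage cost per character of a string made of `ch`."""
    # Comparing two lengths cancels out the fixed object header.
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


ascii_cost = bytes_per_char("a")            # ASCII only
bmp_cost = bytes_per_char("\u0394")         # Greek capital delta, in the BMP
astral_cost = bytes_per_char("\U0001F600")  # outside the BMP

print(ascii_cost, bmp_cost, astral_cost)
```

Under the flexible representation the three costs differ, with pure ASCII text being the cheapest; with a fixed-width Py_UNICODE representation they would all be the same.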
If there is interest, I would like to continue working on the patch
with the goal of getting it into Python 3.3. Any and all feedback is
On Tue, Sep 6, 2011 at 10:01 AM, victor.stinner
> Fix also spelling of the null character.
While these cases are legitimately changed to 'null' (since they're
lowercase descriptions of the character), I figure it's worth
mentioning again that the ASCII name for '\0' actually *is* NUL (i.e.
only one 'L'). Strange, but true.
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
I have been investigating some 'tokenize' bugs recently. As a part of
that investigation I was trying to use '-m tokenize', which works
great in 2.x:
[meadori@motherbrain cpython]$ python2.7 -m tokenize test.py
1,0-1,5: NAME 'print'
1,6-1,21: STRING '"Hello, World!"'
1,21-1,22: NEWLINE '\n'
2,0-2,0: ENDMARKER ''
In 3.x, however, the functionality has been removed and replaced with
some hard-wired test code:
[meadori@motherbrain cpython]$ python3 -m tokenize test.py
TokenInfo(type=57 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line='')
TokenInfo(type=1 (NAME), string='def', start=(1, 0), end=(1, 3),
line='def parseline(self, line):')
TokenInfo(type=1 (NAME), string='parseline', start=(1, 4), end=(1,
13), line='def parseline(self, line):')
TokenInfo(type=53 (OP), string='(', start=(1, 13), end=(1, 14),
line='def parseline(self, line):')
Why is this? I found the commit where the functionality was removed,
but no explanation. Any objection to adding this feature back?
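In the meantime, something close to the 2.x output can be produced from the library API. The following is a minimal sketch for Python 3; the helper name dump_tokens is hypothetical and the line format only approximates what 2.x printed.

```python
import io
import tokenize


def dump_tokens(source):
    """Return one 2.x-style line per token found in `source` (bytes)."""
    lines = []
    for tok in tokenize.tokenize(io.BytesIO(source).readline):
        (srow, scol), (erow, ecol) = tok.start, tok.end
        lines.append("%d,%d-%d,%d:\t%s\t%r" % (
            srow, scol, erow, ecol,
            tokenize.tok_name[tok.type], tok.string))
    return lines


for line in dump_tokens(b'print("Hello, World!")\n'):
    print(line)
```

Unlike 2.x, the 3.x tokenizer reads bytes and emits an ENCODING token first, so the output starts with one extra line.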
On 06/09/2011 00:11, victor.stinner wrote:
> changeset: 72296:56ab3257ca13
> user: Victor Stinner <victor.stinner(a)haypocalc.com>
> date: Tue Sep 06 00:11:13 2011 +0200
> Issue #9561: packaging now writes egg-info files using UTF-8
> instead of the locale encoding
> def _distutils_pkg_info(self):
> tmp = self._distutils_setup_py_pkg()
> - self.write_file([tmp, 'PKG-INFO'], '')
> + self.write_file([tmp, 'PKG-INFO'], '', encoding='UTF-8')
This function is writing an empty string; isn’t that the same bytes in
UTF-8 as in the locale encoding? (Are there people who use encodings
with BOMs as their locale encoding? *shudders*)
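A quick way to check the question, as a sketch: encode the empty string with plain UTF-8 and with two codecs that are known to emit a BOM. Whether anyone actually uses such a codec as a locale encoding is a separate matter.

```python
import codecs

empty = ""

as_utf8 = empty.encode("utf-8")          # no BOM, so no bytes at all
as_utf8_sig = empty.encode("utf-8-sig")  # this codec always writes a BOM
as_utf16 = empty.encode("utf-16")        # BOM in the platform's byte order

print(as_utf8, as_utf8_sig, as_utf16)
```

So the output is identical for any BOM-less encoding, and differs only for codecs that prepend a BOM.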