What terrible things would happen if ob_size would be changed from int
The question recently came up on comp.lang.python, where the poster
noticed that you cannot mmap large files on a 64-bit system where int
is 32 bits; there is a 2 GiB limit on the length of objects on his
About the only problem I can see is that you could not store negative
numbers anymore. Is ssize_t universally available, or could it be used
on systems where it is available?
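To make the 2 GiB figure concrete, here is a rough sketch of the arithmetic. It assumes the length field is a signed 32-bit C int, as described above; the helper name fits_in_signed is made up for illustration, and the ctypes lookups only report what the platform running the script happens to use.

```python
import ctypes


def fits_in_signed(length, bits):
    """Return True if 'length' is representable as a signed integer of 'bits' bits."""
    return -(2 ** (bits - 1)) <= length <= 2 ** (bits - 1) - 1


# Largest length a signed 32-bit field can hold: one byte short of 2 GiB.
INT32_MAX = 2 ** 31 - 1

# Widths on the platform running this script (an observation, not a guarantee).
INT_BITS = 8 * ctypes.sizeof(ctypes.c_int)
SSIZE_BITS = 8 * ctypes.sizeof(ctypes.c_ssize_t)

if __name__ == "__main__":
    three_gib = 3 * 2 ** 30
    print("int bits:", INT_BITS, "ssize_t bits:", SSIZE_BITS)
    print("3 GiB fits in a signed 32-bit field:", fits_in_signed(three_gib, 32))
    print("3 GiB fits in ssize_t here:", fits_in_signed(three_gib, SSIZE_BITS))
```

A signed type such as ssize_t keeps negative values representable, which is the concern raised about switching to an unsigned type.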
Since the subject has come up several times recently,
and someone (Walter?) suggested a PEP be written... here goes.
Attached is a draft PEP. Comments?
Title: Backward Compatibility for Standard Library
Author: neal(a)metaslash.com (Neal Norwitz)
This PEP describes the packages and modules in the standard
library which should remain backward compatible with previous
versions of Python.
Authors have various reasons why packages and modules should
continue to work with previous versions of Python. In order to
maintain backward compatibility for these modules while moving the
rest of the standard library forward, it is necessary to know
which modules can be modified and which should use old and
possibly deprecated features.
Generally, authors should attempt to keep changes backward
compatible with the previous released version of Python in order
to make bug fixes easier to backport.
Backward Compatible Packages & Modules
Package/Module    Maintainer(s)       Python Version
--------------    -------------       --------------
distutils         Andrew Kuchling     1.5.2
email             Barry Warsaw        2.1
sre               Fredrik Lundh       1.5.2
xml (PyXML)       Martin v. Loewis    2.0
This document has been placed in the public domain.
Forgive me if this is slightly off-topic for this list, but since we've been
talking about migration guides and coding idioms and tweaking performance
and such, I've got a few questions I'd like to ask.
I'll start with an actual code sample. This is a very simple class that's
part of an xhtml toolkit I'm writing.
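class Comment:  # class name assumed; the header line is missing from this excerpt
    def __init__(self, content=''):
        self.content = content
    def __call__(self, content=''):
        o = self.__class__(content)
        return str(o)
    def __str__(self):
        return '<!-- %s -->' % self.content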
When I look at this, I see certain decisions I've made and I'm wondering if
I've made the best decisions. I'm wondering how to balance performance
against clarity and proper coding conventions.
1. In the __call__ I save a reference to the object. Instead, I could
Is there much of a performance impact by explicitly naming intermediate
references? (I need some of Tim Peters's performance testing scripts.)
2. I chose the slightly indirect str(o) instead of o.__str__(). Is this
slower? Is one style preferred over the other and why?
3. I used a format string, '<!-- %s -->' % self.content, where I could just
as easily have concatenated '<!-- ' + self.content + ' -->' instead. Is one
faster than the other?
4. Is there any documentation that covers these kinds of issues where there
is more than one way to do something? I'd like to have some foundation for
making these decisions. As you can probably guess, I usually hate having
more than one way to do anything. ;-)
Patrick K. O'Brien
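One way to get numbers for questions 2 and 3 is to measure them. The sketch below uses the standard timeit module; the class name Comment and the helper names via_format, via_concat and seconds_for are hypothetical stand-ins, and the results will vary by interpreter version and machine, so treat it as a starting point, not a verdict.

```python
import timeit


class Comment:
    """Stand-in for the class in the post; the name is hypothetical."""

    def __init__(self, content=''):
        self.content = content

    def __str__(self):
        return '<!-- %s -->' % self.content


def via_format(content):
    # Question 3, first variant: format string.
    return '<!-- %s -->' % content


def via_concat(content):
    # Question 3, second variant: concatenation.
    return '<!-- ' + content + ' -->'


def seconds_for(func, number=100000):
    """Total seconds taken to call the zero-argument 'func' 'number' times."""
    return timeit.timeit(func, number=number)


if __name__ == "__main__":
    o = Comment('hello')
    candidates = [
        ('format string', lambda: via_format('hello')),
        ('concatenation', lambda: via_concat('hello')),
        ('str(o)', lambda: str(o)),
        ('o.__str__()', lambda: o.__str__()),
    ]
    for label, func in candidates:
        print('%-15s %.4f s' % (label, seconds_for(func)))
```

Each candidate is wrapped in a lambda so that all four pay the same call overhead; only the differences between the timings are meaningful.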
Hidden away in distutils.fancy_getopt is an exceedingly handy function
called wrap_text(). It does just what you might expect from the name:
def wrap_text (text, width):
    """wrap_text(text : string, width : int) -> [string]

    Split 'text' into multiple lines of no more than 'width' characters
    each, and return the list of strings that results.
    """
Surprise surprise, Optik uses this. I've never been terribly happy
about importing it from distutils.fancy_getopt, and putting Optik into
the standard library as OptionParser is a great opportunity for putting
wrap_text somewhere more sensible.
I happen to think that wrap_text() is useful for more than just
auto-formatting --help messages, so hiding it away in OptionParser.py
doesn't seem right. Also, Perl has a Text::Wrap module that's been part
of the standard library for not-quite-forever -- so shouldn't Python have
Proposal: a new standard library module, wrap_text, which combines the
best of distutils.fancy_getopt.wrap_text() and Text::Wrap. Right now,
I'm thinking of an interface something like this:
wrap(text : string, width : int) -> [string]
    Split 'text' into multiple lines of no more than 'width' characters
    each, and return the list of strings that results. Tabs in 'text'
    are expanded with string.expandtabs(), and all other whitespace
    characters (including newline) are converted to space.

[This is identical to distutils.fancy_getopt.wrap_text(), but the
docstring is more complete.]

wrap_nomunge(text : string, width : int) -> [string]
    Same as wrap(), without munging whitespace.

[Not sure if this is really useful to expose publicly. Opinions?]

fill(text : string,
     width : int,
     initial_tab : string = "",
     subsequent_tab : string = "")
    Reformat the paragraph in 'text' to fit in lines of no more than
    'width' columns. The first line is prefixed with 'initial_tab',
    and subsequent lines are prefixed with 'subsequent_tab'; the
    lengths of the tab strings are accounted for when wrapping lines
    to fit in 'width' columns.

[This is just a glorified "\n".join(wrap(...)); the idea to add initial_tab
and subsequent_tab was stolen from Perl's Text::Wrap.]
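As a rough illustration of the proposed interface, the sketch below builds wrap() and fill() on top of the textwrap module found in present-day Python. That mapping is an assumption made for illustration: textwrap names its prefix parameters initial_indent and subsequent_indent, so the proposal's initial_tab and subsequent_tab are simply forwarded to them.

```python
import textwrap


def wrap(text, width):
    """Split 'text' into lines of at most 'width' characters.

    Tabs are expanded and other whitespace is converted to spaces,
    matching the munging described in the proposal.
    """
    return textwrap.wrap(text, width=width,
                         expand_tabs=True, replace_whitespace=True)


def fill(text, width, initial_tab="", subsequent_tab=""):
    """Reformat 'text' into one string whose lines fit in 'width' columns.

    The prefix lengths count toward 'width', as the proposal requires.
    """
    return textwrap.fill(text, width=width,
                         initial_indent=initial_tab,
                         subsequent_indent=subsequent_tab)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    for line in wrap(sample, 15):
        print(repr(line))
    print(fill(sample, 20, initial_tab="* ", subsequent_tab="  "))
```

Keeping fill() as a thin layer over the same wrapping logic mirrors the remark that it is little more than "\n".join(wrap(...)) plus the two prefixes.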
I'll go whip up some code and submit a patch to SF. If people like it,
I'll even write some tests and documentation too.
Greg Ward - Unix nerd gward(a)python.net
Support bacteria -- it's the only culture some people have!
Zooko <zooko(a)zooko.com> writes:
> Since even python-dev'ers find ancient copies of trace.py on ftp
> sites before they find the one in the Tools/scripts/ directory, and
> since the debian package of Python 2.2 comes without a
> Tools/scripts/ directory at all, I conclude that the Tools/scripts/
> directory isn't doing its job very well. I suggest it either be
> killed or fixed. I'm not sure how to do the latter -- link to it
> from the doc pages?
Kill it and I'll be after you with an axe. Admittedly there's a lot
of cruft that should be removed, rewritten or reorganised, but there's
some mighty useful stuff there too (eg. logmerge.py). I don't see why
I should suffer just because other people don't know where to look...
Darned confusing, unless you have that magic ingredient coffee, of
which I can pay you Tuesday for a couple pounds of extra-special
grind today. -- John Mitchell, 11 Jan 1999
Neal Norwitz <neal(a)metaslash.com> writes:
> For additions to the stdlib, should we try to make sure new features
> are used? In the above code, type(longopts) ... ->
> isinstance(longopts, str) (or basestring?) and all_options_first
> could be a bool.
Done. It really should be isinstance(longopts, str), since Unicode in
command line options is not yet supported (although they should be, since,
on Windows, command line options are "natively" Unicode).
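The difference between the two checks shows up with subclasses. In the sketch below, as_option_list and OptionName are hypothetical names invented for illustration; the code is not taken from the patch under discussion.

```python
def as_option_list(longopts):
    """Hypothetical helper: accept one option name or a sequence of names."""
    if isinstance(longopts, str):
        return [longopts]
    return list(longopts)


class OptionName(str):
    """A str subclass, used to show where the two checks disagree."""


name = OptionName("help")

# The older idiom compares exact types, so a subclass instance is rejected.
exact_type_match = type(name) == type("")

# isinstance() accepts instances of subclasses as well.
isinstance_match = isinstance(name, str)

if __name__ == "__main__":
    print("type() comparison:", exact_type_match)
    print("isinstance():", isinstance_match)
    print(as_option_list(name), as_option_list(("help", "version")))
```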
How would we go about adding a canned response to the commonly submitted
"max recursion limit exceeded" bug report? I think Tim's discussion of re
design patterns to use in
(or something like it) probably belongs in the re module docs since this is
such a common stumbling block for people used to using ".*?". I'll work
something up for the Examples section and Jake's hockey game this morning.
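As a possible starting point for such an example: one commonly suggested rewrite replaces a non-greedy ".*?" with a negated character class. Whether this is among the patterns in Tim's discussion is an assumption; the sample text and pattern names below are made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Non-greedy form that people tend to reach for first.
lazy = re.compile(r"<.*?>")

# Negated character class: match anything except the closing delimiter.
negated = re.compile(r"<[^>]*>")

lazy_tags = lazy.findall(text)
negated_tags = negated.findall(text)

if __name__ == "__main__":
    print(lazy_tags)
    print(negated_tags)
```

On this input both patterns find the same tags; the negated class states the stopping condition directly, so the engine does not have to try successively longer matches.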
The socket timeout patch is finally available with doc
updates n' unit test as patch #555085 in the SF tracker. This
implementation provides timeout functionality at the C
level. A patch to socket.py also fixes the problem of losing
data while an exception is thrown to the underlying socket.
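For readers who want to see what a socket timeout looks like from Python code, here is a minimal sketch. It assumes the settimeout() method and socket.timeout exception as they exist in present-day Python, not necessarily the exact interface of the patch, and the helper name recv_with_timeout is hypothetical. A local socket pair stands in for a real network connection.

```python
import socket


def recv_with_timeout(sock, nbytes, seconds):
    """Return up to 'nbytes' from 'sock', or None if nothing arrives in time."""
    sock.settimeout(seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    left, right = socket.socketpair()
    try:
        # Nothing has been sent yet, so this call times out.
        print(recv_with_timeout(left, 16, 0.1))
        right.sendall(b"ping")
        # Data is now waiting, so this call returns it.
        print(recv_with_timeout(left, 16, 0.5))
    finally:
        left.close()
        right.close()
```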