Hello, my name is Greg.
I've just started using python after many years of C programming, and
I'm also new to the list. I wanted to clarify this
first, so that maybe I will get a little less beating for my stupidity :)
I use python3 and Linux on arch x86 (production will be on x86_64,
though this shouldn't matter much).
The application that I'm presently working on is a network server. It
would use separate processes to accept the
connections, and to do the work (much like how apache prefork does). One
process accept()s on the original socket and
the received socket (client socket) will be read for a request. After
the request is received and parsed this process (the
controller) will choose one of its children that is most capable of
handling the said request. It would then pass the
file descriptor through a socketpair to the appropriate child and go
handle the next client. All works fine and smooth,
but I realized that I need sendmsg()/recvmsg() to pass the FD. Since
these are not implemented in the python socket
module, and Linux has no other way to do this, I'm stuck. Fell flat on
my face, too :)
Browsing the net I've found a patch to the python core
(http://bugs.python.org/issue1194378), dated 2005.
First of all, I would like to ask you guys, whether you know of any way
of doing this FD passing magic, or that you know
of any 3rd party module / patch / anything that can do this for me.
Since I'm fairly familiar with C and (not that much, but I feel the
power) python, I would take the challenge of writing
it, given that the above code is still somewhat usable. If all else
fails I would like to have your help to guide me through
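
For readers on a later Python: socket.sendmsg() and socket.recvmsg() were added to the socket module in Python 3.3, which was not the case when this was written. A minimal sketch of the FD passing described above, assuming such a Python and a Unix-domain socketpair. The names send_fd, recv_fd and demo are made up for illustration, and the pipe only stands in for an accepted client socket:

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor over a Unix-domain socket via SCM_RIGHTS."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data has to travel with the ancillary data.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds.tobytes())])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo():
    """Pass the write end of a pipe across a socketpair and use the copy."""
    controller, worker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    send_fd(controller, write_end)
    received = recv_fd(worker)
    os.write(received, b"hello")
    data = os.read(read_end, 5)
    for fd in (read_end, write_end, received):
        os.close(fd)
    controller.close()
    worker.close()
    return data
```

In the prefork design above, the controller would call send_fd() on its end of the socketpair with the client socket's fileno(), and the chosen child would call recv_fd() on the other end.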
I have written replacements for a few of Mark Hammond's PyWin32 functions
using ctypes to call upon the Windows kernel API. Code can be found here;
http://pastebin.com/m46de01f . When calling ReadFile, it appears that the
kernel API converts '\n' newlines to '\r\n' newlines. I have not been able to
find any information about this but as far as I can tell, there is nothing
in my code that is causing the problem. Initially I suspected it related to
newline translation but the function in subprocess.Popen for translating
newlines only converts to '\n' newlines. When I replaced my ReadFile and
WriteFile functions with the PyWin32 functions I was imitating, the problem
still arose. I was hoping someone could confirm this problem for me from
experience or by testing out my code. If you would like to see the
functions used in full context, I have a Mercurial Google Code project that
can be checked out at
http://code.google.com/p/subprocdev/source/list (branch "python3k"). My
work entails modifications to subprocess.py so when
running the code, please update your PYTHONPATH variable so that my
subprocess.py will be imported.
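
For comparison only (this does not diagnose the ReadFile behaviour): a small sketch of the two translations in play, namely a text layer that writes '\n' as '\r\n', and a normalisation step that converts everything back to '\n'. The helper names are made up, and io.TextIOWrapper merely stands in for whatever layer performs the translation on Windows:

```python
import io


def written_bytes(text, newline):
    """Return the bytes a text layer emits for `text` with the given newline mode."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    result = raw.getvalue()
    wrapper.detach()  # keep the wrapper from closing `raw` behind our back
    return result


def normalize_newlines(data):
    """Convert '\\r\\n' and lone '\\r' to '\\n' (one-way, like the translation described)."""
    return data.replace("\r\n", "\n").replace("\r", "\n")
```

With newline="\r\n" the writer expands every '\n', so a reader that only normalises towards '\n' can never be the source of extra '\r' characters.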
Michael Foord wrote:
>> I agree. People with versioning issues *should* be using virtualenv.
Tarek Ziadé replied (and several people later agreed):
> Let's remove site-packages from Python then.
What about those people *without* versioning issues?
I have no qualms suggesting that package management programs avoid the
use of site-packages. Such programs do need to cater to edge cases.
But a single drop-it-in directory works great for the vast majority of
*my* needs; requiring me to drink the Kool-Aid from a specific package
management system just to get access to any add-ons -- even those I
wrote myself in pure python -- would be a huge step backwards.
I've been in correspondence with Microsoft about the provision of
software, and it transpires that if you want to support Windows better
Microsoft will be quite liberal about licensing: they will *give* you a
Microsoft Developer Network license.
If you are interested in offering better Windows support then please
read the email below (note: Windows buildbot support would be a
qualifying activity) and let me have the required details. I will pass
them to Tom in bulk to simplify the processing.
Note that I'm not following python-dev right now due to pressure of
work, so PLEASE EMAIL ME DIRECTLY (or Cc me on your list replies) to
make sure I get your information.
-------- Original Message --------
Subject: RE: Support for Python: Windows Buildbots
Date: Tue, 7 Jul 2009 08:52:10 -0700
From: Tom Hanrahan
For the purposes of providing MSDN licenses to an open source
development community, I consider anyone who writes, builds, tests or
documents software to be a "developer who contributes" to the project.
(In fact, having started out as a test engineer, I would take exception
to anyone who claimed only people who write code are "developers" :-)
We do ask that requests are for people who are active contributors and
not just minor/occasional participants. [...]
Here's what we need for each request:
Email Address (the subscription will be sent here, and this will also be
used to log into the MSDN site)
Project/Company (Python Software Foundation)
Complete Mailing Address
(Postal or Zip Code)
From: Steve Holden [mailto:firstname.lastname@example.org]
Sent: Tuesday, July 07, 2009 6:01 AM
To: Tom Hanrahan
Cc: Anandeep Pannu; Pat Campbell; Python Board; Jim Hugunin
Subject: Re: Support for Python: Windows Buildbots
Further to Sam's email, in fact the original inquiry was instituted by
the need of our part-time administrator to acquire an Office license. I
am guessing she wouldn't qualify as an Open Source Developer, but that
leads naturally to the more interesting question of who would.
A Bing search for "Microsoft Open Source Developer Program" didn't yield
any usable hits, so it might be helpful if you could point me to some
web resources that will help me make sense of what's available, who's
eligible and how they apply for it. I will be happy to publicize the
details to the development team.
It's true, I believe, that most of the core Python developers use Linux,
but both Tim Peters and I are primarily on the Windows platform. What's
more, with the emergence of virtualization environments having Linux on
your desktop is no hindrance to running Windows in a virtual machine (I
run Linux on virtuals when appropriate).
So let's take it from here and see where we go.
Sam Ramji wrote:
> If the PSF's Windows users are developers who contribute to Python, we can offer them MSDN subscriptions as part of the Open Source Developer program.
> MSDN subscriptions include copies of most Microsoft products (including Office and Exchange) for use while developing and testing software. For more details, check here - we provide Visual Studio Pro with MSDN Premium under this program (http://msdn.microsoft.com/en-us/subscriptions/subscriptionschart.aspx).
> Tom Hanrahan is the Director of the Open Source Technology Center, and is the sponsor of the Open Source Developer program. I've copied him here - please contact him directly with the details of the people who would like to participate. He's at hanrahat(a)microsoft.com.
> We definitely want to make Windows a friendly place for Python developers!
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Watch PyCon on video now! http://pycon.blip.tv/
I'd like to bring up the general idea of using a PSF slot in GSoC2009
to improve the Python development infrastructure. I also happen to
have two concrete proposals for that (such a coincidence!). But I
assure you the general idea is more important than my proposals :)
Solving issues that get in core devs' way when they work on Python
development could be a nice PSF GSoC project.
I believe there are enough code related tasks that would greatly
improve Python development workflow but lack manpower to complete.
E.g., ISTM that a student working on svnmerge in past SoC editions
would've been able to mitigate some painful shortcomings. If the core
developers could come up with a list of peeves that would be solvable
in code (in existing tools), that would allow for a very useful GSoC
These might fit the description above, and they're both tracker
related (yep, one-trick-pony here). The upside is that at least one
potential mentor is available for each, and I'd be able to offer
support to the student.
First, the PSF could invest a slot on integrating different tools of
the development workflow. Variations of 'file issue at bug tracker,
submit code for review' or 'branch locally to virtualenv, upload
failing testcase to tracker, upload patch to tracker' command line
utils. If these could be developed as general tools (i.e., not deeply
tied to Python dev infrastructure), Ali Afshar from PIDA is willing to
mentor it. I'd be available to help with Roundup and Rietveld
(integration in progress), but would like to see it work with
Launchpad, Bugzilla, Google Code and Review Board :)
The other proposal is to use a slot in Roundup proper and the Python
tracker instance. This could have a core Roundup developer as mentor,
under the condition it's focused on Roundup itself. IMO, there are some
missing/broken core features in Roundup that would have a positive
impact on our tracker, mostly those relating to searches, security and
UI. I'd have a lot to contribute to the Python tracker part, based on
Even if neither is considered worthy, I'll keep them on my to-do list
and hope to slowly and hackishly work towards both proposals' goals.
Barring feedback saying that they're out of scope, stupid and
downright offensive, I'll post details on each to this thread very
So, if the PSF was to use a slot on improving the way you work on
Python development, what would you like to see fixed or implemented?
--- On Wed, 7/22/09, Oleg Broytmann <phd(a)phd.pp.ru> wrote:
> From: Oleg Broytmann <phd(a)phd.pp.ru>
> Subject: Re: [Python-Dev] expy: an expressway to extend Python
> To: python-dev(a)python.org
> Date: Wednesday, July 22, 2009, 12:45 AM
> On Tue, Jul 21, 2009 at 04:26:52PM
> -0400, Eric Entin wrote:
> > I think the point of his software is to make it easier
> to interface Python
> > with C code
> I think I understand that. And I think
> > > > @function(double) #return type: double
> > > > def sqrt(x=double): #argument x: double
> is how C functions are declared in
> Python, so I think annotations is the
> way to go for such declarations.
> > > Python 3.0 has arguments and return
> value annotations:
> > >
> > > http://docs.python.org/3.0/whatsnew/3.0.html#new-syntax
> > > http://www.python.org/dev/peps/pep-3107/
Thanks, I think that is a brilliant suggestion,
when expy is implemented for Python 3.0, this
will greatly improve readability, and make it
more like a natural part of Python.
Here is a brief example on how to use expy to implement the math module:
(for more details, see http://expy.sf.net/)
"""Python math module by expy-cxpy."""
from expy import *
#includes, defines, etc.
@function(double) #return type: double
def sqrt(x=double): #argument x: double
    """sqrt(x) --> the square root of x."""
    return "sqrt(x)" #as an expression
#a more terse way
"""sin(x) --> the sin of x."""
pass #the deduced call: sin(x)
#more functions ...
expymodule(__name__) #end of module
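
A sketch of how the sqrt declaration above might read with Python 3 annotations (PEP 3107), as suggested in the quoted reply. The double marker class and the c_signature helper are stand-ins invented for illustration; they are not expy's API:

```python
class double:
    """Hypothetical marker for the C type 'double' (not expy's own class)."""
    c_name = "double"


def sqrt(x: double) -> double:
    """sqrt(x) --> the square root of x."""
    return "sqrt(x)"  # the C expression, as in the example above


def c_signature(func):
    """Build a C-style prototype string from a function's annotations."""
    notes = dict(func.__annotations__)
    return_type = notes.pop("return")
    args = ", ".join("%s %s" % (typ.c_name, name) for name, typ in notes.items())
    return "%s %s(%s)" % (return_type.c_name, func.__name__, args)
```

Here c_signature(sqrt) gives "double sqrt(double x)", so the type information lives in the signature instead of in default values and a decorator argument.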
In response to some rumblings on python-committers and just to request
more feedback, a progress report. I know it's long; I've tried to
keep it concise and chunked, though.
- First of all, I've got the basic conversion down, I've done it a few
times now, with progressively better results. You can view some
results at http://hg.python.org/, which has a preliminary cpython
repository. *** The changeset hashes for that repo will change, so you
won't be able to commit or pull from it in the future.***
- Second of all, some planning. I've thought about it a bit, and I
think we should aim for going live with hg on August 1. Given that I'm
on vacation from 8-18 July (and I'm not sure whether I'll be able to
actually work on it during that time, though I imagine I'll be able to
spend some time on it at least), that's quite ambitious, so I'm going
to say it's okay if it slips by a few days. Putting a deadline out
there is a good thing, anyway.
- Third of all, to make this possible, it would be helpful if I got
more feedback on the PEP. Last time I raised it, there was virtually
nothing. This time, I'll include it inline so there's hopefully less
of a barrier to reviewing it.
- Fourth, Mercurial 1.3 was just released! Bet you didn't see that
coming. It's looking like a pretty good release, with an experimental
version of the much-coveted subrepository support (like
svn:externals). This also means that the latest version of
hgsubversion, the tool I used for the conversion, will be more
accessible for converting other projects. You know you want to!
- Fifth, here's a list of things, off the top of my head, that still need doing:
* Get agreement on branch strategy and branch processing (list of
branches + proposed handling at
http://hg.python.org/pymigr/file/tip/all-branches.txt) <--- PLEASE
* Get agreement on tag processing (first come up with a list)
* Set up hg-ssh infra (should be easy)
* Set up hooks (should be mostly straightforward)
* Set up roundup integration (should be made easier by quick revision
map hgweb extension)
* Write docs
- Sixth (this is the good part), less obvious things that have been
done or don't need doing:
* .hgignore generation (I've been convinced it's too hard, the current
version will do)
* revlog reordering (it's painless and a big win)
I'll get through all of these myself, but obviously any help would be
welcome. For any hg users, writing docs should be an easy start. For
others, please review the PEP (below), the branch map in
http://hg.python.org/pymigr/file/tip/all-branches.txt and the author
map at http://hg.python.org/pymigr/file/tip/author-map (not that much
has changed since the start, so if you've looked at it already, feel
free to skip this part). Right now I'm a little stuck on branch
processing, because it's a long running script that needs a bunch of
debugging, but I'll get going on that again.
I think that's all I can think of for now, I'll update the PEP with
new bits soon. Here it is, ready for your review:
After having decided to switch to the Mercurial DVCS, the actual
migration still has to be performed. In the case of an important piece
of infrastructure like the version control system for a large,
distributed project like Python, this is a significant effort. This
PEP is an attempt to describe the steps that must be taken for further
discussion. It's somewhat similar to PEP 347, which discussed the
migration to SVN.
To make the most of hg, I (Dirkjan) would like to make a high-fidelity
conversion, such that (a) as much of the svn metadata as possible is
retained, and (b) all metadata is converted to formats that are common
in Mercurial. This way, tools written for Mercurial can be optimally
used. In order to do this, I want to use the hgsubversion software
to do an initial conversion. This hg extension is focused on providing
high-quality conversion from Subversion to Mercurial for use in
two-way correspondence, meaning it doesn't throw away as much
available metadata as other solutions.
Such a conversion also seems like a good time to reconsider the
contents of the repository and determine if some things are still
valuable. In this spirit, the following sections also propose
discarding some of the older metadata.
TBD; needs fully working hgsubversion and consensus on this document.
Mercurial has two basic ways of using branches: cloned branches, where
each branch is kept in a separate repository, and named branches,
where each revision keeps metadata to note on which branch it belongs.
The former makes it easier to distinguish branches, at the expense of
requiring more disk space on the client. The latter makes it a little
easier to switch between branches, but often has somewhat unintuitive
results for people (though this has been getting better in recent
versions of Mercurial).
I'm still a bit on the fence about whether Python should adopt cloned
branches or named branches. Since it usually makes more sense to tag
releases on the maintenance branch, for example, mainline history
would not contain release tags if we used cloned branches. Also,
Mercurial 1.2 and 1.3 have the necessary tools to make named branches
less painful (because they can be properly closed and closed heads are
no longer considered in relevant cases).
A disadvantage might be that the used clones will be a good bit larger
(since they essentially contain all other branches as well). This can
be mitigated by keeping non-release (feature) branches in separate
clones. Also note that it's still possible to clone a single named
branch from a combined clone, by specifying the branch as in hg clone
http://hg.python.org/main/#2.6-maint. Keeping the py3k history in a
separate clone probably also makes sense.
XXX To do: size comparison for selected separation scenarios.
There are quite a lot of branches in SVN's branches directory. I
propose to clean this up a bit, by employing the following the
* Keep all release (maintenance) branches
* Discard branches that haven't been touched in 18 months, unless
someone indicates there's still interest in such a branch
* Keep branches that have been touched in the last 18 months,
unless someone indicates the branch can be deprecated
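
The three rules above can be sketched as a small decision function. This is only an illustration: the function and parameter names are made up, and counting whole calendar months is an assumption about how "18 months" would be measured:

```python
from datetime import date

STALE_MONTHS = 18  # threshold taken from the rules above


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (days are ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def branch_action(is_release, last_touched, today,
                  still_wanted=False, deprecated=False):
    """Return 'keep' or 'discard' for one SVN branch."""
    if is_release:
        return "keep"  # release (maintenance) branches are always kept
    if months_between(last_touched, today) >= STALE_MONTHS:
        # Untouched for 18 months: discard unless someone speaks up.
        return "keep" if still_wanted else "discard"
    # Recently touched: keep unless someone says it can go.
    return "discard" if deprecated else "keep"
```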
The SVN tags directory contains a lot of old stuff. Some of these are
not, in fact, full tags, but contain only a smaller subset of the
repository. I think we should keep all release tags, and consider
other tags for inclusion based on requests from the developer
community. I'd like to consider unifying the release tag naming scheme
to make some things more consistent, if people feel that won't create
too many problems. For example, Mercurial itself just uses '1.2.1' as
a tag, where CPython would currently use r121.
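
A sketch of what such a renaming could look like, assuming that major and minor versions are single digits and that pre-release suffixes look like a1, b2, c1 or rc1. Both the pattern and the dotted_tag name are assumptions for illustration, not a worked-out proposal:

```python
import re

# r<major><minor>[<micro>][<pre-release suffix>], e.g. r262 or r30rc1
_SVN_TAG = re.compile(r"r(\d)(\d)(\d*)((?:a|b|c|rc)\d+)?")


def dotted_tag(svn_tag):
    """Translate an SVN-style release tag such as 'r262' into '2.6.2'."""
    match = _SVN_TAG.fullmatch(svn_tag)
    if match is None:
        raise ValueError("not a recognised release tag: %r" % (svn_tag,))
    major, minor, micro, suffix = match.groups()
    version = "%s.%s" % (major, minor)
    if micro:
        version += "." + micro
    return version + (suffix or "")
```

Tags that do not fit the pattern raise ValueError, so they would have to be handled by hand.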
In order to provide user names the way they are common in hg (in the
'First Last <user(a)example.org>' format), we need an author map to map
cvs and svn user names to real names and their email addresses. I have
a complete version of such a map in my migration tools repository.
The email addresses in it might be out of date; that's bound to
happen, although it would be nice to try and have as many people as
possible review it for addresses that are out of date. The current
version also still seems to contain some encoding problems.
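
To illustrate what the author map does, here is a sketch that assumes one "username = First Last <address>" entry per line, with "#" starting a comment. That layout is an assumption, and the helper names and the sample entry in the usage note are made up:

```python
def parse_author_map(text):
    """Parse 'svnuser = Real Name <email>' lines into a dict."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line or "=" not in line:
            continue
        user, author = line.split("=", 1)
        mapping[user.strip()] = author.strip()
    return mapping


def map_author(user, mapping):
    """Return the hg-style author for an svn/cvs user name, or the name unchanged."""
    return mapping.get(user, user)
```

For example, parse_author_map("jdoe = Jane Doe <jane@example.org>") maps "jdoe" to the full hg-style name; unknown users fall through unchanged, which makes missing entries easy to spot in the converted history.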
The .hgignore file can be used in Mercurial repositories to help
ignore files that are not eligible for version control. It does this
by employing several possible forms of pattern matching. The current
Python repository already includes a rudimentary .hgignore file to
help with using the hg mirrors.
It might be useful to have the .hgignore be generated automatically
from svn:ignore properties. This would make sure all historic
revisions also have useful ignore information (though one could argue
ignoring isn't really relevant to just checking out an old revision).
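
A rough sketch of such a generator, assuming the svn:ignore properties have already been collected into a dict that maps each directory to its list of patterns. svn:ignore applies to a single directory, so each pattern is emitted as a rooted regular expression. Only "*" and "?" are translated; character classes are not handled. All names are illustrative:

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob ('*' and '?' only) into a one-path-segment regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def hgignore_lines(svn_ignore):
    """Build .hgignore lines from {directory: [svn:ignore patterns]}."""
    lines = ["syntax: regexp"]
    for directory in sorted(svn_ignore):
        stripped = directory.strip("/")
        prefix = re.escape(stripped + "/") if stripped else ""
        for pattern in svn_ignore[directory]:
            # Match the entry itself, or anything below it if it is a directory.
            lines.append("^%s%s(?:/|$)" % (prefix, glob_to_regex(pattern)))
    return lines
```

Running this once per converted revision would be the expensive part; the translation itself is cheap.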
As an optional optimization technique, we should consider trying a
reordering pass on the revlogs (internal Mercurial files) resulting
from the conversion. In some cases this results in dramatic decreases
in on-disk repository size.
Richard Tew has indicated that he'd like the Stackless repository to
also be converted. What other projects in the svn.python.org
repository should be converted? Do we want to convert the peps
repository? distutils? others?
Developers should access the repositories through ssh, similar to the
current setup. Public keys can be used to grant people access to a
shared hg@ account. A hgwebdir instance should also be set up for easy
browsing and read-only access. If we're using ssh, developers should
trivially be able to start new clones (for longer-term features that
profit from a separate branch).
A number of hooks are currently in use. The hg equivalents for these
should be developed and deployed. The following hooks are being used:
* check whitespace: a hook to reject commits in case the
whitespace doesn't match the rules for the Python codebase. Should be
straightforward to re-implement from the current version. We can also
offer a whitespace hook for use with client-side repositories that
people can use; it could either warn about whitespace issues and/or
truncate trailing whitespace from changed lines. Open issue: do we
check only the tip after each push, or do we check every commit in a
* commit mails: we can leverage the notify extension for this
* buildbots: both the regular and the community build masters must
be notified. Fortunately buildbot includes support for hg. I've also
implemented this for Mercurial itself, so I don't expect problems
* check contributors: in the current setup, all changesets bear
the username of committers, who must have signed the contributor
agreement. In a DVCS, the committers are not necessarily the same
people who push, and so we can't check if the committer is a
contributor. We could use a hook to check if the committer is a
contributor if we keep a list of registered contributors.
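
The checks behind two of these hooks can be sketched without touching the Mercurial API. The rules used here (no trailing whitespace, no tabs in indentation) are an assumption about "the rules for the Python codebase", and all function names are made up:

```python
def whitespace_problems(lines):
    """Return (line number, problem) pairs for the given source lines."""
    problems = []
    for number, line in enumerate(lines, 1):
        body = line.rstrip("\r\n")
        if body != body.rstrip():
            problems.append((number, "trailing whitespace"))
        indent = body[:len(body) - len(body.lstrip())]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
    return problems


def strip_trailing_whitespace(lines):
    """Client-side variant: fix the lines instead of rejecting them."""
    return [line.rstrip() + "\n" for line in lines]


def unregistered_committers(committers, registered):
    """Committer names that are not on the list of registered contributors."""
    return sorted(set(committers) - set(registered))
```

A server-side hook would reject the push when whitespace_problems() or unregistered_committers() returns anything; whether to run them on every changeset or only on the tip is the open issue mentioned above.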
A more or less stock hgwebdir installation should be set up. We might
want to come up with a style to match the Python website. It may also
be useful to build a quick extension to augment the URL rev parser so
that it can also take r[0-9]+ args and come up with the matching hg
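
The lookup such an extension would do might look like this, assuming a revision map from svn revision numbers to hg changeset hashes is available as a dict. The function name is made up, and so is the pairing of revision and hash in the usage note:

```python
import re

_SVN_REV = re.compile(r"r([0-9]+)")


def resolve_rev(arg, svn_to_hg):
    """Map 'r12345' to an hg changeset id; pass anything else through unchanged.

    Returns None when the svn revision is not in the map.
    """
    match = _SVN_REV.fullmatch(arg)
    if match is None:
        return arg  # already an hg revision spec (hash, tag, 'tip', ...)
    return svn_to_hg.get(int(match.group(1)))
```

With a map such as {71600: "dd3ebf81af43"}, resolve_rev("r71600", ...) yields the hash, while "tip" or a hash passes straight through to the normal parser.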
Where to get code
It needs to be decided where the hg repositories will live. I'd like
to propose to keep the hgwebdir instance at hg.python.org. This is an
accepted standard for many organizations, and an easy parallel to
svn.python.org. The 2.7 (trunk) repo might live at
http://hg.python.org/main/, for example, with py3k at
http://hg.python.org/py3k/. For write access, developers will have to
use ssh, which could be ssh://email@example.com/main/. A demo
installation will be set up with a preliminary conversion so people
can experiment and review; it can live at
code.python.org was also proposed as the hostname. Personally, I think
that using the VCS name in the hostname is good because it prevents
confusion: it should be clear that you can't use svn or bzr for
hgwebdir can already provide tarballs for every changeset. I think
this obviates the need for daily snapshots; we can just point users to
tip.tar.gz instead, meaning they will get the latest. If desired, we
could even use buildbot results to point to the last good changeset.
hg comes with good built-in documentation (available through hg help)
and a wiki that's full of useful information and recipes. In
addition to that, the parts of the developer FAQ concerning
version control will gain a section on using hg for Python
development. Some of the text will be dependent on the outcome of
debate about this PEP (for example, the branching strategy).
Think first, commit later?
In recent history, old versions of Python have been maintained by a
select group of people backporting patches from trunk to release
branches. While this may not scale so well as the development pace
grows, it also runs into some problems with the current crop of
distributed versioning tools. These tools (I believe similar problems
would exist for either git, bzr, or hg, though some may cope better
than others) are based on the idea of a Directed Acyclic Graph (or
DAG), meaning they keep track of relations of changesets.
Mercurial itself has a stable branch which is a ''strict'' subset of
the unstable branch. This means that generally all fixes for the
stable branch get committed against the tip of the stable branch, then
they get merged into the unstable branch (which already contains the
parent of the new cset). This provides a largely frictionless
environment for moving changes from stable to unstable branches.
Mistakes, where a change that should go on stable goes on unstable
first, do happen, but they're usually easy to fix. That can be done by
copying the change over to the stable branch, then trivial-merging
with unstable -- meaning the merge in fact ignores the parent from the
This strategy means a little more work for regular committers, because
they have to think about whether their change should go on stable or
unstable; they may even have to ask someone else (the RM) before
committing. But it also relieves a dedicated group of committers of
regular backporting duty, in addition to making it easier to work with
Now would be a good time to consider changing strategies in this
regard, although it would be relatively easy to switch to such a model
The future of Subversion
What happens to the Subversion repositories after the migration? Since
the svn server contains a bunch of repositories, not just the CPython
one, it will probably live on for a bit as not every project may want
to migrate or it takes longer for other projects to migrate. To
prevent people from staying behind, we may want to remove migrated
projects from the repository.
Python currently provides the sys.subversion tuple to allow Python
code to find out exactly what version of Python it's running against.
The current version looks something like this:
* ('CPython', 'tags/r262', '71600')
* ('CPython', 'trunk', '73128M')
Another value is returned from Py_GetBuildInfo() in the C API, and
available to Python code as part of sys.version:
* 'r262:71600, Jun 2 2009, 09:58:33'
* 'trunk:73128M, Jun 2 2009, 01:24:14'
I propose that the revision identifier will be the short version of
hg's revision hash, for example 'dd3ebf81af43', augmented with '+'
(instead of 'M') if the working directory from which it was built was
modified. This mirrors the output of the hg id command, which is
intended for this kind of usage.
For the tag/branch identifier, I propose that hg will check for tags
on the currently checked out revision, use the tag if there is one
('tip' doesn't count), and use the branch name otherwise.
* ('CPython', '2.6.2', 'dd3ebf81af43')
* ('CPython', 'default', 'af694c6a888c+')
and the build info string becomes
* '2.6.2:dd3ebf81af43, Jun 2 2009, 09:58:33'
* 'default:af694c6a888c+, Jun 2 2009, 01:24:14'
This reflects that the default branch in hg is called 'default'
instead of Subversion's 'trunk', and reflects the proposed new tag
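
The proposed rules can be written down as a few small functions. This is only a sketch of the logic; the function names are made up, and in practice the inputs would come from hg id at build time:

```python
def revision_id(short_hash, modified):
    """Short changeset hash, with '+' appended for a modified working directory."""
    return short_hash + ("+" if modified else "")


def branch_or_tag(tags, branch):
    """Use a tag on the checked-out revision if there is one ('tip' doesn't count)."""
    real_tags = [tag for tag in tags if tag != "tip"]
    return real_tags[0] if real_tags else branch


def version_tuple(tags, branch, short_hash, modified):
    """The replacement value for the sys.subversion tuple."""
    return ("CPython", branch_or_tag(tags, branch), revision_id(short_hash, modified))


def build_info(tags, branch, short_hash, modified, build_stamp):
    """The string that ends up in sys.version."""
    return "%s:%s, %s" % (branch_or_tag(tags, branch),
                          revision_id(short_hash, modified), build_stamp)
```

For instance, version_tuple(["tip"], "default", "af694c6a888c", True) reproduces the second tuple shown above.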
I'm mailing this to python-dev because I'd like feedback on the idea of
adding an "re" attribute to strings. I'm not sure if it's a good idea or
not yet, but I figure it's worth discussion. The module mentioned here
includes a class called "restr()" which allows you to play with "s.re".
As some of you may recall, I'm not particularly fond of the recipe:
m = re.match(r'whatever(.*)', s)
The other morning I came up on the idea of adding an "re" to strings, so
you could do things like:
if (date.re.match(r'(?P<year>\d\d\d\d)-(?P<month>\d\d)' or
So I decided to try experimenting with it and see how I like it. I've also
thrown a bunch of other stuff into it and made a module called
As the version number is meant to indicate, this is something that I'm
still exploring whether it is the right thing done in the right way.
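
To make the idea concrete without the module at hand, here is a minimal sketch of a string subclass with an "re" attribute. It is not the module's implementation: the proxy class and its "last" attribute (which remembers the most recent match) are assumptions made for this example:

```python
import re


class _ReProxy:
    """Runs regular expressions against one string and remembers the last match."""

    def __init__(self, text):
        self._text = text
        self.last = None  # hypothetical: most recent match object, or None

    def match(self, pattern, flags=0):
        self.last = re.match(pattern, self._text, flags)
        return self.last

    def search(self, pattern, flags=0):
        self.last = re.search(pattern, self._text, flags)
        return self.last


class restr(str):
    """A str with an 're' attribute, so a test and its groups need no temp variable."""

    def __new__(cls, value):
        self = super().__new__(cls, value)
        self.re = _ReProxy(str(value))
        return self
```

With this, "if date.re.match(...)" can be followed directly by date.re.last.group('year'), which avoids the assign-then-test recipe.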
Though at the moment the only thing I plan to change is that some of the
iterators (having nothing to do with adding "re" to string objects)
probably shouldn't consume the "barrier" such as the "dropwhile()" and
"takewhile()". You might want to do something like:
fp = filtertools.reopen('mailbox')
for header in filtertools.takewhile([ r'^\S' ], fp.readlines()) :
print 'HEADER:', header.rstrip()
for continued in filtertools.takewhile([ r'^\s+\S' ], fp.readlines()) :
print 'CONTINUED:', continued.rstrip()
But the "takewhile()" will consume the first non-matching line.
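
The same effect can be seen with the standard itertools.takewhile(), which also swallows the barrier. One way around it, sketched here with a made-up helper name, is to hand the barrier back by chaining it in front of the remaining items:

```python
import itertools
import re


def split_while(predicate, iterable):
    """Return (matching prefix, iterator over the rest) without losing the barrier."""
    iterator = iter(iterable)
    taken = []
    for item in iterator:
        if not predicate(item):
            # Put the first non-matching item back in front of the rest.
            return taken, itertools.chain([item], iterator)
        taken.append(item)
    return taken, iterator


def is_header(line):
    """True for lines that start in column 0, like the r'^\\S' pattern above."""
    return re.match(r"^\S", line) is not None
```

Given the lines "From: a", "To: b", " folded", "Subject: c", takewhile() leaves only "Subject: c" in the iterator, while split_while() leaves both " folded" and "Subject: c".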
Anyway, I appreciate any feedback folks have.
What we see depends on mainly what we look for.
-- John Lubbock
Sean Reifschneider, Member of Technical Staff <jafo(a)tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability