
Hello, I would like to suggest that we remove the socket HOWTO (currently at http://docs.python.org/dev/howto/sockets.html). My main issue with this document is that it doesn't seem to have a well-defined audience: - people who know sockets won't learn anything from it - but people who don't know sockets will probably find it clear as mud (for example, what's an "INET" or "STREAM" socket? what's "select"?) I have other issues, such as the style/tone it's written in. I'm sure the author had fun writing it but it doesn't fit well with the rest of the documentation. Also, the author gives a lot of "advice" without explaining or justifying it ("if somewhere in those input lists of sockets is one which has died a nasty death, the select will fail" -> is that really true? what is a "nasty death" and how is that supposed to happen? couldn't the author have put a 3-line example to demonstrate this supposed drawback and how it manifests?). And, finally, many statements seem arbitrary ("There’s no question that the fastest sockets code uses non-blocking sockets and select to multiplex them") or plain wrong ("threading support in Unixes varies both in API and quality. So the normal Unix solution is to fork a subprocess to deal with each connection"). I don't think giving misleading advice to users is really a good idea. And suggesting to beginners that they use non-blocking sockets without even *showing* how (or pointing to asyncore or Twisted) is a very bad idea. select() is not enough, you still have to be prepared to get EAGAIN or EWOULDBLOCK when calling recv() or send() (i.e. select() can give false positives). Oh and I think it's obsolete too, because the "class mysocket" concatenates the output of recv() with a str rather than a bytes object. Not to mention that features of the "class mysocket" can be had using a buffered socket.makefile() instead of writing custom code. (followed up from http://bugs.python.org/issue12126 at Eli's request) Regards Antoine.
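As an illustration of the socket.makefile() remark above, here is a minimal sketch of reading newline-terminated messages through a buffered file object instead of a hand-written receive loop. The helper name read_lines and the newline framing are assumptions made for this example; they do not come from the HOWTO or from the message above.

```python
import socket


def read_lines(sock):
    """Yield decoded lines from a connected stream socket.

    Hypothetical helper: socket.makefile() wraps the socket in a buffered
    file object, so line splitting is handled by the standard library.
    """
    with sock.makefile("rb") as reader:
        for raw in reader:
            yield raw.rstrip(b"\r\n").decode("utf-8")


# Local demonstration on a connected pair of sockets (no network needed).
a, b = socket.socketpair()
a.sendall(b"hello\nworld\n")
a.close()  # closing the sender lets the reader see end-of-file
lines = list(read_lines(b))
b.close()
print(lines)
```

The sketch only covers reading; a protocol with fixed-length or length-prefixed messages would need different framing.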

On Sat, May 21, 2011 at 05:37:05PM +0200, Georg Brandl wrote:
I favor a rewrite over removal. I have read it once or twice and have never revisited it (probably because it was not helpful enough to warrant a revisit), but it still gives some important pointers. One document cannot cover it all; there are many pointers (examples at effbot.org, Python MoTW docs) that all serve as good introductions to sockets in Python. So a rewrite with good pointers would be more appropriate. -- Senthil

On Sun, May 22, 2011 at 3:38 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Perhaps replacing it with a placeholder page that refers to the Wiki would be appropriate? A simple summary saying that the HOWTO had not aged well, and hence had been removed from the official documentation until it had been updated on the Wiki would allow people looking for it to better understand the situation, and also how to help improve it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, May 22, 2011 at 11:22 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
+1 on removal. +0.8 on the pointer with a disclaimer (please also add the disclaimer at the top of the socket howto as well). there's a lot of editorial misinformation in that page even if some parts of it are useful for the socket unaware... -gps

<snip> I definitely recall finding this document useful when I first learned Python. I knew socket programming from other languages, and the document helped me see how it maps to Python. That said, I must agree that there is probably no place for such a tutorial in Python's official documentation. Python is a very general-purpose language, and socket programming is just one of a plethora of things it supports, so special treatment for sockets probably isn't warranted, especially given that the `socket` module itself is a relatively thin wrapper over the OS socket interface. I don't think a rewrite will help either. Describing socket programming in full, accurately and without missing anything, would require no less than a small book (and in fact many such books already exist). Therefore, I'm +1 on removing it from the official docs. It can be relegated to the Python wiki, where it can be improved if someone wishes to contribute to that. Eli

On Sat, 2011-05-21 at 17:07 +0200, Antoine Pitrou wrote:
While I agree with most of what you said, I actually did find it very useful when first learning sockets. It's in the top page on google for "socket programming" or "socket how to". Also, it hinted at some concepts that could then be googled for more information like select, nonblocking sockets, etc. However, I would agree that this should be moved out of the documentation and as suggested in the issue, into the wiki.

I would like to suggest that we remove the socket HOWTO (currently at http://docs.python.org/dev/howto/sockets.html)
-1. I think there should be a Python-oriented introduction to sockets. You may have complaints about the specific wording of the text, but please understand that these are probably irrelevant to most first-time readers of this text. My observation is that people actually don't read the text that much, but instead try to imitate the examples. So if the examples are good (and I think they are, mostly), it's of minor relevance whether all of the text makes sense the first time.
- people who know sockets won't learn anything from it
True. People who know sockets just need to read the module documentation. It is a beauty of the Python library design that it exposes the API mostly as-is, so if you know Berkeley sockets, you will be immediately familiar with Python sockets (unlike, say, Java or .NET, where they decided to regroup the API into classes).
- but people who don't know sockets will probably find it clear as mud
See above - it doesn't really matter.
(for example, what's an "INET" or "STREAM" socket?
You are probably referring to the sentence "I’m only going to talk about INET sockets, but they account for at least 99% of the sockets in use. And I’ll only talk about STREAM sockets" here. It's not important to first-time readers to actually understand that, and the wording explicitly tells them that they don't need to understand. It says "there is more stuff, and you won't need it, and the stuff you need is called INET and STREAM". It's easy to fix, though, and I fixed it in f70e26452621 (explaining that this is all about TCPv4).
what's "select"?)
It's well explained in the section Non-blocking Sockets, isn't it?
It's a HOWTO - of course it has advice without justification. It's not reference documentation, which only tells you what it does, but not what the best way of putting it together is.
I think it is:

    py> import select
    py> select.select([100],[],[],0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    select.error: (9, 'Bad file descriptor')

Of course, rather than "has died a nasty death", it could also say "has been closed".
It may well be that the author didn't fully understand the problem when writing the text, so I wouldn't mind removing this specific paragraph.
I'd evaluate these two statements exactly vice versa. The first one (non-blocking sockets are faster) is plain wrong, and the second one ("threading support in Unix varies") is arbitrary, but factually correct :-) I'd drop the entire "Performance" section - there is much more to be said about socket performance than a few paragraphs of text, and for the target audience, performance is probably no concern.
That's easy to fix, too - c65e1a422bc3
Not to mention that features of the "class mysocket" can be had using a buffered socket.makefile() instead of writing custom code.
I find it actually appropriate in the context. It illustrates a number of important points about sockets, namely that you cannot rely on send() and recv() to match in block size. Ultimately, people that use the socket API *really* need to understand TCP, so it's good to explain to them that there are issues to consider right in the first tutorial. Regards, Martin
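The point that send() and recv() do not match in block size can be sketched as follows. This is a minimal example under stated assumptions: recv_exactly is a hypothetical helper name, and the fixed message length of 4 bytes is chosen only for the demonstration.

```python
import socket


def recv_exactly(sock, nbytes):
    """Read exactly nbytes from a stream socket (hypothetical helper).

    A single recv() may return fewer bytes than requested, so keep
    reading until the full amount has arrived or the peer closes.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError(
                "socket closed with %d bytes still expected" % remaining
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# The sender writes 3 bytes and then 5 bytes; the receiver reads two
# 4-byte messages, so message boundaries do not follow the send() calls.
a, b = socket.socketpair()
a.sendall(b"abc")
a.sendall(b"defgh")
first = recv_exactly(b, 4)
second = recv_exactly(b, 4)
a.close()
b.close()
print(first, second)
```

On the sending side, sendall() plays the matching role, since a plain send() may also transmit only part of its argument.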

Hello, On Sun, 29 May 2011 17:20:29 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
So what you're saying is that the text is mostly useless (or at least quite dispensable), but you think it's fine that people waste their time trying to read it? Some of the people reading our docs are not fluent English readers, and it can be quite an effort for them to read some big chunk of text which will be ultimately pointless.
So if the examples are good (and I think they are, mostly), it's of minor relevance whether all of the text makes sense the first time.
I'm not sure why the examples are good (for example, modern client code should probably use create_connection() with a host name, not connect()). Also, really, to socket beginners, I think the primary advice should be: first try to find some high-level library that does the dirty work for you (for example some protocol-specific lib on the client side, or something like Twisted or asyncore on the server side). Not "hey, here's how you write a threaded server in 4 lines of code, and wow, look, you can also write non-blocking code using select() too!".
Well... in a couple of months, someone will tell them their code doesn't support IPv6 and they'll be lost.
what's "select"?)
It's well explained in the section Non-blocking Sockets, isn't it?
I don't think it explains well how a non-blocking socket works. It's very opinionated and has little useful technical content. EAGAIN and EWOULDBLOCK are not even mentioned!
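For readers unfamiliar with the EAGAIN/EWOULDBLOCK point, here is a minimal sketch, assuming Python 3.3 or later, where both error codes surface as BlockingIOError. The helper name try_recv and its return convention are assumptions made for this example.

```python
import select
import socket


def try_recv(sock, bufsize=4096):
    """Attempt a read on a non-blocking socket (hypothetical helper).

    Returns the received bytes, b"" if the peer closed the connection,
    or None if no data is available right now.
    """
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # Raised for EAGAIN and EWOULDBLOCK: nothing to read at the moment,
        # even if select() reported the socket as readable earlier.
        return None


a, b = socket.socketpair()
b.setblocking(False)

nothing_yet = try_recv(b)  # no data has been sent so far

a.sendall(b"ping")
readable, _, _ = select.select([b], [], [], 1.0)
data = try_recv(b) if readable else None

a.close()
b.close()
print(nothing_yet, data)
```

The design choice here is to treat "would block" as an ordinary outcome rather than an error, so the caller can simply go back to waiting in select().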
It's a HOWTO - of course it has advice without justification.
Well, I think that's bad. When we give advice to users, we should explain the motivation of the advice given. Otherwise we convey the impression that there's some magic that people shouldn't try to understand.
+1. When reading it I get the idea that the OS might kill sockets behind my back, while in reality the only way an EBADF can happen is if I explicitly close the socket - i.e. a programming error on my part.
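A minimal sketch of how that EBADF arises, assuming a POSIX system and Python 3 (where select.error is an alias of OSError). The raw descriptor number is passed to select() because a closed socket object reports a fileno() of -1, which select() rejects with a ValueError before the OS is ever asked.

```python
import errno
import select
import socket

a, b = socket.socketpair()
fd = a.fileno()  # remember the descriptor number
a.close()        # explicit close: the programming error in question
b.close()

try:
    # Ask select() to watch a descriptor that is no longer open.
    select.select([fd], [], [], 0)
except OSError as exc:
    caught = exc.errno
else:
    caught = None

print(caught == errno.EBADF)
```

Other platforms may report a different error code for the same mistake, so the comparison against errno.EBADF is specific to the POSIX assumption above.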
+1 :) Thank you Antoine.

No, that's not what I said. I said the people actually *don't* read the text, so they won't waste time with it. They only glance at the text, enough to understand the examples.
You completely misunderstood. I didn't say that reading the text is pointless. I said that people don't read this text, nor any software documentation, in particular when they are not fluent in English.
I disagree. create_connection is an advanced function - you shouldn't be using it unless you know what it is doing. As a socket tutorial, people do have to know about connect.
No no no no no. Absolutely not. a) telling people who want to learn sockets "don't learn sockets, learn some higher-level library" is beside the point. What do you tell them when they did that, and now come back to learn sockets? b) telling people to use Twisted or asyncore on the server side if they are new to sockets is bad advice. People *first* have to understand sockets, and *then* can use these libraries and frameworks. Those libraries aren't made to be black boxes that work even if you don't know how - you *have* to know how they work inside, or else you can't productively use them.
I'd happily kill the entire non-blocking discussion from the tutorial if it hurts you as much as it hurts me. I wish this non-blocking stuff never went into Python.
Well... in a couple of months, someone will tell them their code doesn't support IPv6 and they'll be lost.
No. In a couple of months, they'll understand sockets much better, so they'll be able to support IPv6 easily.
It's not that bad. Please reconsider. People do get a lot of advice in their lives that isn't motivated down to the root cause, and accept advice from authority. Only once they understand what it does do they ask why. Regards, Martin

On Sun, 05 Jun 2011 08:32:38 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
I'm sorry, that sounds like a very outlandish argument to make. Did you run a user survey? If people only "glance at the text", then what is the text for? Why not kill the text and rename the page "socket examples" so that there is no misunderstanding and so that we don't waste time trying to maintain (and argue about) it?
Can you explain? I would certainly use it myself, and I don't understand how it's "advanced". It's simply higher-level. Actually, we've been replacing uses of connect() with create_connection() in various parts of the stdlib, so that our client modules become IPv6-compatible.
You said yourself that the HOWTO doesn't claim to explain sockets, so how can you make such a point now? If people want to learn sockets for real, the HOWTO is hopeless for them.
I'd happily kill the entire non-blocking discussion from the tutorial if it hurts you as much as it hurts me.
+1. Regards Antoine.

It uses getaddrinfo, which might return multiple addresses, which are then tried in sequence. So even though it's called "create_connection", it may actually attempt to create multiple connections. As a consequence, it may wait some time for one connection to complete, and then succeed on a different address. These phenomena can only be understood when you know what it is actually doing.
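The behaviour described above can be sketched roughly as follows. This is an illustrative approximation, not the standard library's implementation of create_connection(); the function name connect_first_working and the 5-second timeout are assumptions made for this example.

```python
import socket


def connect_first_working(host, port, timeout=5.0):
    """Try each address returned by getaddrinfo() until one connects.

    Hypothetical helper showing why one logical "connection" may involve
    several connection attempts, possibly over both IPv4 and IPv6.
    """
    last_error = None
    for family, socktype, proto, _canonname, addr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(addr)
            return sock  # first address that works wins
        except OSError as exc:
            last_error = exc
            if sock is not None:
                sock.close()
    if last_error is None:
        raise OSError("getaddrinfo returned no addresses")
    raise last_error


# Local demonstration against a listener on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

client = connect_first_working("127.0.0.1", port)
peer_port = client.getpeername()[1]
client.close()
server.close()
print(peer_port == port)
```

With a host name that resolves to several addresses, each failed attempt can consume up to the timeout before the next address is tried, which is the waiting behaviour the message refers to.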
And that's fine - the people making these changes most certainly were capable of using an advanced API.
Did I say that? If so, I didn't mean to. It explains how to use the socket API. Regards, Martin

Antoine Pitrou writes:
Did you run a user survey?
Martin undoubtedly has a lot of experience with users, and it's quite reasonable for him to express his opinions based on that informal sample, yes. The issue here is the difference between existential and universal quantifiers. Martin's arguments are not inconsistent. They simply acknowledge the existence of subsamples of users of the same document with different needs and/or approaches to reading the document. He does not and has never claimed that all of his arguments apply to all of the potential readers. You might question whether the same document should serve both the "cargo cult the examples" group and the "read the fine print" group. That's a valid question, but here my feeling is that the answer is "yes". I very often "cargo cult" my first program, then go back to the fine print and experiment by gradually changing that program to test my understanding of the detailed explanations. It is often easiest to use the same document for both purposes because I already know where it is and the quality of the writing.

On Mon, 06 Jun 2011 10:33:14 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
The point here is that the examples in that document are very poor (the only substantial example actually duplicates existing functionality - in a sub-optimal manner - without even mentioning the existence of said functionality), and the technical explanations are nearly non-existent. So I'll happily stand by my claims. The Python documentation isn't meant to host any potentially helpful document, however flawed. We have the Internet for that. Regards Antoine.

Antoine Pitrou wrote:
You know, for the amount of discussion about whether or not the doc is worth keeping, we probably could have fixed all the problems with it :) I believe that "status quo wins" is worth applying here. In the absence of evidence that the HOWTO is actively harmful, we should keep it. I'm of two minds whether it should go into the wiki. I would hate for the wiki to become the place where bad docs go to die, but on the other hand putting it in the wiki may encourage lightweight incremental fixes. I think the Socket HOWTO is important enough to fix, not throw out. I also dislike link-rot, and throwing it out causes link-rot. I'd rather see a bunch of concrete bug reports for the HOWTO than just a dismissive "throw it out and start again".
The Python documentation isn't meant to host any potentially helpful document, however flawed. We have the Internet for that.
I think it is unfair to dismiss the document as "potentially" helpful when a number of people have said that it *actually* did help them. -- Steven

Antoine Pitrou writes:
So did you read the discussion before posting?
Yes. It's rude to assume that those who disagree with you are irresponsible and uninformed. Would you please stop it?
The sockets HOWTO *doesn't* serve both groups. Actually, I would argue that it serves neither of them.
I know that is your opinion, because I've read your posts. I disagree. The document is imperfect, but for me it served a certain purpose. Therefore I stand with the camp that says improving the document is the way to go.

+1. I've been reading the postings on this discussion intently, as I have had experience with the socket HOWTO when I was first learning about sockets. I agree with the view that Martin v. Löwis expressed, that as a beginner I didn't read too much into the text at first because I was more concerned with trying out the examples and getting used to writing the code and such. I would also say that I wasn't too bothered that the guide never went into much detail about all the terms it was mentioning; it isn't, after all, a definitive guide on sockets, and the terms can always be googled later if one so wishes. I wholeheartedly disagree with removing it; that would be a real shame, and I dislike the idea of moving it to the wiki (I cannot even remember ever visiting the wiki). I may not be a Python guru, but I think my "n00bishness" helps in this particular discussion, and I would have to agree with an improvement over the suggested alternatives. Craig

On 2011-06-07, at 00:15, C McL wrote:
I cannot even remember ever visiting the wiki.
FWIW neither can I. The Wiki link on the front page is below Jobs and Merchandise so it's easy to miss it altogether ;-) -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o.

In particular, this is collected experience from interaction with students learning Python, or other languages. When they try to solve a problem, they don't read specification-style documentation. Instead they look for examples that they can imitate. [I notice that you (Stephen) also confirmed this from your own experience]
Exactly so. I'd like to settle this discussion based on the anecdotal report of several users on this list that they considered the tutorial useful.
In that spirit, I'd be in favor of removing outright errors from the document, and overly subjective and argumentative passages. Other than that, I still think it's fine as it stands. Regards, Martin

On Jun 4, 2011, at 11:32 PM, Martin v. Löwis wrote:
First, Twisted doesn't always use the BSD sockets API; the Windows IOCP reactor, especially, starts off with the socket() function, but things go off in a different direction pretty quickly from there. So it's perfectly fine to introduce yourself to networking via Twisted, and many users have done just that. If you're using it idiomatically, you should never encounter a socket object or file descriptor poking through the API anywhere. Asyncore is different: you do need to know how sockets work in order to use it, because you're expected to call .send() and .recv() yourself. (And, in my opinion, this is a serious design flaw, for reasons which will hopefully be elucidated in the PEP that Laurens is now writing.) Second, it makes me a little sad that it appears to be folk wisdom that Twisted is only for servers. A lot of work has gone into making it equally appropriate for clients. This is especially true if your client has a GUI, where Twisted is often better than a protocol-specific library, which may either be blocking or have its own ad-hoc event loop. I don't have an opinion on the socket HOWTO per se, only on the possibility of linking to Twisted as an alternate implementation mechanism. It really would be better to say "go use Twisted rather than reading any of the following" than "read the following, which will help you understand Twisted".

Hmm. Are you saying it doesn't use listen, connect, bind, send, recv? To me, that's the core of BSD sockets. I can understand it doesn't use select(2).
And that's all fine. I still claim that you have to *understand* sockets in order to use it properly. By this, I mean stuff like "what is a TCP connection? how is it established?", "how is UDP different from TCP?", "when data arrives, what layers of software does it go through?", "what is a port number?", etc.
I think that's because many of the problems that Twisted solves don't exist in many of the client applications - in particular, you often don't have many simultaneous connections. GUI apps may be the interesting special case, but I expect that people dealing with these rather use separate threads.
Wouldn't you agree that Twisted is very difficult to learn, and thus much heavier than sockets? And I don't blame the Twisted API for that, but rather the mental model of overlapping activities that people have severe problems with. Regards, Martin

On 5 Jun, 10:35 pm, martin@v.loewis.de wrote:
Yes, that's correct. Those aren't the best APIs to use on Windows, so they aren't necessarily used on Windows.
These may be good things to understand. The current socket howto doesn't explain them, though.
On the contrary, many of the problems do exist in client applications (every time I have to use virt-manager I want to throw it out a window). Some people certainly would rather use threading, but that doesn't say anything about whether Twisted solves problems relevant to clients, only about the fact that a lot of people like to use threads.
This discussion has a significant problem, in taking "Twisted" as a monolithic all-or-nothing entity. Restricting the scope to merely the lowest-level socket replacement APIs - ie, the bare TCP, UDP, etc functionality - no, Twisted is not very difficult to learn. Expanding the scope to include the higher level functionality, it is much easier to learn than reimplementing line parsing and concurrency and so forth. However, does that really have anything to do with improving the socket howto? The Python documentation can include a clear explanation of what functionality the socket module provides - without forcing you to read Stevens _or_ use Twisted, but it can still refer you to both, since it is very likely that you'll need at least one of them in addition to the socket module. Jean-Paul

On Jun 5, 2011, at 3:35 PM, Martin v. Löwis wrote:
Yes, these are all excellent concepts to be familiar with. But the word "socket" (and the socket HOWTO) refers to a specific way to interface with those concepts, the Berkeley socket API: <http://en.wikipedia.org/wiki/Berkeley_sockets>. Which you don't have to know anything about if you're going to use Twisted. You should know about IPC in general, and TCP/UDP specifically if you're going to use Twisted, but sockets are completely optional. Also, I feel that I should point out that the sockets HOWTO does not cover even a single one of these concepts in any useful depth. If you think that these are what it should be explaining, it needs some heavy editing. Here's what it has to say about each one:
what is a TCP connection?
The only place that the characters "TCP" appear in the entire document is in the phrase "... which is completely different from TCP_NODELAY ...". Nowhere is a TCP connection explained at a conceptual level, except to say that it's something a web browser does.
how is UDP different from TCP?
The phrase "UDP" never appears in the HOWTO. DGRAM sockets get a brief mention as "anything else" in the sentence: "... you’ll get better behavior and performance from a STREAM socket than anything else ...". (To be fair, I do endorse teaching that "the difference between TCP and UDP is that you should not use UDP" to anyone not sufficiently advanced to read the relevant reference documentation themselves.)
when data arrives, what layers of software does it go through?
There's no discussion of this that I can find at all.
what is a port number?
Aside from a few comments in the code examples, the only discussion of port numbers is "low number ports are usually reserved for “well known” services (HTTP, SNMP etc)." It would be very good to have a "Python networking overview" somewhere that explained this stuff at a very high level, and described how data might get into or out of your program, with links to things like the socket HOWTO that describe more specific techniques. This would be useful because most commonly, I think that data will get into Python network programs via WSGI, not direct sockets or anything like Twisted. To be clear, having read it now: I do _not_ agree with Antoine that this document should be deleted. I dimly recall that it helped me understand some things in the very early days of Twisted. While it's far from perfect, it might help someone in a similar situation understand those things as well today. I just found it interesting that the main concepts one would associate with such a HOWTO are nowhere to be found :). -glyph

<snip>
Just be careful not to reproduce http://www.apress.com/9781590593714 :-) These things tend to get out of hand very quickly. Eli

On Wed, Jun 8, 2011 at 3:37 AM, Eli Bendersky <eliben@gmail.com> wrote:
Just be careful not to reproduce http://www.apress.com/9781590593714 :-) These things tend to get out of hand very quickly.
At the level Glyph and Martin are talking about, you're more likely to end up with http://authors.phptr.com/tanenbaumcn4/ :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Jun 7, 2011 at 10:37 AM, Eli Bendersky <eliben@gmail.com> wrote:
Just be careful not to reproduce http://www.apress.com/9781590593714 :-) These things tend to get out of hand very quickly.
You say that like it's a bad thing. The first few chapters of that would make a great replacement for the howto. Geremy Condra

On Wed, Jun 8, 2011 at 21:07, geremy condra <debatem1@gmail.com> wrote:
Not a bad thing at all, and I'm sorry if I made it sound that way. I just meant that it may turn into a *whole book* if too many details are added. I had no intention to criticize this specific book. Frankly I didn't even read it, I just remembered that a book with this title came out recently. Eli

On Tue, Jun 7, 2011 at 1:54 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
And if UDP starts sounding tempting due to excessively high latency, these days it's worth looking up the specs for the interplanetary internet instead. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sat, May 21, 2011 at 05:37:05PM +0200, Georg Brandl wrote:
I favor a rewrite over removal. I have read it once/twice and have never revisited it (the probably the reason that it was not helpful enough for a revisit), but still gives some important pointers. One document cannot cover it all, there are many pointers (examples at effbot.org, Python MoTW docs) all serve as good introduction to sockets in python. So a rewrite with good pointers would be more appropriate. -- Senthil

On Sun, May 22, 2011 at 3:38 AM, Georg Brandl <g.brandl@gmx.net> wrote:
Perhaps replacing it with a placeholder page that refers to the Wiki would be appropriate? A simple summary saying that the HOWTO had not aged well, and hence had been removed from the official documentation until it had been updated on the Wiki would allow people looking for it to better understand the situation, and also how to help improve it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, May 22, 2011 at 11:22 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
+1 on removal. +0.8 on the pointer with a disclaimer (please also add the disclaimer at the top of the socket howto as well). there's a lot of editorial misinformation in that page even if some parts of it are useful for the socket unaware... -gps

<snip> I definitely recall finding this document useful when I first learned Python. I knew socket programming from other languages, and the document helped to see how it maps to Python. That said, I must agree that there is probably no place for such a tutorial in Python's official documentation. Python is a widely-general purpose language, and sockets programming is just one of a plethora of things it supports, so a special treatment for sockets probably isn't warranted, especially given that the `socket` module itself is a relatively thin wrapper over the OS socket interface. I don't think a rewrite will help either. To describe socket programming in full, without missing anything and being accurate will require no less than a small book (and in fact many such books already exist). Therefore, I'm +1 on removing it from the official docs. It can be relegated to the Python wiki, where it can be improved if someone wishes to contribute to that. Eli

On Sat, 2011-05-21 at 17:07 +0200, Antoine Pitrou wrote:
While I agree with most of what you said, I actually did find it very useful when first learning sockets. It's in the top page on google for "socket programming" or "socket how to". Also, it hinted at some concepts that could then be googled for more information like select, nonblocking sockets, etc. However, I would agree that this should be moved out of the documentation and as suggested in the issue, into the wiki.

I would like to suggest that we remove the socket HOWTO (currently at http://docs.python.org/dev/howto/sockets.html)
-1. I think there should be a Python-oriented introduction to sockets. You may have complaints about the specific wording of the text, but please understand that these are probably irrelevant to most first-time readers of this text. My observation is that people actually don't read the text that much, but instead try to imitate the examples. So if the examples are good (and I think they are, mostly), it's of minor relevance whether the text makes all sense the first time.
- people who know sockets won't learn anything from it
True. People who know sockets just need to read the module documentation. It is a beauty of the Python library design that it exposes the API mostly as-is, so if you know Berkeley sockets, you will be immediately familiar with Python sockets (unlike, say, Java or .NET, where they decided to regroup the API into classes).
- but people who don't know sockets will probably find it clear as mud
See above - it doesn't really matter.
(for example, what's an "INET" or "STREAM" socket?
You are probably referring to the sentence "I’m only going to talk about INET sockets, but they account for at least 99% of the sockets in use. And I’ll only talk about STREAM sockets" here. It's not important to first-time readers to actually understand that, and the wording explicitly tells them that they don't need to understand. It says "there is more stuff, and you won't need it, and the stuff you need is called INET and STREAM". It's easy to fix, though, and I fixed it in f70e26452621 (explaining that this is all about TCPv4).
what's "select"?)
It's well explained in the section Non-blocking Sockets, isn't it?
It's a HOWTO - of course it has advise without justification. It's not a reference documentation which only tells you what it does, but not what the best way of putting it together is.
I think it is: py> import select py> select.select([100],[],[],0) Traceback (most recent call last): File "<stdin>", line 1, in <module> select.error: (9, 'Bad file descriptor') Of course, rather than "has died a nasty death", it could also say "has been closed".
It may well be that the author didn't fully understand the problem when writing the text, so I wouldn't mind removing this specific paragraph.
I'd evaluate these two statements exactly vice versa. The first one (non-blocking sockets are faster) is plain wrong, and the second one ("threading support in Unix varies") is arbitrary, but factually correct :-) I'd drop the entire "Performance" section - there is much more to be said about socket performance than a few paragraphs of text, and for the target audience, performance is probably no concern.
That's easy to fix, too - c65e1a422bc3
Not to mention that features of the "class mysocket" can be had using a buffered socket.makefile() instead of writing custom code.
I find it actually appropriate in the context. It illustrates a number of important points about sockets, namely that you cannot rely on send() and recv() to match in block size. Ultimately, people that use the socket API *really* need to understand TCP, so it's good to explain to them that there are issues to consider right in the first tutorial. Regards, Martin
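The point about send() and recv() not matching in block size can be illustrated with a minimal sketch. It assumes a fixed-length message, as the HOWTO's "class mysocket" does; the helper name `recv_exactly` is made up for this illustration and is not part of the socket module.

```python
import socket

def recv_exactly(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if chunk == b"":
            # An empty result means the peer closed the connection.
            raise ConnectionError("socket closed before message was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# socketpair() gives two connected stream sockets without using the network.
a, b = socket.socketpair()
a.sendall(b"hello")   # two separate sends ...
a.sendall(b"world")
message = recv_exactly(b, 10)   # ... read back as one 10-byte message
a.close()
b.close()
```

The two sends are not visible as separate units on the receiving side; a stream socket only guarantees the order of the bytes, which is why the receiver has to do its own framing.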

Hello, On Sun, 29 May 2011 17:20:29 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
So what you're saying is that the text is mostly useless (or at least quite dispensable), but you think it's fine that people waste their time trying to read it? Some of the people reading our docs are not fluent English readers, and it can be quite an effort for them to read some big chunk of text which will be ultimately pointless.
So if the examples are good (and I think they are, mostly), it's of minor relevance whether the text makes all sense the first time.
I'm not sure why the examples are good (for example, modern client code should probably use create_connection() with a host name, not connect()). Also, really, to socket beginners, I think the primary advice should be: first try to find some high-level library that does the dirty work for you (for example some protocol-specific lib on the client side, or something like Twisted or asyncore on the server side). Not "hey, here's how you write a threaded server in 4 lines of code, and wow, look, you can also write non-blocking code using select() too!".
Well... in a couple of months, someone will tell them their code doesn't support IPv6 and they'll be lost.
what's "select"?)
It's well explained in the section Non-blocking Sockets, isn't it?
I don't think it explains well how a non-blocking socket works. It's very opinionated and has little useful technical content. EAGAIN and EWOULDBLOCK are not even mentioned!
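As a rough illustration of what is missing: on a non-blocking socket, recv() fails with EAGAIN or EWOULDBLOCK when no data is available, and current Python 3 reports that as BlockingIOError. The sketch below assumes Python 3; the helper name `try_recv` is hypothetical.

```python
import select
import socket

def try_recv(sock, bufsize=4096):
    """Return received bytes, or None if nothing can be read right now."""
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # Python 3 raises BlockingIOError for EAGAIN / EWOULDBLOCK.
        return None

a, b = socket.socketpair()
b.setblocking(False)

first = try_recv(b)               # nothing has been sent yet
a.sendall(b"ping")
select.select([b], [], [], 1.0)   # wait until b is reported readable
second = try_recv(b)              # still guarded, in case select() was wrong
a.close()
b.close()
```

Keeping the try/except even after select() reports the socket as readable is the defensive pattern the original post asks for, since select() can give false positives.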
It's a HOWTO - of course it has advice without justification.
Well, I think that's bad. When we give advice to users, we should explain the motivation of the advice given. Otherwise we convey the impression that there's some magic that people shouldn't try to understand.
+1. When reading it I get the idea that the OS might kill sockets behind my back, while in reality the only way an EBADF can happen is if I explicitly close the socket - i.e. a programming error on my part.
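A small sketch of that programming error, assuming Python 3 on a POSIX system (on Windows the error code differs): the socket is closed, but its descriptor number is still in the list passed to select().

```python
import errno
import select
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = s.fileno()   # remember the raw descriptor number
s.close()         # the bug: fd is still in our select() list

try:
    select.select([fd], [], [], 0)
    caught = None
except OSError as exc:
    # select.error is an alias of OSError in Python 3.
    caught = exc.errno
```

Nothing here involves the operating system closing anything on its own; the failure comes entirely from the program's own bookkeeping.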
+1 :) Thank you Antoine.

No, that's not what I said. I said the people actually *don't* read the text, so they won't waste time with it. They only glance at the text, enough to understand the examples.
You completely misunderstood. I didn't say that reading the text is pointless. I said that people don't read this text, nor any software documentation, in particular when they are not fluent in English.
I disagree. create_connection is an advanced function - you shouldn't be using it unless you know what it is doing. As a socket tutorial, people do have to know about connect.
No no no no no. Absolutely not. a) telling people who want to learn sockets "don't learn sockets, learn some higher-level library" is beside the point. What do you tell them when they did that, and now come back to learn sockets? b) telling people to use Twisted or asyncore on the server side if they are new to sockets is bad advice. People *first* have to understand sockets, and *then* can use these libraries and frameworks. Those libraries aren't made to be black boxes that work even if you don't know how - you *have* to know how they work inside, or else you can't productively use them.
I'd happily kill the entire non-blocking discussion from the tutorial if it hurts you as much as it hurts me. I wish this non-blocking stuff never went into Python.
Well... in a couple of months, someone will tell them their code doesn't support IPv6 and they'll be lost.
No. In a couple of months, they'll understand sockets much better, so they'll be able to support IPv6 easily.
It's not that bad. Please reconsider. People do get a lot of advice in their lives that isn't motivated down to the root cause, and accept advice from authority. Only once they understand what it does do they ask why. Regards, Martin

On Sun, 05 Jun 2011 08:32:38 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
I'm sorry, that sounds like a very outlandish argument to make. Did you run a user survey? If people only "glance at the text", then what is the text for? Why not kill the text and rename the page "socket examples" so that there is no misunderstanding and so that we don't waste time trying to maintain (and argue about) it?
Can you explain? I would certainly use it myself, and I don't understand how it's "advanced". It's simply higher-level. Actually, we've been replacing uses of connect() with create_connection() in various parts of the stdlib, so that our client modules become IPv6-compatible.
You said yourself that the HOWTO doesn't claim to explain sockets, so how can you make such a point now? If people want to learn sockets for real, the HOWTO is hopeless for them.
I'd happily kill the entire non-blocking discussion from the tutorial if it hurts you as much as it hurts me.
+1. Regards Antoine.

It uses getaddrinfo, which might return multiple addresses, which are then tried in sequence. So even though it's called "create_connection", it may actually attempt to create multiple connections. As a consequence, it may wait some time for one connection to complete, and then succeed on a different address. These phenomena can only be understood when you know what it is actually doing.
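The behaviour described here can be approximated with the following sketch. It is a simplified illustration, not the stdlib's actual implementation of create_connection(); the function name `connect_any` and the 5-second default timeout are assumptions made for the example.

```python
import socket

def connect_any(host, port, timeout=5.0):
    """Try each address returned by getaddrinfo() until one connects."""
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)   # one connection attempt per address
            return sock
        except OSError as exc:
            last_error = exc
            if sock is not None:
                sock.close()
    if last_error is None:
        raise OSError("getaddrinfo returned no addresses")
    raise last_error

# Local demo: a listener on the IPv4 loopback address, on a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

client = connect_any("127.0.0.1", port)
peer_port = client.getpeername()[1]
client.close()
server.close()
```

With a host name that resolves to several addresses, each failed attempt can take up to the timeout before the next address is tried, which is the waiting behaviour mentioned above.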
And that's fine - the people making these changes were most certainly capable of using an advanced API.
Did I say that? If so, I didn't mean to. It explains how to use the socket API. Regards, Martin

Antoine Pitrou writes:
Did you run a user survey?
Martin undoubtedly has a lot of experience with users, and it's quite reasonable for him to express his opinions based on that informal sample, yes. The issue here is the difference between existential and universal quantifiers. Martin's arguments are not inconsistent. They simply acknowledge the existence of subsamples of users of the same document with different needs and/or approaches to reading the document. He does not and has never claimed that all of his arguments apply to all of the potential readers. You might question whether the same document should serve both the "cargo cult the examples" group and the "read the fine print" group. That's a valid question, but here my feeling is that the answer is "yes". I very often "cargo cult" my first program, then go back to the fine print and experiment by gradually changing that program to test my understanding of the detailed explanations. It is often easiest to use the same document for both purposes because I already know where it is and the quality of the writing.

On Mon, 06 Jun 2011 10:33:14 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
The point here is that the examples in that document are very poor (the only substantial example actually duplicates existing functionality - in a sub-optimal manner - without even mentioning the existence of said functionality), and the technical explanations are nearly non-existent. So I'll happily stand by my claims. The Python documentation isn't meant to host any potentially helpful document, however flawed. We have the Internet for that. Regards Antoine.
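The existing functionality mentioned at the start of the thread is a buffered socket.makefile(). A minimal sketch of how it covers fixed-length and line-oriented reads, assuming Python 3 and a made-up message layout (a 5-byte header followed by one line of text):

```python
import socket

a, b = socket.socketpair()
a.sendall(b"HELLO" + b"a line of text\n")
a.close()   # signal EOF so the reader cannot block waiting for more

reader = b.makefile("rb")   # buffered binary file object over the socket
fixed = reader.read(5)      # reads until 5 bytes have arrived (or EOF)
line = reader.readline()    # reads up to and including the newline
reader.close()
b.close()
```

The buffering layer does the looping over recv() that a hand-written receive loop would otherwise have to do.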

Antoine Pitrou wrote:
You know, for the amount of discussion about whether or not the doc is worth keeping, we probably could have fixed all the problems with it :) I believe that "status quo wins" is worth applying here. In the absence of evidence that the HOWTO is actively harmful, we should keep it. I'm of two minds whether it should go into the wiki. I would hate for the wiki to become the place where bad docs go to die, but on the other hand putting it in the wiki may encourage lightweight incremental fixes. I think the Socket HOWTO is important enough to fix, not throw out. I also dislike link-rot, and throwing it out causes link-rot. I'd rather see a bunch of concrete bug reports for the HOWTO than just a dismissive "throw it out and start again".
The Python documentation isn't meant to host any potentially helpful document, however flawed. We have the Internet for that.
I think it is unfair to dismiss the document as "potentially" helpful when a number of people have said that it *actually* did help them. -- Steven

Antoine Pitrou writes:
So did you read the discussion before posting?
Yes. It's rude to assume that those who disagree with you are irresponsible and uninformed. Would you please stop it?
The sockets HOWTO *doesn't* serve both groups. Actually, I would argue that it serves neither of them.
I know that is your opinion, because I've read your posts. I disagree. The document is imperfect, but for me it served a certain purpose. Therefore I stand with the camp that says improving the document is the way to go.

+1. I've been reading the postings on this discussion intently, as I had experience with the socket HOWTO when I was first learning about sockets. I agree with the view that Martin v. Löwis expressed: as a beginner I didn't read too much into the text at first, because I was more concerned with trying out the examples and getting used to writing the code. I would also say that I wasn't too bothered that the guide never went into much detail about all the terms it was mentioning. It isn't, after all, a definitive guide on sockets, and the terms can always be googled later if one so wishes. I wholeheartedly disagree with removing it; that would be a real shame. I also dislike the idea of moving it to the wiki (I cannot even remember ever visiting the wiki). I may not be a Python Guru, but I think my "n00bishness" helps in this particular discussion, and I would favour an improvement over the suggested alternatives. Craig

Message written by C McL on 2011-06-07, at 00:15:
I cannot even remember ever visiting the wiki.
FWIW neither can I. The Wiki link on the front page is below Jobs and Merchandise so it's easy to miss it altogether ;-) -- Best regards, Łukasz Langa Senior Systems Architecture Engineer IT Infrastructure Department Grupa Allegro Sp. z o.o.

In particular, this is collected experience from interaction with students learning Python, or other languages. When they try to solve a problem, they don't read specification-style documentation. Instead they look for examples that they can imitate. [I notice that you (Stephen) also confirmed this from your own experience]
Exactly so. I'd like to settle this discussion based on the anecdotal report of several users on this list that they considered the tutorial useful.
In that spirit, I'd be in favor of removing outright errors from the document, and overly subjective and argumentative passages. Other than that, I still think it's fine as it stands. Regards, Martin

On Jun 4, 2011, at 11:32 PM, Martin v. Löwis wrote:
First, Twisted doesn't always use the BSD sockets API; the Windows IOCP reactor, especially, starts off with the socket() function, but things go off in a different direction pretty quickly from there. So it's perfectly fine to introduce yourself to networking via Twisted, and many users have done just that. If you're using it idiomatically, you should never encounter a socket object or file descriptor poking through the API anywhere. Asyncore is different: you do need to know how sockets work in order to use it, because you're expected to call .send() and .recv() yourself. (And, in my opinion, this is a serious design flaw, for reasons which will hopefully be elucidated in the PEP that Laurens is now writing.) Second, it makes me a little sad that it appears to be folk wisdom that Twisted is only for servers. A lot of work has gone into making it equally appropriate for clients. This is especially true if your client has a GUI, where Twisted is often better than a protocol-specific library, which may either be blocking or have its own ad-hoc event loop. I don't have an opinion on the socket HOWTO per se, only on the possibility of linking to Twisted as an alternate implementation mechanism. It really would be better to say "go use Twisted rather than reading any of the following" than "read the following, which will help you understand Twisted".

Hmm. Are you saying it doesn't use listen, connect, bind, send, recv? To me, that's the core of BSD sockets. I can understand it doesn't use select(2).
And that's all fine. I still claim that you have to *understand* sockets in order to use it properly. By this, I mean stuff like "what is a TCP connection? how is it established?", "how is UDP different from TCP?", "when data arrives, what layers of software does it go through?", "what is a port number?", etc.
I think that's because many of the problems that Twisted solves don't exist in many of the client applications - in particular, you often don't have many simultaneous connections. GUI apps may be the interesting special case, but I expect that people dealing with these rather use separate threads.
Wouldn't you agree that Twisted is very difficult to learn, and thus much heavier than sockets? And I don't blame the Twisted API for that, but rather the mental model of overlapping activities that people have severe problems with. Regards, Martin

On 5 Jun, 10:35 pm, martin@v.loewis.de wrote:
Yes, that's correct. Those aren't the best APIs to use on Windows, so they aren't necessarily used on Windows.
These may be good things to understand. The current socket howto doesn't explain them, though.
On the contrary, many of the problems do exist in client applications (every time I have to use virt-manager I want to throw it out a window). Some people certainly would rather use threading, but that doesn't say anything about whether Twisted solves problems relevant to clients, only about the fact that a lot of people like to use threads.
This discussion has a significant problem, in taking "Twisted" as a monolithic all-or-nothing entity. Restricting the scope to merely the lowest-level socket replacement APIs - i.e., the bare TCP, UDP, etc. functionality - no, Twisted is not very difficult to learn. Expanding the scope to include the higher level functionality, it is much easier to learn than reimplementing line parsing and concurrency and so forth. However, does that really have anything to do with improving the socket howto? The Python documentation can include a clear explanation of what functionality the socket module provides - without forcing you to read Stevens _or_ use Twisted, but it can still refer you to both, since it is very likely that you'll need at least one of them in addition to the socket module. Jean-Paul

On Jun 5, 2011, at 3:35 PM, Martin v. Löwis wrote:
Yes, these are all excellent concepts to be familiar with. But the word "socket" (and the socket HOWTO) refers to a specific way to interface with those concepts, the Berkeley socket API: <http://en.wikipedia.org/wiki/Berkeley_sockets>. Which you don't have to know anything about if you're going to use Twisted. You should know about IPC in general, and TCP/UDP specifically if you're going to use Twisted, but sockets are completely optional. Also, I feel that I should point out that the sockets HOWTO does not cover even a single one of these concepts in any useful depth. If you think that these are what it should be explaining, it needs some heavy editing. Here's what it has to say about each one:
what is a TCP connection?
The only place that the characters "TCP" appear in the entire document is in the phrase "... which is completely different from TCP_NODELAY ...". Nowhere is a TCP connection explained at a conceptual level, except to say that it's something a web browser does.
how is UDP different from TCP?
The phrase "UDP" never appears in the HOWTO. DGRAM sockets get a brief mention as "anything else" in the sentence: "... you’ll get better behavior and performance from a STREAM socket than anything else ...". (To be fair, I do endorse teaching that "the difference between TCP and UDP is that you should not use UDP" to anyone not sufficiently advanced to read the relevant reference documentation themselves.)
when data arrives, what layers of software does it go through?
There's no discussion of this that I can find at all.
what is a port number?
Aside from a few comments in the code examples, the only discussion of port numbers is "low number ports are usually reserved for “well known” services (HTTP, SNMP etc)." It would be very good to have a "Python networking overview" somewhere that explained this stuff at a very high level, and described how data might get into or out of your program, with links to things like the socket HOWTO that describe more specific techniques. This would be useful because most commonly, I think that data will get into Python network programs via WSGI, not direct sockets or anything like Twisted. To be clear, having read it now: I do _not_ agree with Antoine that this document should be deleted. I dimly recall that it helped me understand some things in the very early days of Twisted. While it's far from perfect, it might help someone in a similar situation understand those things as well today. I just found it interesting that the main concepts one would associate with such a HOWTO are nowhere to be found :). -glyph

<snip>
Just be careful not to reproduce http://www.apress.com/9781590593714 :-) These things tend to get out of hand very quickly. Eli

On Wed, Jun 8, 2011 at 3:37 AM, Eli Bendersky <eliben@gmail.com> wrote:
Just be careful not to reproduce http://www.apress.com/9781590593714 :-) These things tend to get out of hand very quickly.
At the level Glyph and Martin are talking about, you're more likely to end up with http://authors.phptr.com/tanenbaumcn4/ :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Jun 7, 2011 at 10:37 AM, Eli Bendersky <eliben@gmail.com> wrote:
Just be careful not to reproduce http://www.apress.com/9781590593714 :-) These things tend to get out of hand very quickly.
You say that like it's a bad thing. The first few chapters of that would make a great replacement for the howto. Geremy Condra

On Wed, Jun 8, 2011 at 21:07, geremy condra <debatem1@gmail.com> wrote:
Not a bad thing at all, and I'm sorry if I made it sound that way. I just meant that it may turn into a *whole book* if too many details are added. I had no intention to criticize this specific book. Frankly I didn't even read it, I just remembered that a book with this title came out recently. Eli

On Tue, Jun 7, 2011 at 1:54 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
And if UDP starts sounding tempting due to excessively high latency, these days it's worth looking up the specs for the interplanetary internet instead. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (17): "Martin v. Löwis", Antoine Pitrou, C McL, Eli Bendersky, exarkun@twistedmatrix.com, Georg Brandl, geremy condra, Glyph Lefkowitz, Gregory P. Smith, Neil Hodgson, Nick Coghlan, Raymond Hettinger, Ross Lagerwall, Senthil Kumaran, Stephen J. Turnbull, Steven D'Aprano, Łukasz Langa