PEP 3156: getting the socket or peer name from the transport
A pragmatic question popped up: sometimes the protocol would like to know the name of the socket or its peer, i.e. call getsockname() or getpeername() on the underlying socket. (I can imagine wanting to log this, or do some kind of IP address blocking.)

What should the interface for this look like? I can think of several ways:

A) An API to return the underlying socket, if there is one. (In the case of a stack of transports and protocols there may not be one, so it may return None.) Downside is that it requires the transport to use sockets -- if it were to use some native Windows API there might not be a socket object even though there might be an IP connection with easily-accessible address and peer.

B) An API to get the address and peer address; e.g. transport.getsockname() and transport.getpeername(). These would call the corresponding call on the underlying socket, if there is one, or return None otherwise; IP transports that don't use sockets would be free to retrieve and return the requested information in a platform-specific way. Note that the address may take different forms; e.g. for AF_UNIX sockets it is a filename, so the protocol must be prepared for different formats.

C) Similar to (A) or (B), but putting the API in an abstract subclass of Transport (e.g. SocketTransport) so that a transport that doesn't have this doesn't need to implement dummy methods returning None -- it is now the protocol's responsibility to check for isinstance(transport, SocketTransport) before calling the method. I'm not so keen on this, Twisted has shown (IMO) that a deep hierarchy of interfaces or ABCs does not necessarily provide clarity.

Discussion?

-- --Guido van Rossum (python.org/~guido)
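For reference, a minimal sketch of what getsockname() and getpeername() return on a plain TCP socket, which is the information options (A) and (B) would expose in different ways. The loopback setup and the helper name `connected_pair` are illustrative assumptions, not part of the proposal.

```python
import socket


def connected_pair():
    """Return (client, server_side) TCP sockets connected over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server_side, _ = listener.accept()
    listener.close()
    return client, server_side


client, server_side = connected_pair()
# For AF_INET both names are (host, port) tuples. For AF_UNIX they would be
# filesystem paths, so a protocol should not assume a single format.
local_name = client.getsockname()
peer_name = client.getpeername()
server_name = server_side.getsockname()
server_peer = server_side.getpeername()
print("local:", local_name, "peer:", peer_name)
client.close()
server_side.close()
```

Each side's peer name is the other side's socket name, which is what a protocol would log or check against a block list.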
On Jan 24, 2013, at 10:23 AM, Guido van Rossum
C) Similar to (A) or (B), but putting the API in an abstract subclass of Transport (e.g. SocketTransport) so that a transport that doesn't have this doesn't need to implement dummy methods returning None -- it is now the protocol's responsibility to check for isinstance(transport, SocketTransport) before calling the method. I'm not so keen on this, Twisted has shown (IMO) that a deep hierarchy of interfaces or ABCs does not necessarily provide clarity.
SocketTransport could be abstract, just like the Transport class, purely for descriptive purposes. Another question: should we expect to be able to use protocols on top of different transports (e.g. HTTPProtocol on top of UnixSubprocessTransport)?
On Thu, Jan 24, 2013 at 10:41 AM, Nikolay Kim
SocketTransport could be abstract, just like the Transport class, purely for descriptive purposes.
Yes, but I'm arguing against this. :-)
Another question: should we expect to be able to use protocols on top of different transports (e.g. HTTPProtocol on top of UnixSubprocessTransport)?
Yes, it should be possible, for example the subprocess might implement some kind of custom tunnel. If in this case there's no way to get the socket or peer name, or if the names aren't very useful, that's okay. -- --Guido van Rossum (python.org/~guido)
On Thu, Jan 24, 2013 at 8:23 PM, Guido van Rossum
A pragmatic question popped up: sometimes the protocol would like to know the name of the socket or its peer, i.e. call getsockname() or getpeername() on the underlying socket. (I can imagine wanting to log this, or do some kind of IP address blocking.)
What should the interface for this look like? I can think of several ways:
A) An API to return the underlying socket, if there is one. (In the case of a stack of transports and protocols there may not be one, so it may return None.) Downside is that it requires the transport to use sockets -- if it were to use some native Windows API there might not be a socket object even though there might be an IP connection with easily-accessible address and peer.
I feel (A) is the best option as it's the most flexible - underlying transports can have many different special methods. No? Yuval Greenfield
On Thu, Jan 24, 2013 at 10:45 AM, Yuval Greenfield
I feel (A) is the best option as it's the most flexible - underlying transports can have many different special methods. No?
The whole idea of defining a transport API is that the protocol shouldn't care about what type of transport it is being used with. The example of using an http client protocol with a subprocess transport that invokes some kind of tunneling process might clarify this. So I would like the transport API to be both small and fixed, rather than having different transports have different extensions to the standard transport API. What other things might you want to do with the socket besides calling getpeername() or getsockname()? Would that be reasonable to expect from a protocol written to be independent of the specific transport type? -- --Guido van Rossum (python.org/~guido)
On Jan 24, 2013, at 10:50 AM, Guido van Rossum
The whole idea of defining a transport API is that the protocol shouldn't care about what type of transport it is being used with. The example of using an http client protocol with a subprocess transport that invokes some kind of tunneling process might clarify this. So I would like the transport API to be both small and fixed, rather than having different transports have different extensions to the standard transport API.
What other things might you want to do with the socket besides calling getpeername() or getsockname()? Would that be reasonable to expect from a protocol written to be independent of the specific transport type?
The transport could have a dictionary attribute where it can store optional information such as the socket name, peer name, or file path.
On Thu, Jan 24, 2013 at 11:05 AM, Nikolay Kim
The transport could have a dictionary attribute where it can store optional information such as the socket name, peer name, or file path.
Aha, that makes some sense. Though maybe it shouldn't be a dict -- it may be expensive to populate some values in some cases, so maybe there should just be a method transport.get_extra_info('key') which computes and returns (and possibly caches) certain values but returns None if the info is not supported. E.g. get_extra_info('name'), get_extra_info('peer'). This API makes it pretty clear that the caller should check the value for None before using it. -- --Guido van Rossum (python.org/~guido)
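A minimal sketch of the get_extra_info('key') idea described above: values are computed on first request, cached, and None comes back for unsupported keys. The class name `SketchTransport`, the provider mapping, and the example address are assumptions made for illustration; only the method name comes from the message.

```python
class SketchTransport:
    """Hypothetical transport; only get_extra_info() is sketched here."""

    def __init__(self, providers):
        # Map each supported key to a zero-argument function computing it.
        self._providers = dict(providers)
        self._cache = {}

    def get_extra_info(self, key):
        """Return the value for key, or None if the key is not supported."""
        if key not in self._cache:
            provider = self._providers.get(key)
            if provider is None:
                return None
            self._cache[key] = provider()  # computed on first request only
        return self._cache[key]


calls = []


def lookup_peer():
    """Stand-in for a possibly expensive lookup such as getpeername()."""
    calls.append("peer")
    return ("192.0.2.10", 54321)  # made-up documentation address


transport = SketchTransport({"peer": lookup_peer})
peer = transport.get_extra_info("peer")
peer_again = transport.get_extra_info("peer")  # served from the cache
missing = transport.get_extra_info("socket")   # unsupported key
```

The caller still has to check for None before using the result, as the message points out.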
On Fri, Jan 25, 2013 at 5:12 AM, Guido van Rossum
Aha, that makes some sense. Though maybe it shouldn't be a dict -- it may be expensive to populate some values in some cases, so maybe there should just be a method transport.get_extra_info('key') which computes and returns (and possibly caches) certain values but returns None if the info is not supported. E.g. get_extra_info('name'), get_extra_info('peer'). This API makes it pretty clear that the caller should check the value for None before using it.
A "get_extra_info" API like that is also amenable to providing an explicit default for the "key not present" case, and makes it clearer that the calculations involved may not be cheap. You could even go so far as to have it return a Future, allowing it to be used for info that requires network activity. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Jan 24, 2013 at 12:51 PM, Nick Coghlan
A "get_extra_info" API like that is also amenable to providing an explicit default for the "key not present" case, and makes it clearer that the calculations involved may not be cheap.
Yeah, the signature could be get_extra_info(key, default=None).
You could even go so far as to have it return a Future, allowing it to be used for info that requires network activity.
I think that goes too far. It doesn't look like getpeername() goes out to the network -- what other use case did you have in mind? (I suppose it could use a Future for some keys only -- but then the caller would still need to be aware that it could return None instead of a Future, so it would be somewhat awkward to use -- you couldn't write

    remote_user = yield from self.transport.get_extra_info("remote_user")

you'd have to write

    f = self.transport.get_extra_info("remote_user")
    remote_user = (yield from f) if f else None

instead.)
-- --Guido van Rossum (python.org/~guido)
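The awkwardness described above can be sketched as follows, assuming a hypothetical transport that returns a Future for the "remote_user" key when it is supported and None otherwise. The example uses modern async/await in place of the thread's yield from; `FutureInfoTransport` and `read_remote_user` are made-up names.

```python
import asyncio


class FutureInfoTransport:
    """Hypothetical transport where one key resolves asynchronously."""

    def __init__(self, loop, remote_user=None):
        self._loop = loop
        self._remote_user = remote_user

    def get_extra_info(self, key, default=None):
        if key == "remote_user" and self._remote_user is not None:
            fut = self._loop.create_future()
            fut.set_result(self._remote_user)  # resolved at once for the demo
            return fut
        return default


async def read_remote_user(transport):
    f = transport.get_extra_info("remote_user")
    # The caller must handle both shapes: a Future, or None for "unsupported".
    return (await f) if f is not None else None


async def demo():
    loop = asyncio.get_running_loop()
    with_user = await read_remote_user(FutureInfoTransport(loop, "alice"))
    without_user = await read_remote_user(FutureInfoTransport(loop))
    return with_user, without_user


future_results = asyncio.run(demo())
```

Every call site needs the two-step "get, then maybe wait" pattern, which is the cost being weighed against a plain synchronous return value.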
On Fri, Jan 25, 2013 at 7:50 AM, Guido van Rossum
I think that goes too far. It doesn't look like getpeername() goes out to the network -- what other use case did you have in mind?
I don't have one, so YAGNI sounds like a good answer to me. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 24 Jan, 2013, at 22:50, Guido van Rossum
I think that goes too far. It doesn't look like getpeername() goes out to the network -- what other use case did you have in mind? (I suppose it could use a Future for some keys only -- but then the caller would still need to be aware that it could return None instead of a Future, so it would be somewhat awkward to use -- you couldn't write
A transport that tunnels traffic over a SOCKS or SSH tunnel might require network access to get the sockname or peername of the proxied connection. I don't know enough about either protocol to know for sure, and the information could also be fetched during connection setup and then cached. Ronald
On Fri, Jan 25, 2013 at 3:24 AM, Ronald Oussoren
A transport that tunnels traffic over a SOCKS or SSH tunnel might require network access to get the sockname or peername of the proxied connection. I don't know enough about either protocol to know for sure, and the information could also be fetched during connection setup and then cached.
Sounds good (to fetch it proactively ahead of time, rather than inject a Future into the API). -- --Guido van Rossum (python.org/~guido)
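One way to read "fetch it proactively" is sketched below: any network round trip happens once during connection setup, and get_extra_info() stays a plain synchronous lookup. Everything here is hypothetical (`query_tunnel_peer`, `TunnelTransport`, `open_tunnel`, and the address); it is not how any real SOCKS or SSH library behaves.

```python
import asyncio


async def query_tunnel_peer():
    """Stand-in for asking the proxy about the proxied connection."""
    await asyncio.sleep(0)  # pretend to wait for the proxy's answer
    return ("198.51.100.7", 443)  # made-up documentation address


class TunnelTransport:
    """Hypothetical tunnelling transport with pre-fetched extra info."""

    def __init__(self, extra):
        self._extra = dict(extra)

    def get_extra_info(self, key, default=None):
        # Plain dictionary lookup: nothing here touches the network.
        return self._extra.get(key, default)


async def open_tunnel():
    peer = await query_tunnel_peer()  # done once, during connection setup
    return TunnelTransport({"peer": peer})


tunnel = asyncio.run(open_tunnel())
tunnel_peer = tunnel.get_extra_info("peer")
tunnel_name = tunnel.get_extra_info("name", default="<unknown>")
```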
I think Transport needs 'sendfile' api, something like:

    @tasks.coroutine
    def sendfile(self, fd, offset, nbytes):
        ….

otherwise it is impossible to implement sendfile without breaking transport encapsulation.
On Fri, Jan 25, 2013 at 10:03 AM, Nikolay Kim
I think Transport needs 'sendfile' api, something like:
    @tasks.coroutine
    def sendfile(self, fd, offset, nbytes):
        ….
otherwise it is impossible to implement sendfile without breaking transport encapsulation
Really? Can't the user write this themselves? What's wrong with this:

    while True:
        data = os.read(fd, 16*1024)
        if not data:
            break
        transport.write(data)

(Perhaps augmented with a way to respond to pause() requests.)
-- --Guido van Rossum (python.org/~guido)
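A runnable version of the loop above, as a sketch. `RecordingTransport` is a made-up stand-in that only records what write() receives; the sketch does not attempt the pause() handling mentioned in the message.

```python
import os
import tempfile


class RecordingTransport:
    """Hypothetical stand-in that records every chunk passed to write()."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


def copy_fd_to_transport(fd, transport, chunk_size=16 * 1024):
    """Read fd until EOF and hand each chunk to transport.write()."""
    while True:
        data = os.read(fd, chunk_size)
        if not data:
            break
        transport.write(data)


payload = b"x" * 40000
recorder = RecordingTransport()
with tempfile.TemporaryFile() as tmp:
    tmp.write(payload)
    tmp.flush()
    os.lseek(tmp.fileno(), 0, os.SEEK_SET)  # rewind before reading back
    copy_fd_to_transport(tmp.fileno(), recorder)
```

Every byte passes through a Python bytes object here, which is the copy that os.sendfile() is meant to avoid.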
On Jan 25, 2013, at 10:08 AM, Guido van Rossum
Really? Can't the user write this themselves? What's wrong with this:
    while True:
        data = os.read(fd, 16*1024)
        if not data:
            break
        transport.write(data)
(Perhaps augmented with a way to respond to pause() requests.)
I mean os.sendfile(), the zero-copy sendfile.
On Fri, Jan 25, 2013 at 10:11 AM, Nikolay Kim
I mean os.sendfile(), the zero-copy sendfile.
I see (http://docs.python.org/dev/library/os.html#os.sendfile). Hm, that function is so platform-specific that we might as well force users to do it this way:

    sock = transport.get_extra_info("socket")
    if sock is not None:
        os.sendfile(sock.fileno(), ......)
    else:
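A hedged sketch of how user code might follow that pattern. The "socket" key comes from the message above; the helper name `send_file`, the read-and-write fallback, and `NoSocketTransport` are assumptions added for illustration. The socket branch assumes a blocking socket with nothing still buffered in the transport, which a real event-loop transport would not guarantee.

```python
import os
import tempfile


def send_file(transport, in_fd, offset, count, chunk_size=16 * 1024):
    """Send count bytes of in_fd starting at offset; return bytes sent."""
    sock = transport.get_extra_info("socket")
    sent = 0
    if sock is not None and hasattr(os, "sendfile"):
        # Zero-copy path: the kernel moves the data to the socket directly.
        while sent < count:
            n = os.sendfile(sock.fileno(), in_fd, offset + sent, count - sent)
            if n == 0:  # reached end of file early
                break
            sent += n
        return sent
    # Assumed fallback: copy through user space via transport.write().
    os.lseek(in_fd, offset, os.SEEK_SET)
    while sent < count:
        data = os.read(in_fd, min(chunk_size, count - sent))
        if not data:
            break
        transport.write(data)
        sent += len(data)
    return sent


class NoSocketTransport:
    """Hypothetical transport with no underlying socket (e.g. a tunnel)."""

    def __init__(self):
        self.chunks = []

    def get_extra_info(self, key, default=None):
        return default

    def write(self, data):
        self.chunks.append(data)


fallback = NoSocketTransport()
with tempfile.TemporaryFile() as tmp:
    tmp.write(b"0123456789")
    tmp.flush()
    sent_bytes = send_file(fallback, tmp.fileno(), 2, 5)
```

Only the fallback branch is exercised here, since the socket branch depends on platform support for os.sendfile().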
On Jan 25, 2013, at 12:04 PM, Guido van Rossum
I see (http://docs.python.org/dev/library/os.html#os.sendfile).
Hm, that function is so platform-specific that we might as well force users to do it this way:
    sock = transport.get_extra_info("socket")
    if sock is not None:
        os.sendfile(sock.fileno(), ......)
    else:
there should be some kind of way to flush the write buffer or write callbacks.

    sock = transport.get_extra_info("socket")
    if sock is not None:
        os.sendfile(sock.fileno(), ......)
    else:
        yield from transport.write_buffer_flush()
On Fri, Jan 25, 2013 at 12:25 PM, Nikolay Kim
there should be some kind of way to flush the write buffer or write callbacks.

    sock = transport.get_extra_info("socket")
    if sock is not None:
        os.sendfile(sock.fileno(), ......)
    else:
        yield from transport.write_buffer_flush()
Oh, that's an interesting idea in its own right. But I'm not sure Twisted could implement this given that their flow control works differently. However, I think you've convinced me that offering sendfile() is actually better. But should it take a file descriptor or a stream (file) object? -- --Guido van Rossum (python.org/~guido)
In principle os.sendfile() is not too different than socket.send():
they share the same return value (no. of bytes sent) and errors, hence
it's pretty straightforward to implement (the user could even just
override Transport.write() him/herself).
Nonetheless there are other subtle differences (e.g. it works with
regular (mmap-like) files only) so that deciding whether to use send()
or sendfile() behind the curtains is not a good idea.
Transport class should probably provide a separate method (other than write()).
Also, I think that *at this point* thinking about adding sendfile()
into Tulip is probably premature.
--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
Sorry for the delay, it took me a while to read
http://code.google.com/p/tulip/source/browse/ and wrap my head around it.
On Thu, Jan 24, 2013 at 8:50 PM, Guido van Rossum
What other things might you want to do with the socket besides calling getpeername() or getsockname()?
From http://en.wikipedia.org/wiki/Berkeley_sockets#Options_for_sockets
Options for sockets
After creating a socket, it is possible to set options on it. Some of the more common options are:
TCP_NODELAY disables the Nagle algorithm. SO_KEEPALIVE enables periodic 'liveness' pings, if supported by the OS.
Though these may not be the concern of a protocol as defined by PEP 3156.
Would that be reasonable to expect from a protocol written to be independent of the specific transport type?
Most protocols should be written independent of transport. But it seems to me that a user might write an entire app as a "protocol". Yuval Greenfield
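For concreteness, a small sketch of setting the two options quoted above on a raw socket with the standard socket module. This is what code holding the socket object, as in option (A), could do; how or whether a transport should expose it is the open question in this thread.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable the Nagle algorithm so small writes are sent without delay.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
# Ask the OS to send periodic keepalive probes on an idle connection.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# getsockopt() returns a nonzero integer when a boolean option is enabled.
nodelay_on = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```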
On Sun, 27 Jan 2013 10:16:14 +0200
Yuval Greenfield
From http://en.wikipedia.org/wiki/Berkeley_sockets#Options_for_sockets
Options for sockets
After creating a socket, it is possible to set options on it. Some of the more common options are:
TCP_NODELAY disables the Nagle algorithm. SO_KEEPALIVE enables periodic 'liveness' pings, if supported by the OS.
Though these may not be the concern of a protocol as defined by PEP 3156.
How about e.g. TCP_CORK?
Would that be reasonable to expect from a protocol written to be independent of the specific transport type?
Most protocols should be written independent of transport. But it seems to me that a user might write an entire app as a "protocol".
Well, such an assumption can fall flat. For example, certificate checking in HTTPS expects that the transport is some version of TLS or SSL: http://tools.ietf.org/html/rfc2818.html#section-3.1 Regards Antoine.
On Tornado we basically do A (the IOStream's socket attribute was never
really documented for public consumption but has become the de facto
standard way to get this kind of information). As food for thought,
consider extending this to include not just peer address but also SSL
certificates. Tornado's SSL support uses the stdlib's ssl.SSLSocket, so
the certificate is available from the socket object, but Twisted (I
believe) uses pycrypto and things work differently there. To expose SSL
certificates (and NPN, and other information that may or may not be there
depending on SSL implementation) across both tornado- and twisted-based
transports you'd need something like B or C.
-Ben
On Thu, Jan 24, 2013 at 11:14 AM, Ben Darnell
On Tornado we basically do A (the IOStream's socket attribute was never really documented for public consumption but has become the de facto standard way to get this kind of information). As food for thought, consider extending this to include not just peer address but also SSL certificates. Tornado's SSL support uses the stdlib's ssl.SSLSocket, so the certificate is available from the socket object, but Twisted (I believe) uses pycrypto and things work differently there. To expose SSL certificates (and NPN, and other information that may or may not be there depending on SSL implementation) across both tornado- and twisted-based transports you'd need something like B or C.
Excellent points all. I'll mull this over -- it's unfortunate that (A) is so easy to do and handles future needs as well, but may shut out alternate transport implementations...
-- --Guido van Rossum (python.org/~guido)
Starting to seem like the transport could almost be an entry in the dictionary rather than owning it, kind of like environ['input'] in the WSGI spec. Not that I'm necessarily recommending this, but it seems like the details may outlive the transports, could potentially include information the transport itself considered input, and may be a useful place to store details, such as SSL details, that might be shared. A lot of these details could be initialized when the transport was created, and many would be based on whatever spawned it. For example, a transport spawned by an HTTPS server that accepted an incoming connection would inherit the SSL configuration, etc.
Shane Green
www.umbrellacode.com
805-452-9666 | shane@umbrellacode.com
On Jan 24, 2013, at 11:14 AM, Ben Darnell
In Tornado we basically do A (the IOStream's socket attribute was never really documented for public consumption but has become the de facto standard way to get this kind of information). As food for thought, consider extending this to include not just the peer address but also SSL certificates. Tornado's SSL support uses the stdlib's ssl.SSLSocket, so the certificate is available from the socket object, but Twisted (I believe) uses pyOpenSSL and things work differently there. To expose SSL certificates (and NPN, and other information that may or may not be there depending on the SSL implementation) across both Tornado- and Twisted-based transports you'd need something like B or C.
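As a rough sketch of what option A looks like from the caller's side, assuming the caller has already obtained the underlying socket object. The helper name `peer_details` is hypothetical; `getpeername()` and `ssl.SSLSocket.getpeercert()` are stdlib calls.

```python
import socket
import ssl
from typing import Any, Dict


def peer_details(sock: socket.socket) -> Dict[str, Any]:
    """Collect peer information from a raw socket (option A style)."""
    details: Dict[str, Any] = {"peername": sock.getpeername()}
    # Only the stdlib ssl wrapper is recognised here; a transport built on
    # a different TLS library would need its own code path, which is the
    # portability problem described above.
    if isinstance(sock, ssl.SSLSocket):
        # getpeercert() raises ValueError if the handshake is not done yet.
        details["peercert"] = sock.getpeercert()
    return details
```

The isinstance check is exactly the kind of backend-specific knowledge that B or C would move out of the caller and into the transport.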
-Ben
On Thu, 24 Jan 2013 10:23:40 -0800, Guido van Rossum wrote:
A pragmatic question popped up: sometimes the protocol would like to know the name of the socket or its peer, i.e. call getsockname() or getpeername() on the underlying socket. (I can imagine wanting to log this, or do some kind of IP address blocking.)
What should the interface for this look like? I can think of several ways:
A) An API to return the underlying socket, if there is one. (In the case of a stack of transports and protocols there may not be one, so it may return None.) Downside is that it requires the transport to use sockets -- if it were to use some native Windows API there might not be a socket object even though there might be an IP connection with easily-accessible address and peer.
I don't understand why you say Windows doesn't use sockets for IP connections. AFAIK, sockets are the *only* way to do networking with the Windows API. See e.g. WSARecv, which supports synchronous and asynchronous operation: http://msdn.microsoft.com/en-us/library/windows/desktop/ms741688%28v=vs.85%2...
(I also suppose you meant "TCP connection", not "IP connection" ;-))
That said, the problem with returning a socket is that it's quite low-level, and might return sockets with different characteristics depending on the backend. So, while it can be there, I think the preferred APIs for most uses should be B or C.
C) Similar to (A) or (B), but putting the API in an abstract subclass of Transport (e.g. SocketTransport) so that a transport that doesn't have this doesn't need to implement dummy methods returning None -- it is now the protocol's responsibility to check for isinstance(transport, SocketTransport) before calling the method. I'm not so keen on this, Twisted has shown (IMO) that a deep hierarchy of interfaces or ABCs does not necessarily provide clarity.
IMO, Twisted mostly shows that zope.interface doesn't combine very well with automated doc generators such as epydoc (you have to look up the interface every time you want the documentation of one of the concrete classes).
And as Ben says, I don't think you want to enumerate all possible introspection APIs (such as the various pieces of SSL-related information) on the base Transport class.
Regards
Antoine.
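For comparison, option C could look roughly like this. It is a sketch under stated assumptions: every class and function name here is hypothetical, and the concrete transports only record addresses instead of wrapping real sockets.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional


class Transport(ABC):
    """Hypothetical base class: no introspection methods at all."""

    @abstractmethod
    def write(self, data: bytes) -> None: ...


class SocketTransport(Transport):
    """Hypothetical abstract subclass carrying the address API (option C)."""

    @abstractmethod
    def getsockname(self) -> Any: ...

    @abstractmethod
    def getpeername(self) -> Any: ...


class BufferTransport(Transport):
    """Concrete transport with no addresses and no dummy methods."""

    def __init__(self) -> None:
        self.chunks: List[bytes] = []

    def write(self, data: bytes) -> None:
        self.chunks.append(data)


class RecordedAddressTransport(SocketTransport):
    """Concrete transport that simply stores the addresses it was given."""

    def __init__(self, sockname: Any, peername: Any) -> None:
        self._sockname = sockname
        self._peername = peername
        self.chunks: List[bytes] = []

    def write(self, data: bytes) -> None:
        self.chunks.append(data)

    def getsockname(self) -> Any:
        return self._sockname

    def getpeername(self) -> Any:
        return self._peername


def peer_for_logging(transport: Transport) -> Optional[Any]:
    # Under option C the protocol is responsible for this isinstance check.
    if isinstance(transport, SocketTransport):
        return transport.getpeername()
    return None
```

SSL-related accessors could live in a further subclass in the same way, which keeps the base Transport small at the cost of more isinstance checks in protocols.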
Antoine Pitrou wrote:
On Thu, 24 Jan 2013 10:23:40 -0800, Guido van Rossum wrote:
A) An API to return the underlying socket, if there is one. (In the case of a stack of transports and protocols there may not be one, so it may return None.) Downside is that it requires the transport to use sockets -- if it were to use some native Windows API there might not be a socket object even though there might be an IP connection with easily-accessible address and peer.
I don't understand why you say Windows doesn't use sockets for IP connections. AFAIK, sockets are the *only* way to do networking with the Windows API. See e.g. WSARecv, which supports synchronous and asynchronous operation: http://msdn.microsoft.com/en-us/library/windows/desktop/ms741688%28v=vs.85%29.aspx
There's also a whole selection of "Internet" APIs that could be used (http://msdn.microsoft.com/en-us/library/hh309468.aspx), and plenty (probably too many) other high-level APIs. There's no expectation that every application has to deal solely in sockets.
Cheers,
Steve
participants (10)
- Antoine Pitrou
- Ben Darnell
- Giampaolo Rodolà
- Guido van Rossum
- Nick Coghlan
- Nikolay Kim
- Ronald Oussoren
- Shane Green
- Steve Dower
- Yuval Greenfield