[Python-Dev] Asking for feedback about fixing `ftplib' (issue25458)
Ivan Pozdeev
vano at mail.mipt.ru
Wed Dec 28 09:57:12 EST 2016
Re: [Python-Dev] Asking for feedback about fixing `ftplib' (issue25458)
On 25.12.2016 17:21, Giampaolo Rodola' wrote:
> From what I understand the problem should be fixed in urllib so that
> it always closes the FTP connection. I understand this is what happens
> in recent 3.x versions but not on 2.7.
urllib in 2.7 should indeed be fixed to close FTP connections rather
than leave them dangling, no question about that.
(ftpcache probably has to go in all branches as a result since it's now
useless - but that's another issue entirely)
The question is, in both 2.x and 3.x, urllib currently doesn't close the
control connection gracefully. It just closes the socket immediately
after reading all data on the data connection - without issuing QUIT or
even reading the end-of-transfer response.
By doing this, it gets away with not calling voidresp().
(RFC 939 specifies that the server should execute the equivalent of ABOR
and QUIT upon unexpected close of control connection.)
> As for ftplib, I don't like the solution of using a
> `transfer_in_progress` flag in all commands. I see that as a cheap way
> to work around the fact that the user may forget to call voidresp()
> and close the socket after nt/transfercmd(). It probably makes sense
> to just update nt/transfercmd() doc instead and make that clear.
I tried to fix urllib so that it's guaranteed to call voidresp(), and
failed. Too complicated, effectively reimplementing a sizable part of
FTP client logic is required, perhaps even monkey-patching ftplib to be
able to set flags in the right places. That's why I thought that just
requiring the user to call it themselves and calling it a day would be
asking too much from users and the easy way out - ftplib currently dumps
too much work on its users that it ought to do itself instead.
The fact that in FTP, one cannot send another command before the
previous one is complete (or rather, one can, but the server won't
respond to it until then) means that FTP is inherently stateful. So,
ftplib, to be a usable library, needs to either encapsulate this
statefulness, or provide building blocks with all the stock logic parts
so that only truly application-specific logic has to be added to make a
robust solution.
With simple commands (|voidcmd|) and self-contained transfer commands
(retrlines/retrbinary), ftplib does incapsulate statefulness, by
handling them atomically. But with transfercmd(), it returns control to
the user halfway through.
At this point, if ftplib doesn't itself keep track that the control
connection is currently in invalid state for the next command, the user
is forced to do that themselves instead. And guess what - urllib has to
use ftplib through a wrapper class that does exactly that. That's
definitely a "stock logic part" 'cuz it's an integral part of FTP rather
than something specific to user logic.
The other "stock logic part" currently missing is error handling during
transfer. If the error's cause is local (like a local exception that
results in closing the data socket prematurely, or the user closing it
by hand, or an ABOR they sent midway), the error code from the
end-of-transfer response should be ignored, otherwise, it shouldn't.
Nothing currently provides this logic. ftplib.retr*() never ignore the
code - because they never abort the transfer - but they don't handle
exceptions, either, so they wouldn't read the response if one is raised
on data socket read or in the callback. urllib used to handle the
response in an overridden socket close handler, and it was forced to
always ignore the code because it's impossible to check from there if
there was an exception or if the data socket was closed prematurely.
--
Other than making ftplib keep track of the session's internal state
while the user has control and providing the custom error handling logic
to them, another solution is to never release control to the user
midway, i.e. ban transfercmd() entirely and only provide self-contained
retr*()/dir()/whatever, but let the callback signal them to abort the
transfer. That way, the state would be kept implicitly - in the
execution pointer.
> --
> Giampaolo - http://grodola.blogspot.com
>
--
Regards,
Ivan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20161228/318b7763/attachment.html>
More information about the Python-Dev
mailing list