How do you FTP ASCII if ftplib strips CR?

David Bolen db3l at fitlinxx.com
Tue Jun 13 15:53:48 EDT 2000


al at servana.com writes:

> If you look at the FTPLIB docs, it states that RETRLINES will strip 
> CR when passing the text to the RETRLINES callback.

Actually, it strips the end of line entirely (which is CRLF in ASCII
FTP transfers).  Presumably that's to simplify the processing of the
lines, since you already know that each entry in the list (or each
string supplied to the callback) is a "line" from the transfer.

Within the FTP stream, every line of an ASCII transfer will always end
with CRLF (by specification - see below).

> So, how do you put the CR back in if you want them?

You could just add it whenever you need.  Note that if you want to
store the ASCII file locally using the proper line ending for your
platform, the most portable approach is probably to use os.linesep as
the separator (although I think using "\n" in a string and letting the
underlying C library perform the platform translation would probably
work just about as portably).
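
As a rough illustration of those two options, here is a sketch for current Python 3, where a string has to be encoded before it goes into a binary file. The helper names, the sample lines and the ASCII encoding are assumptions made for this example, not anything ftplib provides:

```python
import os
import tempfile

def write_with_linesep(path, lines):
    # Binary mode: nothing is translated, so add the platform ending ourselves.
    with open(path, "wb") as f:
        for line in lines:
            f.write((line + os.linesep).encode("ascii"))

def write_with_newline(path, lines):
    # Text mode: "\n" is translated to the platform ending on the way out.
    with open(path, "w", encoding="ascii") as f:
        for line in lines:
            f.write(line + "\n")

lines = ["first line", "second line"]
tmpdir = tempfile.mkdtemp()
path_a = os.path.join(tmpdir, "a.txt")
path_b = os.path.join(tmpdir, "b.txt")
write_with_linesep(path_a, lines)
write_with_newline(path_b, lines)

# Compare the raw bytes that ended up on disk.
with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
    same = fa.read() == fb.read()
```

With Python's default newline handling, both files should end up with identical bytes on whatever platform this runs on.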

> When you FTP ascii from a non-similar platform, the loss of the CR 
> causes the text to FTP as 'one line'. (like UNIX-to-UNIX and NT-to-NT 
> work, UNIX-to-NT does not)

This has to depend on what you are doing with the retrieved lines.
The retrlines() method hands each line to your callback as a separate
string, so they can't be considered 'one line' unless you dump them
as-is, with nothing between them, to some file or location that
expects line endings.

(I wrote this before getting to the "write" call you use below, which
is why the lines land in your file back to back with no line endings)
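
To make the 'one line' effect concrete, here is a small sketch that needs no FTP server. The list below is only a stand-in for the strings retrlines() would pass to its callback, and io.StringIO stands in for the output file:

```python
import io

# Stand-in for what retrlines() hands to its callback: lines without endings.
received = ["alpha", "beta", "gamma"]

raw = io.StringIO()
for line in received:
    raw.write(line)            # what passing file.write as the callback does

fixed = io.StringIO()
for line in received:
    fixed.write(line + "\n")   # put a line ending back after each line

print(repr(raw.getvalue()))    # 'alphabetagamma'
print(repr(fixed.getvalue()))  # 'alpha\nbeta\ngamma\n'
```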

> Yet ALL FTP tools do not do this, so only the Python scripts tend to 
> munge ASCII files.

Well, in reality, all FTP tools do perform some end-of-line processing
for ASCII transfers, although you normally don't see it because it
happens beneath the covers.

By definition, ASCII transfer in FTP is performed using a particular
NVT-ASCII representation (defined in the Telnet specification).  The
FTP RFC (RFC959) actually requires that all end of lines within an
ASCII transfer must be represented by CRLF.  That means that if you
are retrieving a file in ASCII mode from a Unix platform, the FTP
server on that platform actually _adds_ a CR to the ending of every
line before transferring it over the network.

Of course with the other tools you mention, you are probably just
retrieving a file to a local copy, so all you see is the final local
copy written to disk, at which point the FTP tool, knowing it's an
ASCII transfer, will have already applied the right end of line
meaning for the local platform.  But any FTP tool has to understand
how to potentially translate a local line ending format to the network
format required for the transfer.
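
The translation in both directions can be sketched roughly as follows. This is only an illustration of the idea, not how any particular FTP client or server is written; the function names and the local_eol parameter are made up for this example, and it ignores the finer points of the specification:

```python
import os

def to_network(local_bytes, local_eol=os.linesep.encode("ascii")):
    # Sender side: replace each local line ending with CRLF.
    return b"\r\n".join(local_bytes.split(local_eol))

def from_network(wire_bytes, local_eol=os.linesep.encode("ascii")):
    # Receiver side: replace each CRLF with the local line ending.
    return local_eol.join(wire_bytes.split(b"\r\n"))

unix_file = b"one\ntwo\n"
wire = to_network(unix_file, local_eol=b"\n")        # what travels over the network
on_windows = from_network(wire, local_eol=b"\r\n")   # CRLF is already the local form
on_unix = from_network(wire, local_eol=b"\n")        # the added CRs are dropped again
```

So a Unix-to-Unix transfer still passes through CRLF on the wire, even though neither end stores the file that way.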

> I was using the following notation, but cannot come up with another 
> where I can control the append of a CR:
> 
> ourFTP.retrlines('RETR ' + filename, open(os.path.join(localdir, 
> filename), 'w').write)

Your problem is using the "write" method on a file, which just writes
a buffer of data without any consideration for text "lines".  As an
aside, note that by doing the open inside the retrlines() call you
lose access to the file object, and thus can't close() it.  It's
possible that some data may not get flushed to the file until your
process exits.  I'd probably open the file separately and close it
after the retrieval.

What I would suggest instead is defining a small class to handle
access to your output file that takes each line string and writes it
to your file as a textual line, including line ending - something
like:

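    import os

    class AsciiFile:

        def __init__(self,name=None,mode='w'):
            self.file = None
            if name:
                self.open(name,mode)

        def open(self,name,mode):
            # Always binary, so nothing translates the line endings we add
            self.file = open(name,mode+'b')

        def writeline(self,line):
            # retrlines() supplies the line without its CRLF
            data = line + os.linesep
            if not isinstance(data,bytes):
                # Python 3 passes str; a binary file needs bytes
                # (UTF-8 assumed here; match it to your FTP object's encoding)
                data = data.encode('utf-8')
            self.file.write(data)

        def close(self):
            self.file.close()
            self.file = None

and then code like:

    output = AsciiFile(os.path.join(localdir,filename),'w')
    ourFTP.retrlines('RETR ' + filename,output.writeline)
    output.close()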

should do what you want.  Note that I've chosen to have the class
actually process the file in binary and add the full line separator.
This should be portable to all platforms.  I believe you could also
just open the file in text mode and always append the string '\n' to
each line and it'll translate to the appropriate text line ending on
each platform, but I'm only positive that's going to work on Unix and
Windows.
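
For comparison, the text-mode alternative might look something like the sketch below. The function name retrieve_text is made up for this example, and the ftp argument is assumed to be an already connected and logged-in ftplib.FTP object (anything with a compatible retrlines() method would do):

```python
import os

def retrieve_text(ftp, filename, localdir):
    # 'ftp' is assumed to be a connected, logged-in ftplib.FTP object.
    path = os.path.join(localdir, filename)
    # Text mode: each "\n" becomes the platform's line ending when written.
    # The with block also closes (and flushes) the file when it is done.
    with open(path, "w") as f:
        ftp.retrlines("RETR " + filename, lambda line: f.write(line + "\n"))
    return path
```

It would be called as retrieve_text(ourFTP, filename, localdir).  Opening the file in a with block also takes care of the close() mentioned earlier.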

Of course, none of this would be necessary if somewhere along the
lines each of the various systems had agreed on just how a textual
line should be terminated, but this is what we're stuck with.

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


