[Patches] [ python-Patches-497736 ] smtplib.py SMTP EHLO/HELO correct

noreply@sourceforge.net noreply@sourceforge.net
Mon, 25 Mar 2002 09:41:45 -0800


Patches item #497736, was opened at 2001-12-29 20:20
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=497736&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Eduardo Pérez (eperez)
Assigned to: Neil Schemenauer (nascheme)
Summary: smtplib.py SMTP EHLO/HELO correct

Initial Comment:
If the machine from which you are sending mail doesn't have
a FQDN and the mail server requires a FQDN in HELO, the
current code will fail.

Resolving the name is a very bad idea:
- It's something from another layer (DNS/IP), not from SMTP.
- It breaks when the name of the computer is not a FQDN
(as with many dial-ins) and the SMTP server does strict
EHLO/HELO checking as stated before.
- It breaks computers with a TCP tunnel to another host
from which the connection originates, if the relay does
strict EHLO/HELO checking.
- It breaks computers using NAT: the host that the server
sees is not the one that sends the message, if the
relay does strict EHLO/HELO checking.
- It can be considered spyware, as you are sending
information some companies or people don't want to reveal:
the internal structure of the network.

No important mail client resolves the name. Look at
Netscape Messenger or KMail. In fact KMail and Perl's
Net::SMTP do exactly what my patch does.

Please don't resolve the names, as this approach works
and the most widely used email clients do this.

I am attaching the bugfix.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-03-25 12:41

Message:
Logged In: YES 
user_id=6380

OK. So is socket.gethostname() better than socket.getfqdn()
or not?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-25 12:31

Message:
Logged In: YES 
user_id=35752

So much discussion for such a little issue. :-)

A misconfigured server must be part of your scenario.  It's
the only case where the hostname makes any difference.  Using
localhost.localdomain will work fine on 99.99% of mail
servers.  For the remaining 0.01%, using socket.getfqdn() has
a higher chance of working than using localhost.localdomain.
If socket.getfqdn() can find a hostname that resolves
back to the IP of the client side of the connection then
it works.  Using localhost.localdomain in that case will
not work.

If socket.getfqdn() cannot find the FQDN (due to NAT,
tunnelling or whatever) things work just as well as if
localhost.localdomain was used as a default.  Changing the
default to localhost.localdomain fixes nothing!

getfqdn() is a hack because it relies on DNS. People
always screw that up. :-)

Regarding your suggested API change, I don't see how it
would help.  I doubt any code actually passes
socket.getfqdn() to SMTP.helo().

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-03-25 12:16

Message:
Logged In: YES 
user_id=6380

Neil: coping with a misconfigured server wasn't part of my
scenario; only coping with a client that simply doesn't have
a fqdn was. Some questions remain: (1) why can't we use
localhost.localdomain today? (2) Why is getfqdn() a hack?
(Apart from it being in the wrong module.)

Hm, I just thought of something. Why shouldn't gethostname()
be used as the default? Why bother with getfqdn() at all? At
least when gethostname() returns something inappropriate for
a particular server, it can be fixed locally by root by
fixing the hostname. (This may explain why you think
getfqdn() is a hack.)

Barry: an appropriate API could be to change the default for
local_hostname in __init__ to "localhost.localdomain" but to
leave the code that sticks in socket.getfqdn() (or maybe
just socket.gethostname()) if the value is explicitly given
as None or empty.
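
A minimal sketch of the API suggested above, written as a standalone
function rather than as a change to smtplib itself. The names
resolve_local_hostname and lookup are hypothetical and exist only for
illustration; the default value and the fallback to socket.getfqdn()
follow the description in this comment.

```python
import socket

DEFAULT_LOCAL_HOSTNAME = "localhost.localdomain"


def resolve_local_hostname(local_hostname=DEFAULT_LOCAL_HOSTNAME,
                           lookup=socket.getfqdn):
    """Return the name to send after HELO/EHLO.

    Hypothetical helper: an explicit None or empty string asks for a
    lookup; any other value (including the default) is used as given.
    """
    if not local_hostname:
        # Caller opted in to a lookup by passing None or "".
        return lookup()
    return local_hostname
```

With this shape, a caller who does nothing gets
"localhost.localdomain", while a caller who passes None opts back in
to the lookup-based behaviour.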

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-25 11:10

Message:
Logged In: YES 
user_id=35752

There is no way that smtplib can automatically and reliably
find the FQDN.  socket.getfqdn() is a hack, IMHO. It doesn't
really matter though.  The chances of an email server
rejecting email based on the domain name following the HELO
verb is very small.  I recall seeing only one in actual use.

I still think the code is fine as it is.  socket.getfqdn()
always returns something.  Most mail servers don't care what
it returns.  Changing the default to 'localhost.localdomain'
doesn't really solve anything.  In your example, the script
would still not work for the user trying to send email
through a misconfigured server.  It would reject
'localhost.localdomain' just like it rejected whatever
socket.getfqdn() returned.

The only possible arguments for using
'localhost.localdomain' are that it's faster (doesn't
require a DNS lookup) and that it gives away less
information.  It doesn't give away much information though.
The remote server already has the sender's IP address.
The hostname shouldn't mean very much.  If someone is
that paranoid they can pass 'localhost.localdomain' to
SMTP.__init__.

Eventually we should make 'localhost.localdomain' the
default.  Like I said, getfqdn() is a hack.  We could
probably make the change now and no one would care.  I'm
just being very conservative.

----------------------------------------------------------------------

Comment By: Barry Warsaw (bwarsaw)
Date: 2002-03-25 11:00

Message:
Logged In: YES 
user_id=12800

Oh sorry.  A domain literal is something like [192.168.1.2]
IOW, the IP address octets surrounded by square brackets. 
Should be easy enough to calculate.  Attached is a proposed
patch.

As for the privacy violation, I don't think it's on the same
level as the ftp issue because we're not divulging any
information about the user.  It could be argued that leaking
the hostname might be enough to link the information to a
specific user, and I might buy that argument, although it
personally doesn't bother me too much (the IP address might
be just as sufficient for linking and even NAT'd or DHCP'd
addresses might be static enough to guess -- witness your
own supposedly dynamic IP address :).  And the IP will
always be available via the socket peer.

OTOH, Eduardo's claim isn't totally without merit.  I'd like
to be able to retain the ability to be properly RFC
compliant, but could accept that the default be
localhost.localdomain.  If you (Guido) have a suggestion for
an appropriate API for both these requirements, that would
be great.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-03-25 10:23

Message:
Logged In: YES 
user_id=6380

Sorry, but what's a domain literal?

I think that it's better not to get the client involved in
getting this right; for example, someone might write a
useful tool that sends email around, and then someone else
might try to use this tool from a machine that doesn't have
a fqdn. The author might not have thought of this (rather
uncommon) situation; the user might not have enough Python
whizz to know how to fix it.

I'd like to hear also what you think of Eduardo's opinion
that sending the fqdn is a privacy violation of the same
kind as ftplib defaulting to sending username@hostname as
the default password for anonymous login (which we did fix).
If *you* (Barry) think this is without merit, it must be
without merit. :-)

----------------------------------------------------------------------

Comment By: Barry Warsaw (bwarsaw)
Date: 2002-03-24 23:00

Message:
Logged In: YES 
user_id=12800

Sorry to take so long to respond on this one.

RFC 2821 is the latest standard that smtplib.py should
adhere to.  Quoting:

   [HELO and EHLO] are used to identify the SMTP client to the
   SMTP server.  The argument field contains the fully-qualified
   domain name of the SMTP client if one is available.  In
   situations in which the SMTP client system does not have a
   meaningful domain name (e.g., when its address is dynamically
   allocated and no reverse mapping record is available), the
   client SHOULD send an address literal (see section 4.1.3),
   optionally followed by information that will help to identify
   the client system.

Thus, I believe that sending the FQDN is the right default,
although socket.getfqdn() should be used for portability.

Neil's patch is the correct one (although there's a typo in
the docstring, which I'll fix).  By default the fqdn is
used, but the user has the option to supply the local
hostname as an argument to the SMTP constructor.  Since RFC
2821's admonition is that the client SHOULD use a domain
literal if the fqdn isn't available, I'm happy to leave it
up to the client to get any supplied argument right.

If we wanted to be more RFC-compliant, SMTP.__init__() could
possibly check socket.getfqdn() to see if the return value
was indeed fully-qualified, and if not, craft a domain
literal for the HELO/EHLO.  Since this is a SHOULD and not a
MUST, I'm happy with the current behavior, but if you want
to provide a patch for better RFC compliance here, I'd be
happy to review it.
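
The check described above could be sketched roughly as follows. This
is an illustration only, not the attached patch: the function names
are hypothetical, "contains a dot" is assumed to be a good enough test
for "fully qualified", and the loopback fallback is likewise an
assumption.

```python
import socket


def helo_argument(fqdn, ip_address):
    """Pick the HELO/EHLO argument from a candidate name and an IP.

    Assumption: a name containing a dot is treated as fully qualified.
    Otherwise a domain literal such as [192.168.1.2] is returned.
    """
    if "." in fqdn:
        return fqdn
    return "[%s]" % ip_address


def default_helo_argument():
    """Apply helo_argument() to what the socket module reports locally."""
    fqdn = socket.getfqdn()
    try:
        ip_address = socket.gethostbyname(socket.gethostname())
    except socket.gaierror:
        # Assumption: fall back to loopback if the hostname is unresolvable.
        ip_address = "127.0.0.1"
    return helo_argument(fqdn, ip_address)
```

Keeping the decision in a pure function (helo_argument) makes it
possible to exercise both branches without touching DNS.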

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-24 16:51

Message:
Logged In: YES 
user_id=35752

Did you read what I wrote?

220 cranky ESMTP Postfix (Debian/GNU)
HELO localhost.localdomain
250 cranky
MAIL FROM: <nas@arctrix.com>
250 Ok
RCPT TO: <nas@arctrix.com>
DATA
450 <localhost.localdomain>: Helo command rejected: Host not found
554 Error: no valid recipients

Bring it up again in another few years and we will change
the default.

----------------------------------------------------------------------

Comment By: Eduardo Pérez (eperez)
Date: 2002-03-24 13:39

Message:
Logged In: YES 
user_id=60347

RFC 1123 was written 11 years ago when there weren't
dial-ins, TCP tunnels, nor NATs.

This patch fixes scripts that run on computers that have the
SMTP access described above, and it doesn't break any script I
know about.

Could you tell me of cases where the current approach works and
the proposed patch fails?

I know of the cases explained above, where the current approach
doesn't work and this patch works successfully.


----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-24 10:37

Message:
Logged In: YES 
user_id=35752

I'm rejecting this patch.  RFC 1123 requires that the name
sent after the HELO verb is "a valid principal host domain
name for the client host".  While RFC 1123 goes on to prohibit
HELO-based rejections it is possible that some servers do
reject mail based on HELO.  Thus, changing the hostname
sent to "localhost.localdomain" could potentially break
scripts that currently work.

The concern raised is still valid however.  Finding the
FQDN using gethostbyname() is unreliable.  To address this
concern I've added a "local_hostname" argument to the
SMTP __init__ method.  If provided it is used as the local
hostname for the HELO and EHLO verbs.
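
For illustration, a caller could use the new argument as sketched
below. This assumes a current Python, where the value is also exposed
as the local_hostname attribute of the instance; no server is
contacted because no host is passed to the constructor.

```python
import smtplib

# No host is given, so the constructor does not open a connection;
# it only records the name that HELO/EHLO will send later.
client = smtplib.SMTP(local_hostname="localhost.localdomain")
print(client.local_hostname)
```

A later client.connect(...) followed by client.helo() or client.ehlo()
would then send this name instead of the result of a lookup.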


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-03-24 07:06

Message:
Logged In: YES 
user_id=6380

Since Barry has not expressed any interest in this patch,
reassigning to Neil, and setting status to Accepted.

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-23 20:42

Message:
Logged In: YES 
user_id=35752

This patch looks correct in theory to me.  Trying to find
the FQDN is wrong, IMHO.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-12-29 21:24

Message:
Logged In: YES 
user_id=6380

Seems reasonable to me, but I lack the SMTP knowledge to
understand all the issues.  Assigned to Barry Warsaw for
review.  (Barry: Eduardo found a similar privacy violation
in ftplib, which I fixed.  You might also ask Thomas Wouters
for a review of the underlying idea.)

----------------------------------------------------------------------
