[Tutor] Parsing email headers

DL Neil PyTutor at danceswithmice.info
Sun Apr 26 19:08:05 EDT 2020


On 27/04/20 9:13 AM, Jim wrote:
> What I want to do is figure out where an email came from without 
> actually opening it. We all get possible malicious emails. Some are 
> obvious but some look pretty real. Many times the From line just says 
> "Google" or "Chase", etc.  I wrote a little bare bones script that will 
> print out the From:, Return-Path: and the Sender: names from the header.
> 
> Right now using Thunderbird, I right-click on the email in question. 
> Then I click Save As and give it a name. It is then saved as a .eml 
> file. Then I give the file name to my script and see the header info.
> 
> I worry about discarding a legitimate email or getting some type of 
> infection by opening an email to check if it is legitimate. So am I 
> protecting myself with the above procedure or will the above procedure 
> still subject me to risks of opening a bad email?
> 
> Right now it is a fairly manual process. If it is worthwhile I would 
> like to spend the time making it a one click process if possible.

Have you seen the PSL's imaplib (IMAP4 protocol client)? With 
appropriate coding, this would save the manual effort by enabling direct 
access to the msgs on the server, with no need to save each one first.

Sadly, you may need to tangle with whatever security/encryption is used 
by the email server.
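
As a rough sketch of the idea (not tested against any particular server), 
the following asks the server for only the three headers mentioned, 
without downloading bodies or attachments. The host name, login and 
password in main() are placeholders, and the function names are my own 
invention. It assumes the server accepts IMAP over SSL/TLS.

```python
import email
import imaplib
from email import policy

WANTED = ("From", "Return-Path", "Sender")


def parse_header_block(raw_bytes):
    """Return the wanted header values from a raw header block (bytes)."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    result = {}
    for name in WANTED:
        value = msg.get(name)
        result[name] = str(value) if value is not None else None
    return result


def fetch_headers(conn, mailbox="INBOX"):
    """Yield (message number, headers dict) for each message in mailbox."""
    # readonly=True plus BODY.PEEK means nothing gets marked as read
    conn.select(mailbox, readonly=True)
    typ, data = conn.search(None, "ALL")
    for num in data[0].split():
        typ, parts = conn.fetch(
            num, "(BODY.PEEK[HEADER.FIELDS (FROM RETURN-PATH SENDER)])"
        )
        # parts[0] is a tuple: (response info, raw header bytes)
        yield num, parse_header_block(parts[0][1])


def main():
    # Placeholder host and credentials: substitute your own
    with imaplib.IMAP4_SSL("imap.example.com") as conn:
        conn.login("user@example.com", "app-password")
        for num, headers in fetch_headers(conn):
            print(num.decode(), headers)
```

Calling main() prints one line per msg in the INBOX. Some providers 
require an app-specific password or OAuth rather than the normal account 
password, in which case the login step will need more work.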

It would also give the opportunity to move suspect msgs from INBOX to a 
Junk folder or similar, rather than deleting them. Any false positives 
(legitimate msgs wrongly flagged) can then be rescued, per your concern!
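
A minimal sketch of such a move, assuming an already logged-in imaplib 
connection with the INBOX selected read-write. The folder name "Junk" is 
an assumption (servers differ, eg "Spam" or "[Gmail]/Spam"), and 
move_to_junk is a made-up helper name. It uses the classic 
copy/flag/expunge sequence, since not every server offers a MOVE command.

```python
import imaplib


def move_to_junk(conn, message_numbers, junk_folder="Junk"):
    """Copy each msg to junk_folder, then remove it from the current one.

    Returns the list of message numbers that were actually moved.
    """
    moved = []
    for num in message_numbers:
        typ, _ = conn.copy(num, junk_folder)
        if typ != "OK":
            continue  # copy failed: leave the msg where it is
        # Flag the original for deletion only after a successful copy
        conn.store(num, "+FLAGS", "\\Deleted")
        moved.append(num)
    if moved:
        conn.expunge()  # permanently remove the flagged originals
    return moved
```

Nothing is lost if the copy fails, because the original is only flagged 
once the copy has succeeded.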

OT: Yes, be aware (if you are not already) that many headers can be 
"spoofed", and thus an email can be made to appear to come from one 
place/person when it actually originates elsewhere, eg from a spammer. 
So, even if the header says 'whitehouse.gov', it may not be true!
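
One cheap check, sketched below, is to compare the domain in From: with 
the domain in Return-Path:. Treat the result as a hint only: plenty of 
legitimate bulk mailers use a different Return-Path domain, and a match 
proves nothing because both headers can be forged. The function names 
are hypothetical, and the sample msg uses made-up addresses.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(header_value):
    """Return the lower-cased domain of the address in a header, or None."""
    if not header_value:
        return None
    _, addr = parseaddr(header_value)
    if "@" not in addr:
        return None
    return addr.rsplit("@", 1)[1].lower()


def from_matches_return_path(raw_message):
    """True/False if both domains are present, None if either is missing."""
    msg = message_from_string(raw_message)
    from_domain = domain_of(msg.get("From"))
    return_domain = domain_of(msg.get("Return-Path"))
    if from_domain is None or return_domain is None:
        return None
    return from_domain == return_domain


# Made-up sample: the display name says "Chase" but the domains differ
sample = (
    "From: Chase <alerts@chase.example>\n"
    "Return-Path: <bounce@mailer.example>\n"
    "Subject: Account notice\n"
    "\n"
    "body text\n"
)
print(from_matches_return_path(sample))
```

The same function works on the text of a saved .eml file, because only 
the headers are examined and nothing in the body is opened or rendered.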


Notes:
- I suggested IMAP rather than POP because you might want to [re]move 
'the bad stuff' but not the 'good'. This further implies that either you 
are using IMAP with Thunderbird, or that the 'filter program' would have 
to be run before Thunderbird is started (and Tb's regular 'Get messages' 
function switched off).

IMAP stores msgs on the server

POP downloads msgs and stores them 'in Thunderbird' on the client

PSL = Python Standard Library


WebRef:
https://docs.python.org/3.8/library/imaplib.html
-- 
Regards =dn

