Parsing for email addresses
mattbarkan at gmail.com
Tue Feb 16 19:58:27 CET 2010
Hey all, thanks as always for the quick responses.
I actually found a very simple way to do what I needed to do. In
short, I needed to take an email which had a large number of addresses
in the 'to' field, and place just the identifiers (everything to the
left of @domain.com) in a Python list.
I simply highlighted all the addresses and placed them in a text file
called emails.txt. Then I had the following code which placed each
line in the file into the list 'names':
fileHandle = open('/Users/Matt/Documents/python/results.txt','r')
names = fileHandle.readlines()
Now, the 'names' list has values looking like this:
['aaa12@domain.com\n', 'bbb34@domain.com\n', etc]. So I ran the
following code:
for x in names:
And that did the trick! 'names' now has ['aaa12', 'bbb34', etc].
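The body of that loop is not shown here. A minimal sketch of one way to
get the result described, assuming every entry ends with the same
'@domain.com' suffix plus a newline (SUFFIX and the sample data are
illustrative, not the original code):

```python
# Sample data shaped like the output of readlines(); illustrative only.
names = ['aaa12@domain.com\n', 'bbb34@domain.com\n']

# Assumed shared suffix; this approach only works when every address uses it.
SUFFIX = '@domain.com'

# Rebinding x inside a plain "for x in names:" loop would not change the
# list itself, so build a new list of cleaned-up values instead.
names = [x.strip().replace(SUFFIX, '') for x in names]

print(names)
```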
Obviously this only worked because all of the domain names were the
same. If they had not been, then based on your comments and my own
research, I would have had to use regex and the split() method, which
looked massively complicated to learn.
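For what it's worth, mixed domains may not need regex at all. A hedged
sketch using only string methods, assuming one plain address per line
(local_part is a hypothetical helper name, and the sample addresses are
made up). It splits at the last '@', since the local part may itself
contain one:

```python
def local_part(address):
    """Return everything to the left of the last '@' in one address."""
    cleaned = address.strip()
    # rpartition splits at the LAST '@', so a quoted local part that
    # contains '@' stays intact. If there is no '@', sep is empty.
    head, sep, _domain = cleaned.rpartition('@')
    return head if sep else cleaned

# Made-up sample lines, shaped like the output of readlines().
lines = ['aaa12@domain.com\n', 'bbb34@example.org\n',
         '"odd@name"@example.net\n']

# Skip blank lines; keep only the part before the domain.
names = [local_part(line) for line in lines if line.strip()]
print(names)
```

This does not validate addresses or handle display names such as
'Name <addr>'; for those, the standard library's email.utils.parseaddr
is worth a look.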
On Feb 15, 8:01 pm, Ben Finney <ben+pyt... at benfinney.id.au> wrote:
> galileo228 <mattbar... at gmail.com> writes:
> > I'm trying to write python code that will open a textfile and find the
> > email addresses inside it. I then want the code to take just the
> > characters to the left of the "@" symbol, and place them in a list.
> Email addresses can have more than one ‘@’ character. In fact, the
> quoting rules allow the local-part to contain *any ASCII character* and
> remain valid.
> > Any suggestions would be much appreciated!
> For a brief but thorough treatment of parsing email addresses, see RFC
> 3696, “Application Techniques for Checking and Transformation of Names”
> <URL:http://www.ietf.org/rfc/rfc3696.txt>, specifically section 3.
> \ “What I have to do is see, at any rate, that I do not lend |
> `\ myself to the wrong which I condemn.” —Henry Thoreau, _Civil |
> _o__) Disobedience_ |
> Ben Finney