[Tutor] String Attribute
Martin A. Brown
martin at linux-ip.net
Fri Jul 31 18:18:26 CEST 2015
Greetings again Hal,
Thank you for posting your small amounts of code and results inline.
Thanks for also including clear questions. Your "surface" still
seems to add extra space, so, if you could trim that, you may get
even more responses from others who are on the Tutor mailing list.
Now, on to your question.
> fname = raw_input("Enter file name: ")
> if len(fname) < 1 : fname = "mbox-short.txt"
> fh = open(fname)
> count = 0
> for line in fh:
>     if not line.startswith('From'): continue
>     line2 = line.strip()
>     line3 = line2.split()
>     line4 = line3[1]
>     addresses = set()
>     addresses.add(line4)
>     count = count + 1
>     print addresses
> print "There were", count, "lines in the file with From as the first word"
> The code produces the following out put:
>
> In [15]: %run _8_5_v_13.py
> Enter file name: mbox-short.txt
> set(['stephen.marquard at uct.ac.za'])
[ ... snip ... ]
> set(['cwen at iupui.edu'])
>
> Question no. 1: is there a build in function for set that parses
> the data for duplicates.
The problem is not with the data structure called set().
Your program is not bad at all.
I would suggest making two small changes to it.
I think I have seen a pattern in the samples of code you have been
sending: you re-initialize a variable inside a loop, and then do not
understand why you are not collecting (or accumulating) all of the
results.
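To make that pattern concrete, here is a minimal sketch that is separate from your program. It is written for Python 3 (your code is Python 2), and the function names and the sample words are made up purely for illustration.

```python
# Hypothetical sample data; note that 'a' appears twice.
words = ['a', 'b', 'a', 'c']

def reset_inside_loop(items):
    """Re-create the set on every pass, so only the last item survives."""
    for item in items:
        seen = set()      # throws away whatever was collected so far
        seen.add(item)
    return seen

def init_before_loop(items):
    """Create the set once, then accumulate into it."""
    seen = set()          # initialized once, before the loop
    for item in items:
        seen.add(item)    # adding a duplicate leaves the set unchanged
    return seen

print(reset_inside_loop(words))
print(sorted(init_before_loop(words)))
```

The first function ends up holding only the last word it saw. The second holds each distinct word once, which is also why you do not need a separate function to remove duplicates from a set.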
Here's your program. I have moved two lines. The idea here is to initialize
the 'addresses' variable before the loop begins (exactly like you do with the
'count' variable). Then, after the loop completes (and you have processed
all of your input and accumulated all of the desired data), you can print
out the contents of the set variable called 'addresses'.
Try this out:
  fname = raw_input("Enter file name: ")
  if len(fname) < 1 : fname = "mbox-short.txt"
  fh = open(fname)
  count = 0
  addresses = set()
  for line in fh:
      if not line.startswith('From'): continue
      line2 = line.strip()
      line3 = line2.split()
      line4 = line3[1]
      addresses.add(line4)
      count = count + 1
  print "There were", count, "lines in the file with From as the first word"
  print addresses
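If you later move to Python 3, the same idea could look roughly like the sketch below. The function name 'collect_senders', the sample lines, and the guard against a bare 'From' line are my own additions for illustration; they are not part of your program.

```python
def collect_senders(lines):
    """Return (addresses, count) for lines that start with 'From'."""
    count = 0
    addresses = set()              # initialized once, before the loop
    for line in lines:
        if not line.startswith('From'):
            continue
        fields = line.split()      # split() already ignores surrounding whitespace
        if len(fields) < 2:        # added guard: skip a 'From' line with no address
            continue
        addresses.add(fields[1])
        count = count + 1
    return addresses, count

# Made-up sample lines standing in for the contents of the file.
sample = [
    'From alice@example.com Sat Jan  5 09:14:16 2008\n',
    'Subject: hello\n',
    'From bob@example.com Sat Jan  5 09:20:00 2008\n',
    'From alice@example.com Sun Jan  6 10:00:00 2008\n',
]

addresses, count = collect_senders(sample)
print("There were", count, "lines in the file with From as the first word")
print(sorted(addresses))
```

An open file object can be passed in place of the sample list, since looping over a file yields its lines one at a time.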
> Question no. 2: Why is there not a building function for append?
> Question no. 3: If all else fails, i.e., append & set, my only
> option is the slice the data set?
I do not understand these two questions.
Good luck.
-Martin
P.S. By the way, Alan Gauld has also responded to your message, with
a differently-phrased answer, but, fundamentally, he and I are
saying the same thing. Think about where you are initializing
your variables, and know that 'addresses = set()' in the middle
of the code is re-initializing the variable and throwing away
anything that was there before.
--
Martin A. Brown
http://linux-ip.net/