[Tutor] String Attribute

ltc.hotspot at gmail.com ltc.hotspot at gmail.com
Fri Jul 31 20:57:56 CEST 2015


Hi Martin,




Hal is not having a great day today, indeed:



Here is the code as entered:


fname = raw_input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
addresses = set()
for line in fh:
 line2 = line.strip()
 line3 = line2.split()
 line4 = line3[0]
 addresses.add(line4)
 count = count + 1
print "There were", count, "lines in the file with From as the first word"
print addresses





→Question: Why is the list index out of range on line 9?




IndexError                                Traceback (most recent call last)
C:\Users\vm\Desktop\apps\docs\Python\assinment_8_5_v_20.py in <module>()
      7         line2 = line.strip()
      8         line3 = line2.split()
----> 9         line4 = line3[1]
     10         addresses.add(line4)
     11         count = count + 1




IndexError: list index out of range

→I then tried different indexes, from [] to [5], and each one produced an error as well:


IndexError: list index out of range


In [34]: print line3[]
  File "<ipython-input-34-7bf39294000a>", line 1
    print line3[]
                ^
SyntaxError: invalid syntax



In [35]: print line[1]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-35-3ba0fe1b7bd4> in <module>()
----> 1 print line[1]


IndexError: string index out of range


In [36]: print line[2]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-36-6088e93feeeb> in <module>()
----> 1 print line[2]


IndexError: string index out of range


In [37]: print line[3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-37-127d944ba1b7> in <module>()
----> 1 print line[3]


IndexError: string index out of range


In [38]: print line[4]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-38-5c497e1246ea> in <module>()
----> 1 print line[4]


IndexError: string index out of range


In [39]: print line[5]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-39-3a91a0cf6bd2> in <module>()
----> 1 print line[5]


IndexError: string index out of range


→Question: Is the problem the placement of the addresses set, i.e. the line addresses = set()?
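
A likely cause, offered as a sketch rather than a confirmed diagnosis: the listing above has no `if not line.startswith('From'): continue` filter, so every line of the file reaches the indexing step, including blank ones. A blank line strips to an empty string, `split()` on that returns an empty list, and any index into it (`line3[0]` as in the listing, or `line3[1]` as in the traceback) raises IndexError. The "string index out of range" results for `line[1]` through `line[5]` fit this too, since `line` still holds the very short line on which the loop stopped. Where `addresses = set()` sits does not affect this error. The helper `second_word` and the sample lines below are made up for illustration, and the code is written to run on both Python 2 and Python 3.

```python
def second_word(line):
    """Return the second whitespace-separated word of line, or None."""
    words = line.strip().split()
    if len(words) < 2:
        # Blank lines (and one-word lines) have no second word.
        return None
    return words[1]


# Hypothetical sample lines, shaped like an mbox file; the address is made up.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "\n",
    "Subject: hello\n",
]

# Reproduce the failure on a blank line.
try:
    "\n".strip().split()[1]
except IndexError as err:
    print("IndexError: %s" % err)

# The guarded helper returns None for the blank line instead of raising.
for text in sample:
    print(second_word(text))
```

Skipping lines that do not start with 'From' before indexing, as in the earlier version of the program, avoids the blank lines in the same way.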


Regards,

Hal








Sent from Surface





From: Martin A. Brown
Sent: Friday, July 31, 2015 9:18 AM
To: ltc.hotspot at gmail.com
Cc: Tutor at python.org






Greetings again Hal,

Thank you for posting your small amounts of code and results inline. 
Thanks for also including clear questions.  Your "surface" still 
seems to add extra space, so, if you could trim that, you may get 
even more responses from others who are on the Tutor mailing list.

Now, on to your question.

> fname = raw_input("Enter file name: ")
> if len(fname) < 1 : fname = "mbox-short.txt"
> fh = open(fname)
> count = 0
> for line in fh:
>    if not line.startswith('From'): continue
>    line2 = line.strip()
>    line3 = line2.split()
>    line4 = line3[1]
>    addresses = set()
>    addresses.add(line4)
>    count = count + 1
>    print addresses
> print "There were", count, "lines in the file with From as the first word"

> The code produces the following output:
>
> In [15]: %run _8_5_v_13.py
> Enter file name: mbox-short.txt
> set(['stephen.marquard at uct.ac.za'])

   [ ... snip ... ]

> set(['cwen at iupui.edu'])
>
> Question no. 1: is there a built-in function for set that parses 
> the data for duplicates?

The problem is not with the data structure called set().
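
As a small illustration of that point (a sketch with made-up addresses, not taken from the mbox file): a set already discards duplicates by itself, so no extra function is needed for that.

```python
addresses = set()
for addr in ["alice@example.com", "bob@example.com", "alice@example.com"]:
    # Adding a value that is already present leaves the set unchanged.
    addresses.add(addr)

print(len(addresses))
print(sorted(addresses))
```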

Your program is not bad at all.

I would suggest making two small changes to it.

I think I have seen a pattern in the samples of code you have been 
sending--this pattern is that you reuse the same variable inside a 
loop, and do not understand why you are not collecting (or 
accumulating) all of the results.

Here's your program.  I have moved two lines.  The idea here is to initialize
the 'addresses' variable before the loop begins (exactly like you do with the
'count' variable).  Then, after the loop completes (and, you have processed
all of your input and accumulated all of the desired data), you can also print
out the contents of the set variable called 'addresses'.

Try this out:

   fname = raw_input("Enter file name: ")
   if len(fname) < 1 : fname = "mbox-short.txt"
   fh = open(fname)
   count = 0
   addresses = set()
   for line in fh:
      if not line.startswith('From'): continue
      line2 = line.strip()
      line3 = line2.split()
      line4 = line3[1]
      addresses.add(line4)
      count = count + 1
   print "There were", count, "lines in the file with From as the first word"
   print addresses
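
To see the difference in isolation, here is a minimal sketch that runs both variants over the same input. The sample lines, the addresses, and the two function names are made up for illustration, and the code is written to run on both Python 2 and Python 3.

```python
# Hypothetical sample input; the addresses are made up.
lines = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "Subject: not a From line",
    "From bob@example.com Sat Jan  5 09:20:00 2008",
    "From alice@example.com Sat Jan  5 10:00:00 2008",
]


def collect_reset_each_time(lines):
    """Mirrors the original program: the set is re-created on every pass."""
    addresses = set()  # only so the name exists if no line matches
    for line in lines:
        if not line.startswith('From'):
            continue
        addresses = set()  # throws away everything collected so far
        addresses.add(line.split()[1])
    return addresses


def collect_accumulating(lines):
    """Mirrors the revised program: the set is created once, before the loop."""
    addresses = set()
    for line in lines:
        if not line.startswith('From'):
            continue
        addresses.add(line.split()[1])
    return addresses


print(sorted(collect_reset_each_time(lines)))
print(sorted(collect_accumulating(lines)))
```

The first call keeps only the address from the last matching line; the second keeps every distinct address.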


> Question no. 2: Why is there not a built-in function for append?


>> Alan answered the question, thanks

> Question no. 3: If all else fails, i.e., append & set, my only 
> option is the slice the data set?

I do not understand these two questions.


>> Alan answered the question thanks

Good luck.

-Martin

P.S. By the way, Alan Gauld has also responded to your message, with
   a differently-phrased answer, but, fundamentally, he and I are
   saying the same thing.  Think about where you are initializing
   your variables, and know that 'addresses = set()' in the middle
   of the code is re-initializing the variable and throwing away
   anything that was there before.

-- 
Martin A. Brown
http://linux-ip.net/

