[Tutor] Iterating over a long list with regular expressions and changing each item?

spir denis.spir at free.fr
Mon May 4 16:52:00 CEST 2009


On Mon, 4 May 2009 10:15:35 -0400,
Dan Liang <danliang20 at gmail.com> wrote:

> Hi Spir and tutors,
> 
> Thank you Spir for your response. I went ahead and tried your code after
> adding a couple of dictionary entries, as below:
> -----------Code Begins---------------
> #!/usr/bin/python
> 
> tags = {
> 
> 
>  'case_def_gen':['case_def','gen','null'],
>  'nsuff_fem_pl':['nsuff','null', 'null'],
>  'abbrev': ['abbrev, null, null'],
>  'adj': ['adj, null, null'],
>  'adv': ['adv, null, null'],} # tag dict
> TAB = '\t'
> 
> def newlyTaggedWord(line):
>        (word,tag) = line.split(TAB)    # separate parts of line, keeping data only
>        new_tags = tags['tag']     # read in dict--Index by string
> 
>        tagging = TAB.join(new_tags)    # join with TABs
>        return word + TAB + tagging     # formatted result
> 
> def replaceTagging(source_name, target_name):
>        source_file = file(source_name, 'r')
>        source = source_file.read()       # not really necessary
>        target_file = open(target_name, "w")
>        # replacement loop
>        for line in source:
>                new_line = newlyTaggedWord(line) + '\n'
>                target_file.write(new_line)
>        source_file.close()
>        target_file.close()
> 
> if __name__ == "__main__":
>        source_name = sys.argv[1]
>        target_name = sys.argv[2]
>        replaceTagging(source_name, target_name)
> 
> -----------Code Ends---------------
> 
> The file I am working on looks like this:
> 
> 
>   word      \t     case_def_gen
>   word      \t     nsuff_fem_pl
>   word      \t     adj
>   word      \t     abbrev
>   word      \t     adv
> 
> I get the following error when I try to run it, and I cannot figure out
> where the problem lies:
> 
> -----------Error Begins---------------
> 
> Traceback (most recent call last):
>   File "tag.formatter.py", line 36, in ?
>     replaceTagging(source_name, target_name)
>   File "tag.formatter.py", line 28, in replaceTagging
>     new_line = newlyTaggedWord(line) + '\n'
>   File "tag.formatter.py", line 16, in newlyTaggedWord
>     (word,tag) = line.split(TAB)    # separate parts of line, keeping data only
> ValueError: unpack list of wrong size
> 
> -----------Error Ends---------------
> 
> Any ideas?
> 
> Thank you!
> 
> --dan

Good thing I mentioned "untested" ;-)
Can you decipher the error message? What can you deduce or guess from it?
Where, how, and why does the error happen? What kind of test could you run to narrow down the diagnosis?
I ask all this because you do not tell us what reasoning or experiments you tried in order to solve the issue yourself; instead you just write "Any ideas?".
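
As an illustration of the kind of test I mean, here is a minimal sketch. It is not a fix for your script. It assumes a current Python 3, it uses a made-up in-memory string in place of your file, and inspect_items is a hypothetical helper name of my own. It only shows what the loop hands to split(), once when iterating over the raw text and once when iterating over a list of lines:

```python
TAB = '\t'

def inspect_items(source):
    """Return (item, parts) pairs for every item a for-loop over source yields."""
    report = []
    for item in source:
        parts = item.split(TAB)
        report.append((item, parts))
    return report

# Assumed stand-in for what source_file.read() returns: one single string.
text = "word\tadj\nword\tadv\n"

as_read = inspect_items(text)                 # looping over a string
as_lines = inspect_items(text.splitlines())   # looping over a list of lines

# Show the first few items of each, with the number of parts split() produced.
for item, parts in as_read[:3]:
    print(repr(item), '->', parts, len(parts))
for item, parts in as_lines:
    print(repr(item), '->', parts, len(parts))
```

Compare the number of parts in each case with the two names on the left of the assignment in newlyTaggedWord. Printing repr(line) at the top of your own loop gives the same information for your real data.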

Denis
------
la vita e estrany

