[Tutor] design question -- nested loops considered harmful?
Liam Clarke
cyresse at gmail.com
Tue Nov 30 04:27:35 CET 2004
Hi Brian,
Just curious -
> date_flag = '[Date]'
> email_flag = '[Email]'
> item_flags = [date_flag, email_flag]
>
> def parse_file(list_of_lines):
>     data_dict = {}
>     for line in list_of_lines:
>         for item in item_flags:
>             if line.startswith(item):
>                 data_dict[item] = line[len(item):]
>                 break
>     return data_dict
So, your thingamajig scans each line for either date_flag or
email_flag, then slices off the rest of the line and saves it in a
dict. And, I guess you're looking for one date_flag line or one
email_flag line...
Assuming that there would only be one occurrence of each one in a file -
x=open("Brian's source file", 'r')
a=x.readlines() #How big is it? If it's a huge file, this may not be the best way
x.close()
a="".join(a) #Turns a list into a string, the "" is the joiner, i.e.
             # a=["Hello","World"] would become "HelloWorld"
             # whereas a="?M?".join(a) would become "Hello?M?World"
#Just had a thought - would a=str(a) do the same as a="".join(a)?
#No: str(a) gives the list's repr, "['Hello', 'World']", so stick with join.
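A quick way to check the join-versus-str question is to run both on the same list. This is only a small sketch using the ["Hello","World"] example from the comments above; the variable names joined, marked and as_str are made up for illustration.

```python
a = ["Hello", "World"]

joined = "".join(a)     # glue the items together with nothing in between
marked = "?M?".join(a)  # glue the items together with "?M?" in between
as_str = str(a)         # the printable form of the list itself, brackets and quotes included

print(joined)
print(marked)
print(as_str)
```

So str() on a list keeps the brackets, quotes and commas, which is not what you want when rebuilding the text of a file.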
item_flags=["[Date]","[Email]"]
def parsefile(fileString):
    data_dict={}
    for item in item_flags:
        #Finds lowest occurrence of item. Index is that of the first char of item.
        tagIndex=fileString.find(item)
        #Finds next newline after item
        newLineIndex=fileString.find('\n', tagIndex)
        #Assumes item is present and its line ends in '\n'; .find returns -1 otherwise
        data_dict[item]=fileString[tagIndex+len(item):newLineIndex]
    return data_dict
The slice at the end should slice from the char after the ']' of item,
to the char before the next '\n' which is the end of the line.
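Here is a runnable sketch of that find-and-slice idea, under a few assumptions: the sample string is invented, the name parse_string is hypothetical, and the two checks for -1 are additions to cover a missing flag or a last line with no trailing newline.

```python
item_flags = ["[Date]", "[Email]"]

def parse_string(file_string, flags=item_flags):
    """Return {flag: text between the first occurrence of flag and the end of its line}."""
    data_dict = {}
    for item in flags:
        tag_index = file_string.find(item)
        if tag_index == -1:
            continue  # flag absent: skip it rather than slice from a bogus index
        start = tag_index + len(item)  # first char after the closing ']'
        newline_index = file_string.find("\n", start)
        if newline_index == -1:
            newline_index = len(file_string)  # flag sits on the last line, no '\n' after it
        data_dict[item] = file_string[start:newline_index]
    return data_dict

# Invented sample data, standing in for the joined contents of a file.
sample = "[Date]2004-11-29\nsome text\n[Email]someone@example.com"
result = parse_string(sample)
print(result)
```

One difference from the startswith version: .find matches the flag anywhere in the string, not only at the start of a line, so the two approaches agree only if the flags never appear mid-line.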
If there are multiple occurrences, all you have to do is -
for item in item_flags:
    foundIndice=[]
    findIndex=0
    startIndex=0
    while findIndex != -1:
        findIndex=string2FindIn.find(item, startIndex)
        foundIndice.append(findIndex)
        startIndex=findIndex+1 #Resume the search just past this hit, else it loops forever
    #Delete last item, as .find returns -1 for string not found,
    #and this will always be appended at end.
    del foundIndice[-1]
    data_dict[item]=foundIndice
Of course, this is entirely a matter of personal style.
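For the multiple-occurrence case, a self-contained sketch might look like the following. The helper name find_all and the sample string are made up for illustration; the loop stops as soon as .find returns -1, so nothing has to be deleted afterwards.

```python
def find_all(text, item):
    """Return the start index of every occurrence of item in text."""
    found = []
    start = 0
    while True:
        index = text.find(item, start)
        if index == -1:
            break  # no further occurrences
        found.append(index)
        start = index + 1  # resume the search just past this hit
    return found

# Invented sample data with two [Date] lines and one [Email] line.
sample = "[Date]a\n[Email]b\n[Date]c\n"
positions = {flag: find_all(sample, flag) for flag in ["[Date]", "[Email]"]}
print(positions)
```

Each index can then be fed to the same slice as before (index + len(flag) up to the next newline) to pull out the text of every matching line.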
On Mon, 29 Nov 2004 21:26:37 -0500, Brian van den Broek
<bvande at po-box.mcgill.ca> wrote:
> Hi all,
>
> in a recent post in the "comapring lists" thread, Danny Yoo wrote:
>
> > whenever I see nested loops
> > like this, I get nervous. *grin*
>
> This got me thinking about general design issues. In various programs I
> have made much use of nested loops in order to parse data files. I've
> done this in cases where I am interested in pulling out some data which
> is identified by a delimiter. Below is a minimal example of the sort of
> thing I have been doing:
>
> date_flag = '[Date]'
> email_flag = '[Email]'
> item_flags = [date_flag, email_flag]
>
> def parse_file(list_of_lines):
>     data_dict = {}
>     for line in list_of_lines:
>         for item in item_flags:
>             if line.startswith(item):
>                 data_dict[item] = line[len(item):]
>                 break
>     return data_dict
>
> In this particular toy case, the "for item in item_flags" isn't too much
> help, as I've only listed two delimiters. But I often have a good many
> more, and thus thought the nested loop much better than a long list of
> if-tests. Since the logic in the "for item in item_flags:" loop is quite
> small, it never occurred to me to move it into its own function.
>
> I think I see that the nervous-making aspect of nested loops comes from
> concern about clarity of control flow. (Is there some other worry I'm
> missing?) But is my sort of case one which shows a rule of thumb isn't a
> rigid law? Is there a much better design for my task that I've missed?
> Do more experienced folk doubt my wisdom in taking the embedded loop to
> be too short to bother factoring out?
>
> Thanks for any input. Best to all,
>
> Brian vdB
--
'There is only one basic human right, and that is to do as you damn well please.
And with it comes the only basic human duty, to take the consequences.'