[Tutor] searching through a string list

Sean 'Shaleh' Perry shalehperry@attbi.com
Thu, 08 Aug 2002 18:42:53 -0700 (PDT)


On 09-Aug-2002 Mathew P. wrote:
> I have a huge list (imagine a list of pairs that is like, 12,000
> entries long). The pairs each consist of a number and a persons name,
> both in string form (a list of lists, each sublist containing the
> pair). I need to parse this list, which I can figure out how to do, but
> while parsing it, I need to be able to search for a persons name.
> 
> This list will have the same names in it more than once, and what I am
> actually doing is parsing the list to find out how many times a persons
> name appears in the list. To complicate things, I need to be able to do
> a partial match. For instance, I need to be able to find out how many
> "anthony" 's appear in the list - so if I have an anthony brown, and
> anthony johnson, and an anthony williams, the program will count three
> anthonys. 
> 
> I was sure that the string library would have search facilities that
> would do just what I wanted. I have not found exactly what I was
> looking for though. The closest thing I came to was the string.find()
> method.  Will string.find() (inside of a while or for loop) do partial
> matches for me like this? If so, can someone give me an example of how
> to use the find method, or point me to a URL? The python library docs
> have no example code that I was able to find, to illustrate how to use
> find.
> 

string.find() will do partial matches, but unfortunately it matches any
substring, including one that is only part of a longer word.  So if you are
looking for, say, all of the women named Jean, it would also match Jean-Luc.
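
Since you asked for an example of find, here is a minimal sketch.  It uses the
find() method on the string itself rather than the string module function, and
the sample data and the count_substring name are made up for illustration.

```python
# Hypothetical sample data: [number, name] pairs, both stored as strings.
entries = [
    ["1", "anthony brown"],
    ["2", "jean smith"],
    ["3", "jean-luc picard"],
    ["4", "marc anthony"],
]

def count_substring(entries, fragment):
    """Count entries whose name contains fragment anywhere in it."""
    count = 0
    for number, name in entries:
        # find() returns the index of the first match, or -1 if there is none.
        if name.find(fragment) != -1:
            count += 1
    return count

print(count_substring(entries, "jean"))     # 2 (jean smith and jean-luc picard)
print(count_substring(entries, "anthony"))  # 2 (anthony brown and marc anthony)
```

Note that both counts include a name you may not have wanted, which is exactly
the over-matching problem described above.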

This is a problem which will take some effort on your part (regardless of the
language used).  Python's string module, and maybe the re module, will help,
but much of the logic will be your own.
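
As one possible way the re module could help, here is a rough sketch that only
accepts a name when it is not glued to other letters, digits, or a hyphen.
The rule itself (treating a hyphen as part of the name) is my assumption about
what you want, and has_whole_name is a made-up helper name.

```python
import re

def has_whole_name(full_name, wanted):
    """Return True if wanted appears in full_name as a complete name part."""
    # (?<![\w-]) and (?![\w-]) reject matches that touch a letter, digit,
    # underscore, or hyphen, so "Jean" is not found inside "Jean-Luc".
    pattern = r"(?<![\w-])" + re.escape(wanted) + r"(?![\w-])"
    return re.search(pattern, full_name, re.IGNORECASE) is not None

print(has_whole_name("Jean Smith", "jean"))       # True
print(has_whole_name("Jean-Luc Picard", "jean"))  # False
print(has_whole_name("Jeanette Doe", "jean"))     # False
```

A plain \b word boundary would not be enough here, because \b also matches
between "Jean" and the hyphen in "Jean-Luc".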

Just start an instance of Python and play around in the interpreter -- this is
one of Python's great strengths.

A common idiom is to use a dictionary to store the instances of each name along
with a count.

In simple Python code:

# names is assumed to be a list of the name strings you want to count
known_names = {}
for name in names:
  if name in known_names:
    known_names[name] += 1
  else:
    known_names[name] = 1

I know this is only part of your request but it should point you in the right
direction.
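
To take the dictionary idea one step closer to counting "anthony"s, here is a
minimal sketch that keys the counts on the first word of each name.  The
sample entries are invented, and treating the first whitespace-separated word
as the first name (and ignoring case) is an assumption that may not fit your
real data.

```python
# Hypothetical sample data in the shape described: [number, name] pairs.
entries = [
    ["101", "Anthony Brown"],
    ["102", "anthony johnson"],
    ["103", "Anthony Williams"],
    ["104", "Jean Smith"],
    ["105", "Jean-Luc Picard"],
]

first_name_counts = {}
for number, name in entries:
    words = name.split()
    if not words:
        continue  # skip entries with a blank name
    first = words[0].lower()  # compare first names without regard to case
    # get() supplies 0 the first time a name is seen
    first_name_counts[first] = first_name_counts.get(first, 0) + 1

print(first_name_counts["anthony"])  # 3
print(first_name_counts["jean"])     # 1 (Jean-Luc is counted separately)
```

One pass over the list builds the counts for every first name at once, so
looking up another name afterwards does not require scanning the 12,000
entries again.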