Reading a file into a data structure....

MrPink tdsimpson at gmail.com
Sat Oct 15 22:18:18 EDT 2011


I did not understand what a tuple was, so it was very hard for me to
understand what a namedtuple was or to follow the code.
Then I looked up the word at dictionary.com and got this:
http://dictionary.reference.com/browse/tuple

tuple (computing): a row of values in a relational database

Now I understand a little better what a tuple is and can follow the
code better.

A namedtuple seems like a dictionary type.  I'll need to read up on
the difference between the two.
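
As a rough sketch of that difference (not a definitive comparison): a
namedtuple is still a tuple, so it is ordered and immutable, while a dict is
a mutable mapping looked up by key. The Ticket fields mirror the quoted code
below, and the sample values come from the 11/01/1997 record in the quoted
message. Names such as by_name, by_index, changed and seen are only
illustrative.

```python
from collections import namedtuple
from datetime import date

Ticket = namedtuple("Ticket", ["white_balls", "powerball", "date"])

# A namedtuple is still a tuple: ordered, fixed-length, immutable.
t = Ticket(frozenset([22, 25, 28, 33, 37]), 20, date(1997, 11, 1))
by_name = t.powerball        # access a field by attribute name
by_index = t[1]              # or by position, like any tuple
white, power, when = t       # unpacking works too

# A dict is a mutable mapping looked up by key, with no fixed set of fields.
d = {"wb": [22, 25, 28, 33, 37], "pb": 20, "d": "11/01/1997"}
d["pb"] = 21                 # allowed: dicts can be changed in place

try:
    t.powerball = 21         # not allowed: namedtuple fields are read-only
    changed = True
except AttributeError:
    changed = False

# Every field here is hashable, so the namedtuple can be stored in a set
# (or used as a dict key). A dict cannot be used that way.
seen = set([t])
```

That hashability is what lets the quoted code store Ticket objects in sets.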

Thanks again.

On Oct 15, 12:47 am, Chris Rebert <c... at rebertia.com> wrote:
> On Fri, Oct 14, 2011 at 7:59 PM, MrPink <tdsimp... at gmail.com> wrote:
> > This is what I have been able to accomplish:
>
> > def isInt(s):
> >    try:
> >        int(s)  # only checking that the conversion succeeds
> >        return True
> >    except ValueError:
> >        return False
>
> > f = open("powerball.txt", "r")
> > lines = f.readlines()
> > f.close()
>
> > dDrawings = {}
> > for line in lines:
> >    if isInt(line[0]):
> >        t = line.split()
> >        d = t[0]
> >        month,day,year = t[0].split("/")
> >        i = int(year + month + day)
> >        wb = t[1:6]
> >        wb.sort(key=int)  # numeric order; a plain sort() compares the strings
> >        pb = t[6]
> >        r = {'d':d,'wb':wb,'pb':pb}
> >        dDrawings[i] = r
>
> > The dictionary dDrawings contains records like this:
> > dDrawings[19971101]
> > {'pb': '20', 'd': '11/01/1997', 'wb': ['22', '25', '28', '33', '37']}
>
> > I am now able to search for tickets in a date range:
> > keys = dDrawings.keys()
> > b = [key for key in keys if 20110909 <= key <= 20111212]
>
> > How would I search for matching wb (White Balls) in the drawings?
>
> > Is there a better way to organize the data so that it will be flexible
> > enough for different types of searches?
> > Search by date range, search by pb, search by wb matches, etc.
>
> > I hope this all makes sense.
>
> from datetime import datetime
> from collections import namedtuple, defaultdict
> # for efficient searching by date: import bisect
>
> DATE_FORMAT = "%m/%d/%Y"
> Ticket = namedtuple('Ticket', "white_balls powerball date".split())
>
> powerball2ticket = defaultdict(set)
> whiteball2ticket = defaultdict(set)
> tickets_by_date = []
>
> with open("powerball.txt", "r") as f:
>     for line in f:
>         if not line[0].isdigit():
>             # what are these other lines anyway?
>             continue # skip such lines
>
>         fields = line.split()
>
>         date = datetime.strptime(fields[0], DATE_FORMAT).date()
>         white_balls = frozenset(int(num_str) for num_str in fields[1:6])
>         powerball = int(fields[6])
>         ticket = Ticket(white_balls, powerball, date)
>
>         powerball2ticket[powerball].add(ticket)
>         for ball in white_balls:
>             whiteball2ticket[ball].add(ticket)
>         tickets_by_date.append(ticket)
>
> tickets_by_date.sort(key=lambda ticket: ticket.date)
>
> print(powerball2ticket[7]) # all tickets with a 7 powerball
> print(whiteball2ticket[3]) # all tickets with a non-power 3 ball
>
> Cheers,
> Chris
> --http://rebertia.com
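
The quoted code mentions bisect for date searches but does not show it, and
the white-ball matching question is only answered per single ball. The
following is a minimal sketch of both, assuming a tickets_by_date list sorted
by date as above. The helpers tickets_between and white_ball_matches are
hypothetical names, and all sample drawings except the 11/01/1997 one are
made up for illustration.

```python
import bisect
from collections import namedtuple
from datetime import date

Ticket = namedtuple("Ticket", ["white_balls", "powerball", "date"])

# Sample data: only the 1997 drawing comes from the thread; the rest are made up.
tickets_by_date = sorted(
    [
        Ticket(frozenset([22, 25, 28, 33, 37]), 20, date(1997, 11, 1)),
        Ticket(frozenset([3, 11, 25, 37, 40]), 7, date(2011, 9, 10)),
        Ticket(frozenset([1, 2, 3, 4, 5]), 9, date(2011, 10, 1)),
        Ticket(frozenset([8, 15, 22, 30, 41]), 7, date(2012, 1, 4)),
    ],
    key=lambda ticket: ticket.date,
)

# Parallel sorted list of dates for bisect to search.
dates = [ticket.date for ticket in tickets_by_date]


def tickets_between(start, end):
    """Return the tickets with start <= ticket.date <= end."""
    lo = bisect.bisect_left(dates, start)
    hi = bisect.bisect_right(dates, end)
    return tickets_by_date[lo:hi]


def white_ball_matches(picks, minimum=1):
    """Return (count, ticket) pairs sharing at least `minimum` white balls."""
    picks = frozenset(picks)
    results = []
    for ticket in tickets_by_date:
        count = len(picks & ticket.white_balls)  # set intersection
        if count >= minimum:
            results.append((count, ticket))
    return results


in_range = tickets_between(date(2011, 9, 9), date(2011, 12, 12))
matches = white_ball_matches([3, 25, 37, 44, 50], minimum=2)
```

The date lookup uses binary search on the sorted list, while the white-ball
search scans every ticket. For a large file, intersecting the sets stored in
whiteball2ticket might be a faster alternative.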



