Reading a file into a data structure....
MrPink
tdsimpson at gmail.com
Sat Oct 15 22:18:18 EDT 2011
I did not understand what a tuple was.
So it was very hard for me to understand what a namedtuple was and
follow the code.
Then I looked up the word at dictionary.com and got this:
http://dictionary.reference.com/browse/tuple
tuple (computing): a row of values in a relational database
Now I understand a little better what a tuple is and can follow the
code better.
A namedtuple seems like a dictionary type. I'll need to read up on
the difference between the two.
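A minimal sketch of the difference, for reference. The Ticket field names follow the definition in the quoted reply and the sample values come from the 11/01/1997 record in the quoted message; the other variable names are only illustrative.

```python
from collections import namedtuple

# Field names follow the Ticket definition in the quoted reply.
Ticket = namedtuple("Ticket", ["white_balls", "powerball", "date"])

# Sample values taken from the 11/01/1997 record in the quoted message.
ticket = Ticket(frozenset([22, 25, 28, 33, 37]), 20, "11/01/1997")
record = {"wb": ["22", "25", "28", "33", "37"], "pb": "20", "d": "11/01/1997"}

# A namedtuple is read by attribute or by position; a dict is read by key.
by_name = ticket.powerball
by_index = ticket[1]
by_key = record["pb"]

# A dict can be changed in place; a namedtuple cannot.
record["pb"] = "21"
try:
    ticket.powerball = 21
    mutable = True
except AttributeError:
    mutable = False

# Because it is immutable and its fields here are hashable, a Ticket can be
# stored in a set or used as a dict key, which a plain dict cannot.
seen = {ticket}

# _replace returns a modified copy; _asdict converts to a dictionary.
changed = ticket._replace(powerball=21)
as_dict = ticket._asdict()
```

In short, a namedtuple is a tuple whose positions also have names, with a fixed set of fields that cannot be reassigned, while a dict is a mutable mapping whose keys can be added or removed at any time.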
Thanks again.
On Oct 15, 12:47 am, Chris Rebert <c... at rebertia.com> wrote:
> On Fri, Oct 14, 2011 at 7:59 PM, MrPink <tdsimp... at gmail.com> wrote:
> > This is what I have been able to accomplish:
>
> > def isInt(s):
> >     try:
> >         i = int(s)
> >         return True
> >     except ValueError:
> >         return False
>
> > f = open("powerball.txt", "r")
> > lines = f.readlines()
> > f.close()
>
> > dDrawings = {}
> > for line in lines:
> >     if isInt(line[0]):
> >         t = line.split()
> >         d = t[0]
> >         month,day,year = t[0].split("/")
> >         i = int(year + month + day)
> >         wb = t[1:6]
> >         wb.sort()
> >         pb = t[6]
> >         r = {'d':d,'wb':wb,'pb':pb}
> >         dDrawings[i] = r
>
> > The dictionary dDrawings contains records like this:
> > dDrawings[19971101]
> > {'pb': '20', 'd': '11/01/1997', 'wb': ['22', '25', '28', '33', '37']}
>
> > I am now able to search for tickets in a date range.
> > keys = dDrawings.keys()
> > b = [key for key in keys if 20110909 <= key <= 20111212]
>
> > How would I search for matching wb (White Balls) in the drawings?
>
> > Is there a better way to organize the data so that it will be flexible
> > enough for different types of searches?
> > Search by date range, search by pb, search by wb matches, etc.
>
> > I hope this all makes sense.
>
> from datetime import datetime
> from collections import namedtuple, defaultdict
> # for efficient searching by date: import bisect
>
> DATE_FORMAT = "%m/%d/%Y"
> Ticket = namedtuple('Ticket', "white_balls powerball date".split())
>
> powerball2ticket = defaultdict(set)
> whiteball2ticket = defaultdict(set)
> tickets_by_date = []
>
> with open("powerball.txt", "r") as f:
>     for line in f:
>         if not line[0].isdigit():
>             # what are these other lines anyway?
>             continue # skip such lines
>
>         fields = line.split()
>
>         date = datetime.strptime(fields[0], DATE_FORMAT).date()
>         white_balls = frozenset(int(num_str) for num_str in fields[1:6])
>         powerball = int(fields[6])
>         ticket = Ticket(white_balls, powerball, date)
>
>         powerball2ticket[powerball].add(ticket)
>         for ball in white_balls:
>             whiteball2ticket[ball].add(ticket)
>         tickets_by_date.append(ticket)
>
> tickets_by_date.sort(key=lambda ticket: ticket.date)
>
> print(powerball2ticket[7]) # all tickets with a 7 powerball
> print(whiteball2ticket[3]) # all tickets with a non-power 3 ball
>
> Cheers,
> Chris
> --
> http://rebertia.com
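
A rough sketch of how the structures in the quoted reply might be queried for the two searches asked about: a date range (using bisect, which the reply mentions only in a comment) and white-ball matches (using set intersection on the frozensets). The helper names tickets_in_range and white_ball_matches are hypothetical, and apart from the 11/01/1997 drawing the sample data is made up for illustration.

```python
import bisect
from collections import namedtuple
from datetime import date

Ticket = namedtuple("Ticket", ["white_balls", "powerball", "date"])

# Hypothetical sample drawings; only the first one appears in the thread.
tickets_by_date = [
    Ticket(frozenset([22, 25, 28, 33, 37]), 20, date(1997, 11, 1)),
    Ticket(frozenset([3, 22, 25, 40, 41]), 7, date(2011, 9, 10)),
    Ticket(frozenset([5, 22, 28, 33, 50]), 7, date(2011, 10, 12)),
]
tickets_by_date.sort(key=lambda t: t.date)


def tickets_in_range(tickets, start, end):
    """Return tickets with start <= date <= end; tickets must be sorted by date."""
    # For repeated queries this list of dates would be built once and reused.
    dates = [t.date for t in tickets]
    lo = bisect.bisect_left(dates, start)
    hi = bisect.bisect_right(dates, end)
    return tickets[lo:hi]


def white_ball_matches(tickets, my_numbers, minimum=1):
    """Return (ticket, matched_numbers) pairs sharing at least `minimum` white balls."""
    mine = frozenset(my_numbers)
    results = []
    for t in tickets:
        matched = t.white_balls & mine  # set intersection
        if len(matched) >= minimum:
            results.append((t, matched))
    return results


in_range = tickets_in_range(tickets_by_date, date(2011, 9, 9), date(2011, 12, 12))
matches = white_ball_matches(tickets_by_date, [22, 28, 33], minimum=3)
```

Storing the white balls as frozensets of ints is what makes the match count a one-line intersection; with the original lists of strings, the same search would need a nested loop or a conversion first.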