Reading a file into a data structure....
Troy S
tdsimpson at gmail.com
Sat Oct 15 21:48:57 EDT 2011
Chris,
Thanks for the help.
I am using the powerball numbers from this text file downloaded from the site.
http://www.powerball.com/powerball/winnums-text.txt
The first row is the header/fieldnames and the file starts off like this:
Draw Date   WB1 WB2 WB3 WB4 WB5 PB  PP
10/12/2011  43  10  12  23  47  18  3
10/08/2011  35  03  37  27  45  31  5
10/05/2011  46  07  43  54  20  17  4
10/01/2011  27  43  12  23  01  31  3
09/28/2011  41  51  30  50  53  08  2
09/24/2011  27  12  03  04  44  26  5
09/21/2011  47  52  55  48  12  13  4
The digit test was used only to skip that first row.
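Since the header is always the first line, it could probably also be skipped explicitly instead of testing for a digit. A rough sketch, assuming the file layout shown above: the sample text is inlined through io.StringIO as a stand-in for the real file, and read_rows is a made-up helper name, not something from Chris's code.

```python
import io

# Stand-in for the downloaded file; real data would come from open(...).
SAMPLE = """\
Draw Date   WB1 WB2 WB3 WB4 WB5 PB  PP
10/12/2011  43  10  12  23  47  18  3
10/08/2011  35  03  37  27  45  31  5
"""

def read_rows(f):
    """Return the data rows of f as lists of strings, skipping the header."""
    next(f, None)  # consume the header line (no error if the file is empty)
    return [line.split() for line in f if line.strip()]

rows = read_rows(io.StringIO(SAMPLE))
print(rows[0])  # ['10/12/2011', '43', '10', '12', '23', '47', '18', '3']
```

The digit test would still be the safer choice if the file can contain other non-data lines further down.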
I'm still dissecting your Python code to better understand the use of
collections, namedtuple, etc.
I have not found many examples or descriptions of collections,
namedtuple, etc. yet, and I don't understand them very well. Do you
know of a reference that breaks this stuff down better for me?
The couple of books that I have on Python do not cover collections,
namedtuple, etc. in much depth.
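For reference, here is a minimal sketch of the two pieces in question as I currently understand them. The field names mirror the Ticket definition quoted below; the date is kept as a plain string here only to keep the example short, which is an assumption of this sketch and not what the quoted code does.

```python
from collections import namedtuple, defaultdict

# namedtuple builds a small tuple subclass whose fields have names.
Ticket = namedtuple("Ticket", ["white_balls", "powerball", "date"])
t = Ticket(frozenset([10, 12, 23, 43, 47]), 18, "10/12/2011")

print(t.powerball)   # access a field by name
print(t[1])          # the same field by position
white, pb, date = t  # it still unpacks like a plain tuple

# defaultdict creates the value for a missing key on first access,
# here an empty set, so no "if key not in d" check is needed.
by_powerball = defaultdict(set)
by_powerball[t.powerball].add(t)
print(len(by_powerball[18]))  # 1
print(len(by_powerball[99]))  # 0 (this lookup also inserts key 99)
```

Both are described in the collections section of the standard library documentation.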
Thanks,
On Sat, Oct 15, 2011 at 12:47 AM, Chris Rebert <clp2 at rebertia.com> wrote:
> On Fri, Oct 14, 2011 at 7:59 PM, MrPink <tdsimpson at gmail.com> wrote:
>> This is what I have been able to accomplish:
>>
>> def isInt(s):
>>     try:
>>         i = int(s)
>>         return True
>>     except ValueError:
>>         return False
>>
>> f = open("powerball.txt", "r")
>> lines = f.readlines()
>> f.close()
>>
>> dDrawings = {}
>> for line in lines:
>>     if isInt(line[0]):
>>         t = line.split()
>>         d = t[0]
>>         month,day,year = t[0].split("/")
>>         i = int(year + month + day)
>>         wb = t[1:6]
>>         wb.sort()
>>         pb = t[6]
>>         r = {'d':d,'wb':wb,'pb':pb}
>>         dDrawings[i] = r
>>
>> The dictionary dDrawings contains records like this:
>> dDrawings[19971101]
>> {'pb': '20', 'd': '11/01/1997', 'wb': ['22', '25', '28', '33', '37']}
>>
>> I am now able to search for tickets in a date range:
>> keys = dDrawings.keys()
>> b = [key for key in keys if 20110909 <= key <= 20111212]
>>
>> How would I search for matching wb (White Balls) in the drawings?
>>
>> Is there a better way to organize the data so that it will be flexible
>> enough for different types of searches?
>> Search by date range, search by pb, search by wb matches, etc.
>>
>> I hope this all makes sense.
>
> from datetime import datetime
> from collections import namedtuple, defaultdict
> # for efficient searching by date: import bisect
>
> DATE_FORMAT = "%m/%d/%Y"
> Ticket = namedtuple('Ticket', "white_balls powerball date".split())
>
> powerball2ticket = defaultdict(set)
> whiteball2ticket = defaultdict(set)
> tickets_by_date = []
>
> with open("powerball.txt", "r") as f:
>     for line in f:
>         if not line[0].isdigit():
>             # what are these other lines anyway?
>             continue  # skip such lines
>
>         fields = line.split()
>
>         date = datetime.strptime(fields[0], DATE_FORMAT).date()
>         white_balls = frozenset(int(num_str) for num_str in fields[1:6])
>         powerball = int(fields[6])
>         ticket = Ticket(white_balls, powerball, date)
>
>         powerball2ticket[powerball].add(ticket)
>         for ball in white_balls:
>             whiteball2ticket[ball].add(ticket)
>         tickets_by_date.append(ticket)
>
> tickets_by_date.sort(key=lambda ticket: ticket.date)
>
> print(powerball2ticket[7]) # all tickets with a 7 powerball
> print(whiteball2ticket[3]) # all tickets with a non-power 3 ball
>
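Building on the structures above, counting white-ball matches and searching a date range might look roughly like this. It is only a sketch: count_matches, in_date_range, and my_picks are hypothetical names that are not part of the quoted code, and the three tickets are typed in by hand from the sample rows. Because white_balls is a frozenset, matching is a set intersection, and the bisect module mentioned in the import comment handles the date range on the sorted list.

```python
from bisect import bisect_left, bisect_right
from collections import namedtuple
from datetime import date

Ticket = namedtuple("Ticket", "white_balls powerball date")

# Three drawings copied from the sample rows, already sorted by date.
tickets_by_date = [
    Ticket(frozenset([27, 43, 12, 23, 1]), 31, date(2011, 10, 1)),
    Ticket(frozenset([35, 3, 37, 27, 45]), 31, date(2011, 10, 8)),
    Ticket(frozenset([43, 10, 12, 23, 47]), 18, date(2011, 10, 12)),
]

def count_matches(ticket, picks):
    """Number of white balls that ticket shares with picks."""
    return len(ticket.white_balls & frozenset(picks))

def in_date_range(tickets, start, end):
    """Tickets with start <= date <= end; tickets must be sorted by date."""
    dates = [t.date for t in tickets]
    return tickets[bisect_left(dates, start):bisect_right(dates, end)]

my_picks = [12, 23, 43, 5, 9]
for t in tickets_by_date:
    print(t.date, count_matches(t, my_picks))

recent = in_date_range(tickets_by_date, date(2011, 10, 5), date(2011, 10, 12))
print(len(recent))  # 2
```

Rebuilding the dates list on every call keeps the sketch short; for many lookups it would make more sense to build it once next to tickets_by_date.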
>
> Cheers,
> Chris
> --
> http://rebertia.com
>
--
Troy S