[Tutor] Reading from files problem
Scott SA
pydev at rscorp.ab.ca
Mon Apr 20 11:03:09 CEST 2009
On Apr 20, 2009, at 12:59 AM, Alan Gauld wrote:
> You might want to store the data in a dictionary keyed by ID number?
I had thought of suggesting this, but it appeared that the OP was
going to re-iterate the file each time he wished to query the CSV.
That may have been a bad assumption on my part, as I envisioned pickling a
dict, and that just got too complicated.
> test = [float(n) for n in lines[11:14]]
> hwgrades = sum(hw)
The composite of this would be:
sum([float(n) for n in lines[11:14]])
... which, I agree, is easier on the eyes/brain than the
reduce(lambda:...) example I gave.
sum is also on <http://docs.python.org/library/functions.html> along
with range and other built-ins.
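A quick sketch of the two spellings side by side, for comparison. The
"lines" list here is made-up sample data (not the OP's file), and the reduce
version is only one possible way to write it:

```python
from functools import reduce  # a built-in on Python 2; imported for Python 3

# Hypothetical sample data: 14 string fields, as if read from one CSV row.
lines = [str(n) for n in range(14)]

# The slice 11:14 picks the items at indexes 11, 12 and 13.
test_total = sum([float(n) for n in lines[11:14]])

# A reduce/lambda spelling that adds up the same three values.
reduce_total = reduce(lambda a, b: a + b, [float(n) for n in lines[11:14]])
```

Both names end up holding the same total; sum() just says so more directly.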
Chris: wrapping the for-loop in square brackets is what makes it a list
comprehension, described here (part 5.1.4)
<http://docs.python.org/tutorial/datastructures.html>
> That's all fine for reading one student, but you overwrite the data
> each time through the loop! This also looks like an obvious use for
> a class so I'd create a Student class to hold all the data
> (You could create methods to do the totals/averages too, plus add a
> __str__ method to print the student data in the format required -
> I'll leave that as an exercise for the reader!)
This part is actually the reason I've replied; everything before this
was 'just along the way'.
Classes are a big subject when starting out. Here are the main docs:
<http://docs.python.org/tutorial/classes.html>
Also, check out 'Dive Into Python' and others for help in getting a
handle on that.
I figured that the Student class proposed probably needed an example
to get over the initial hurdle.
class Student(object):
    def __init__(self):
        pass
In its most basic form, this is pretty much the 'hello world' for
classes.
> So I'd change the structure to be like this(pseudo code)
>
> students = dict() # empty dict
> for line in gradesfile:
> line = line.split(',')
> s = Student()
This step creates an instance of the class. Just for the moment, think
of it as a fancy variable -- how python will store and reference the
live data. In the end, you would need a class-instance for each and
every student (line of the file).
> s.id = line[0]
And this adds an 'id' attribute to the instance.
Pre-defined in the class, this would look like:
class Student(object):
    def __init__(self):
        self.id = None
When the instance is created, the id has None as its value (or
anything you wanted). The "self" reference means the instance of the
class itself, more on that in a moment.
Still accessed the same as above:
s.id = n
> s.lastname = line[1]
> etc....
> s.hwtotal = sum(hw)
> etc....
> students[s.id] = s
As mentioned, id, lastname, hwtotal, etc. become attributes of the
instance. Nothing terribly magical: they are actually stored in a
dictionary (i.e. s.__dict__) and long-hand access would be:
s.__dict__['id']
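A small sketch of that, for anyone who wants to poke at it in the
interpreter. The ID and name values are made up for illustration:

```python
class Student(object):
    def __init__(self):
        self.id = None


s = Student()
s.id = '1001'          # hypothetical ID value
s.lastname = 'Smith'   # attributes can also be added after creation

# vars(s) hands back the very same dictionary as s.__dict__
attrs = vars(s)
```

Here attrs holds both 'id' and 'lastname', and s.__dict__['id'] gives the
same value as s.id.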
So, the next step to this would be to actually use the class to do the
heavy lifting. This is what Alan is talking about a bit further down.
class Student(object):
    def __init__(self, csv_data):
        # csv_data is one raw line of the file, still a single string
        csv_list = csv_data.split(',')
        self.id = csv_list[0]
        ...
        self.hwgrades = self._list_floats(csv_list[4:10])

    def _list_floats(self, str_list):
        # convert a list of numeric strings to a list of floats
        return [float(n) for n in str_list]

    def hw_grade_total(self):
        return sum(self.hwgrades)
The two methods are part of the business logic of the class. Notice
that internally they are accessed through 'self'. This is very important,
so python knows which instance's data to work with.
Assuming each line is still a raw string (that is, you're not using the
CSV library and haven't already split the row/line into a list):
student_dict = {}
for student_data in grades_file:
    s = Student(student_data)
    student_dict[s.id] = s
So, when python creates the class instance, it calls the __init__
method. Since you've passed it a line of student data, it processes it
at the same time. In this example, creating a Student without passing any
data raises a TypeError, by the way. You might also want to verify that
each line has the correct number of fields; otherwise a malformed line
could raise an error or quietly produce wrong totals.
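One way to do that check is sketched below. The helper name and the
expected count of 14 fields are assumptions for illustration; adjust the
number to match the real file:

```python
EXPECTED_FIELDS = 14  # assumption: set this to the real number of columns


def split_student_line(line, expected=EXPECTED_FIELDS):
    """Split one CSV line and check the field count before using it."""
    fields = [f.strip() for f in line.split(',')]
    if len(fields) != expected:
        raise ValueError('expected %d fields, got %d: %r'
                         % (expected, len(fields), line))
    return fields
```

Calling this before building the instance means a short or over-long line
fails with a clear message instead of a confusing one later on.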
Accessing the grade total method is like this:
grade_total = s.hw_grade_total()
Or, if you want the raw list of floats:
grade_list = s.hwgrades
I still contend that all of this is ideal for a database, like SQLite,
which would allow for searching by name as well as ID, etc. It is the
persistence of data that motivates this perspective. So what I would
do is make a class Students, with a file-import method using the CSV
lib that does the work of putting all the data into a database,
bypassing a Student class (until there was a valid reason for one).
Once the data is loaded, it can be referenced without re-interpreting
the CSV file, again through methods in a Students class.
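A rough sketch of that idea, using the csv and sqlite3 modules from the
standard library. The table layout, the method names, and the column
positions (ID in column 0, last name in column 1, homework grades in
columns 4 through 9) are all assumptions for illustration, and the sample
rows are made up:

```python
import csv
import io
import sqlite3


class Students(object):
    """Load student rows from CSV into SQLite and query them."""

    def __init__(self, db_path=':memory:'):
        # ':memory:' keeps the sketch self-contained; use a file path
        # instead to make the data persist between runs.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            'CREATE TABLE IF NOT EXISTS students '
            '(id TEXT PRIMARY KEY, lastname TEXT, hw_total REAL)')

    def import_csv(self, csv_file):
        # Assumed layout: ID in column 0, last name in column 1,
        # homework grades in columns 4 through 9.
        for row in csv.reader(csv_file):
            if not row:
                continue  # skip blank lines
            hw_total = sum(float(n) for n in row[4:10])
            self.conn.execute(
                'INSERT OR REPLACE INTO students VALUES (?, ?, ?)',
                (row[0], row[1], hw_total))
        self.conn.commit()

    def by_id(self, student_id):
        cur = self.conn.execute(
            'SELECT id, lastname, hw_total FROM students WHERE id = ?',
            (student_id,))
        return cur.fetchone()  # None if there is no such ID

    def by_lastname(self, lastname):
        cur = self.conn.execute(
            'SELECT id, lastname, hw_total FROM students '
            'WHERE lastname = ?', (lastname,))
        return cur.fetchall()


# Made-up rows standing in for the real grades file.
sample = io.StringIO(
    u'1001,Smith,Ann,x,10,9,8,7,6,5\n'
    u'1002,Jones,Bob,x,5,5,5,5,5,5\n')
roster = Students()
roster.import_csv(sample)
```

After the import, roster.by_id('1001') or roster.by_lastname('Jones')
answers a query without touching the CSV file again.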
I hope this helps,
Scott
PS. My email is acting up, did my prev. message actually make it to
the list?