[Tutor] deriving class from file to handle input line numbers?

Kent Johnson kent37 at tds.net
Tue Aug 16 12:56:47 CEST 2005

Duncan Gibson wrote:
> I was sure that this must be a frequently asked [homework?] question,
> but trying to search for 'file open line number' didn't throw up the
> sort of answers I was looking for, so here goes...
> I regularly write parsers for simple input files, and I need to give
> helpful error messages if the input is invalid. The three most helpful
> pieces of information are file name, line number and line content. Of
> course I can use a global variable to track how many times f.readline()
> is called, but I was wondering whether there is a more OO or Python
> way of encapsulating this within a class derived from file.

If you just want to keep track of line numbers as you read the file by lines, you could use enumerate():

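f = open('myfile.txt')
for line_number, line in enumerate(f):
    # line_number counts from 0, so add 1 for a human-readable number
    print('%d: %s' % (line_number + 1, line.rstrip()))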

For any iterable object, enumerate() yields an (index, value) pair for each value in the object, with the index counting from 0.
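
As a rough sketch of how this might look in a parser, the loop below uses enumerate() to build an error message containing the file name, line number and line content. The parse_lines() function, the 'key=value' format and the in-memory sample data are all made up for illustration. It also assumes Python 2.6 or later, where enumerate() accepts a start argument so that numbering can begin at 1.

```python
import io

def parse_lines(f, name):
    """Parse hypothetical 'key=value' lines, reporting the position of bad input."""
    result = {}
    # start=1 makes the first line number 1 rather than 0
    for line_number, line in enumerate(f, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if '=' not in line:
            raise ValueError('%s:%d: expected key=value, got %r'
                             % (name, line_number, line))
        key, value = line.split('=', 1)
        result[key.strip()] = value.strip()
    return result

# An in-memory file stands in for open('myfile.txt')
sample = io.StringIO(u'a = 1\n\nb = 2\noops\n')
try:
    parse_lines(sample, 'myfile.txt')
except ValueError as e:
    message = str(e)  # names the file, the line number and the offending text
```

The counter lives in the loop, so nothing global is needed, but it is only available inside the function that does the iterating.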

Your class has the advantage of maintaining the line number internally and customizing readline() to your needs.

More comments inline.


> What I have below is the minimal interface that I could come up with,
> but it is a file adaptor rather than a derived class, and it doesn't
> seem quite clean to me, because I have to open the file externally
> and then pass the file object into the constructor.

Why? You should be able to open the file in the constructor and provide close() and __del__() methods to close it. What problem did you have when deriving from file?
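
One way to read this suggestion is sketched below: the adaptor opens the file itself and exposes close(). This is an assumption about how such a class might look, not the original poster's code. It wraps the file rather than deriving from it, since the built-in file type exists only in Python 2, and it swaps __del__() for context-manager methods so that the file is closed at a predictable point. The readline() and name spellings follow file's own interface, and the temporary file contents are invented for the demonstration.

```python
import os
import tempfile

class LineCountedInputFile(object):
    """Open a text file for reading and count the lines read so far."""

    def __init__(self, path):
        self._inputFile = open(path, 'r')  # opened here, not by the caller
        self.name = path
        self.lineNumber = 0

    def readline(self):
        """Return the next line stripped of whitespace, or None at end-of-file."""
        line = self._inputFile.readline()
        if len(line) == 0:
            return None
        self.lineNumber += 1
        return line.strip()

    def close(self):
        self._inputFile.close()

    # Context-manager support, so 'with' closes the file on exit
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not suppress exceptions

# Demonstration with a temporary file (the contents are made up)
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as out:
    out.write('first\nsecond\n')

with LineCountedInputFile(path) as f:
    lines = []
    line = f.readline()
    while line is not None:
        lines.append((f.lineNumber, line))
        line = f.readline()
os.remove(path)
```

Because the class owns the file object, callers cannot read from it behind the adaptor's back, which removes the out-of-sync risk described in the docstring.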

> class LineCountedInputFile(object):
>     """
>     add input line count to minimal input File interface
>     The file must be opened externally, and then passed into the
>     constructor.  All access should occur through the readLine method,
>     and not using normal File methods on the external file variable,
>     otherwise things will get out of sync and strange things could
>     happen, including incorrect line number.
>     """
>     __slots__ = (
>             '_inputFile',
>             '_lineNumber')

__slots__ is overkill here; it is intended as a way to save memory in classes that will have many instances, and is discouraged otherwise.
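
To illustrate what __slots__ changes, here is a small sketch with two hypothetical classes, Plain and Slotted. Instances of the slotted class have no per-instance __dict__, which is where the memory saving comes from, and they reject any attribute not listed in __slots__.

```python
class Plain(object):
    def __init__(self):
        self.lineNumber = 0

class Slotted(object):
    __slots__ = ('lineNumber',)

    def __init__(self):
        self.lineNumber = 0

plain = Plain()
slotted = Slotted()

plain.extra = 'fine'  # ordinary instances accept new attributes

try:
    slotted.extra = 'rejected'  # not listed in __slots__
    accepted = True
except AttributeError:
    accepted = False

# Slotted instances carry no per-instance dictionary
has_dict = hasattr(slotted, '__dict__')
```

For a class with only a handful of live instances, such as a file adaptor, the saving is negligible and the restriction on attributes is mostly an inconvenience.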

>     def __init__(self, inputFile):
>         """
>         create a LineCountedInputFile adaptor object
>         :param inputFile: existing File object already open for reading only
>         :type  inputFile: `File`
>         """
>         assert isinstance(inputFile, file)
>         assert not inputFile.closed and inputFile.mode == 'r'
>         self._inputFile = inputFile
>         self._lineNumber = 0
>     #----------------------------------------------------------------------
>     def readLine(self):

I would call this method readline() to match what file does.

>         """
>         call file.readline(), strip excess whitespace, increment line number
>         :return: next line of text from file minus excess whitespace, or
>                 None at end-of-file
>         :rtype:  str
>         """
>         line = self._inputFile.readline()
>         if len(line) == 0:
>             return None
>         self._lineNumber += 1
>         return line.strip()
>     #----------------------------------------------------------------------
>     def _get_fileName(self):
>         return self._inputFile.name
>     fileName = property(_get_fileName, None, None, """(read-only)""")

Maybe just call this property 'name' to match what file does.

>     #----------------------------------------------------------------------
>     def _get_lineNumber(self):
>         return self._lineNumber
>     lineNumber = property(_get_lineNumber, None, None, """(read-only)""")

The use of properties to make read-only attributes might also be overkill. I would just have a plain lineNumber attribute, but that is more of a judgement call.
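
The trade-off can be seen in a short sketch. The two class names below are hypothetical and exist only to put the approaches side by side: a property with no setter refuses assignment, while a plain attribute is shorter to write but can be changed by any caller.

```python
class WithProperty(object):
    def __init__(self):
        self._lineNumber = 0

    def _get_lineNumber(self):
        return self._lineNumber
    lineNumber = property(_get_lineNumber, None, None, "(read-only)")

class WithAttribute(object):
    def __init__(self):
        self.lineNumber = 0

p = WithProperty()
try:
    p.lineNumber = 10  # no setter was given, so this raises AttributeError
    property_writable = True
except AttributeError:
    property_writable = False

a = WithAttribute()
a.lineNumber = 10  # nothing prevents this
```

Whether the extra protection is worth the extra code depends on how much you trust the callers; for a small parser helper the plain attribute is usually enough.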

