[Tutor] Finding line number by offset value.

Peter Otten __peter__ at web.de
Mon Feb 22 04:33:35 EST 2016


Steven D'Aprano wrote:

> On Mon, Feb 22, 2016 at 01:41:42AM +0000, Alan Gauld wrote:
>> On 21/02/16 19:32, Cody West wrote:
> 
>> > I'm trying to take 48L, which I believe is the character number, and
>> > get the line number from that.

The documentation isn't explicit, but

"""
with open('/foo/bar/my_file', 'rb') as f:
  matches = rules.match(data=f.read())
"""

suggests that the library operates on bytes, not characters.

>> I'm not totally clear what you mean but, if it is that 48L
>> is the character count from the start of the file and you
>> want to know the line number then you need to count the
>> number of \n characters between the first and 48th
>> characters.
>> 
>> But that depends on your line-end system, of course;
>> there may be two characters at each EOL...
> 
> Provided your version of Python is built with "universal newline
> support", and nearly every Python is, then if you open the file in text
> mode, all end-of-lines are automatically converted to \n on reading.

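Be careful: *if* the numbers are byte offsets and you open the file in
universal newlines mode or text mode, your results will be unreliable.
Newline translation turns each "\r\n" pair into a single "\n", and decoding
(in Python 3) turns a multi-byte character into one character, so positions
in the text no longer match positions in the file.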
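
To illustrate the mismatch, here is a small sketch. The sample data is made
up (Windows line endings plus one non-ASCII character), and I'm assuming
UTF-8; it only shows that the same match sits at different positions in the
raw bytes and in the decoded text.

```python
import io

# Hypothetical sample data: "\r\n" line endings and one non-ASCII character.
data = "caf\u00e9\r\nsecond line\r\nthird line\r\n".encode("utf-8")

# Position a bytes-based tool would report for the match.
byte_offset = data.index(b"third")

# Decode the data the way text mode with universal newlines would.
reader = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=None)
text = reader.read()

# Position of the same match in the decoded text.
char_offset = text.index("third")

print(byte_offset, char_offset)  # the two numbers differ
```

Every "\r\n" and every multi-byte character before the match shifts the two
numbers further apart.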

> If the file is small enough to read all at once, you can do this:

> offset = 48
> text = the_file.read(offset)
> print(text.count('\n'))

It's the offset that matters, not the file size; the first 48 bytes of a 
terabyte file will easily fit into the memory of your Apple II ;)
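
Putting the pieces together, a minimal sketch, assuming the number really is
a byte offset and that you want 1-based line numbers. The function name
line_number_at and the sample offset are my own inventions, not part of any
library.

```python
import os
import tempfile


def line_number_at(path, offset):
    """Return the 1-based number of the line containing byte `offset`."""
    with open(path, "rb") as f:   # binary mode: no newline translation
        head = f.read(offset)     # reads at most `offset` bytes
    # Each b"\n" before the offset ends one earlier line.
    return head.count(b"\n") + 1


# Usage with a throwaway file; the contents and offset are hypothetical.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"first\r\nsecond\r\nthird\r\n")
try:
    result = line_number_at(tmp.name, 16)  # byte 16 is inside "third"
finally:
    os.remove(tmp.name)

print(result)
```

Counting b"\n" gives the same answer for "\n" and "\r\n" line endings, since
both contain exactly one "\n" per line break. It will not work for files
that use a bare "\r" as the line separator.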



