[Tutor] Finding line number by offset value.
Peter Otten
__peter__ at web.de
Mon Feb 22 04:33:35 EST 2016
Steven D'Aprano wrote:
> On Mon, Feb 22, 2016 at 01:41:42AM +0000, Alan Gauld wrote:
>> On 21/02/16 19:32, Cody West wrote:
>
>> > I'm trying to take 48L, which I believe is the character number, and
>> > get the line number from that.
The documentation isn't explicit, but
"""
with open('/foo/bar/my_file', 'rb') as f:
    matches = rules.match(data=f.read())
"""
suggests that the library operates on bytes, not characters.
>> I'm not totally clear what you mean but, if it is that 48L
>> is the character count from the start of the file and you
>> want to know the line number then you need to count the
>> number of \n characters between the first and 48th
>> characters.
>>
>> But that depends on your line-end system, of course;
>> there may be two characters at each EOL...
>
> Provided your version of Python is built with "universal newline
> support", and nearly every Python is, then if you open the file in text
> mode, all end-of-lines are automatically converted to \n on reading.
Be careful: *if* the numbers are byte offsets and you open the file in
universal newlines mode or text mode, your results will be unreliable,
because newline translation and decoding can change how many characters
precede a given position.
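To illustrate the mismatch, here is a minimal sketch. The sample data and
the offset are made up for the demonstration, and it assumes Python 3; it
only shows that the same index can land on different characters once
"\r\n" has been translated to "\n".

```python
import io

# Hypothetical sample data with Windows line endings (CRLF).
data = b"first\r\nsecond\r\nthird\r\n"
offset = 15  # byte offset of the "t" in "third"

# Bytes view: the offset points at the "t" of "third".
byte_view = data[offset:offset + 1]

# Text view: with newline=None (universal newlines) every "\r\n" is read
# as a single "\n", so everything after the first line end shifts left.
reader = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
text = reader.read()
text_view = text[offset:offset + 1]

print(byte_view, text_view)
```

With two line ends before the offset, the text index is off by two
characters. A multi-byte encoding would shift things further.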
> If the file is small enough to read all at once, you can do this:
>     offset = 48
>     text = the_file.read(offset)
>     print(text.count('\n'))
It's the offset that matters, not the file size; the first 48 bytes of a
terabyte file will easily fit into the memory of your Apple II ;)
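Putting that together, a rough sketch of the bytes-based approach might
look like this. It assumes the offset really is a byte offset and that
lines end in "\n" or "\r\n"; the helper name line_number_at and the sample
file are invented for the example, and it is written for Python 3.

```python
import os
import tempfile


def line_number_at(path, offset):
    """Return the 1-based number of the line containing byte `offset`."""
    # Binary mode: no decoding and no newline translation, so the
    # offset means the same thing here as it does on disk.
    with open(path, "rb") as f:
        head = f.read(offset)  # reads only the bytes before the offset
    # Both "\n" and "\r\n" line ends contain exactly one b"\n".
    return head.count(b"\n") + 1


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"first\r\nsecond\r\nthird\r\n")
    sample_path = tmp.name

try:
    line_of_first = line_number_at(sample_path, 0)
    line_of_third = line_number_at(sample_path, 15)
finally:
    os.remove(sample_path)

print(line_of_first, line_of_third)
```

Files that use a bare "\r" as the line end would need a different count.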