tail
Marco Sulla
Marco.Sulla.Python at gmail.com
Sat Apr 23 17:12:29 EDT 2022
On Sat, 23 Apr 2022 at 23:00, Chris Angelico <rosuav at gmail.com> wrote:
> > > This is quite inefficient in general.
> >
> > Why inefficient? I think that readlines() will be much slower, not
> > only more time consuming.
>
> It depends on which is more costly: reading the whole file (cost
> depends on size of file) or reading chunks and splitting into lines
> (cost depends on how well you guess at chunk size). If the lines are
> all *precisely* the same number of bytes each, you can pick a chunk
> size and step backwards with near-perfect efficiency (it's still
> likely to be less efficient than reading a file forwards, on most file
> systems, but it'll be close); but if you have to guess, adjust, and
> keep going, then you lose efficiency there.
Emh, why chunks? My function simply reads byte per byte and compares it to
b"\n". When it find it, it stops and do a readline():
def tail(filepath):
"""
@author Marco Sulla
@date May 31, 2016
"""
try:
filepath.is_file
fp = str(filepath)
except AttributeError:
fp = filepath
with open(fp, "rb") as f:
size = os.stat(fp).st_size
start_pos = 0 if size - 1 < 0 else size - 1
if start_pos != 0:
f.seek(start_pos)
char = f.read(1)
if char == b"\n":
start_pos -= 1
f.seek(start_pos)
if start_pos == 0:
f.seek(start_pos)
else:
for pos in range(start_pos, -1, -1):
f.seek(pos)
char = f.read(1)
if char == b"\n":
break
return f.readline()
This is only for one line and in utf8, but it can be generalised.
More information about the Python-list
mailing list