RE Module Performance
Chris Angelico
rosuav at gmail.com
Tue Jul 30 10:45:57 EDT 2013
On Tue, Jul 30, 2013 at 3:01 PM, <wxjmfauth at gmail.com> wrote:
> I am pretty sure that once you have typed your 127504
> ascii characters, you are very happy the buffer of your
> editor does not waste time in reencoding the buffer as
> soon as you enter an €, the 125505th char. Sorry, I wanted
> to say z instead of euro, just to show that backspacing the
> last char and reentering a new char implies twice a reencoding.
You're still thinking that the editor's buffer is a Python string. As
I've shown earlier, this is a really bad idea, and that has nothing to
do with FSR/PEP 393. An immutable string is *horribly* inefficient at
this; if you want to keep concatenating onto a string, the recommended
method is a list of strings that gets join()d at the end, and the same
technique works well here. Here's a little demo class that could make
the basis for such a system:
class EditorBuffer:
    def __init__(self, fn):
        self.fn = fn
        # Start with the whole file as a single chunk
        with open(fn) as f:
            self.buffer = [f.read()]

    def insert(self, pos, char):
        if pos == 0:
            # Special case: insertion at beginning of buffer
            if len(self.buffer[0]) > 1024:
                self.buffer.insert(0, char)
            else:
                self.buffer[0] = char + self.buffer[0]
            return
        for idx, part in enumerate(self.buffer):
            l = len(part)
            if pos > l:
                # Cursor is past this chunk; move on to the next one
                pos -= l
                continue
            if pos < l:
                # Cursor is somewhere inside this string
                splitme = self.buffer[idx]
                self.buffer[idx:idx+1] = splitme[:pos], splitme[pos:]
                l = pos
            # Cursor is now at the end of this string
            if l > 1024:
                # Chunk is big enough; start a new one after it
                self.buffer[idx:idx+1] = self.buffer[idx], char
            else:
                self.buffer[idx] += char
            return
        raise ValueError("Cannot insert past end of buffer")

    def __str__(self):
        return ''.join(self.buffer)

    def save(self):
        # Write back to the file the buffer was loaded from
        with open(self.fn, "w") as f:
            f.write(str(self))
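It guarantees that adding a character at the cursor never needs to grow
a string of more than about 1KB of text (inserting into the middle of a
big chunk first costs one split of that chunk). As a real basis for an
editor, it still sucks, but it's purely to prove this one point.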
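
For anyone who hasn't met the join() idiom mentioned above, here's a
minimal sketch of it next to plain concatenation. The function names
are made up for illustration, and how much copying += really does
depends on the interpreter, so treat the comments as the general idea
rather than a measurement:

```python
def build_by_join(pieces):
    """Collect the pieces in a list and join once at the end."""
    parts = []
    for piece in pieces:
        parts.append(piece)   # appending to a list copies no existing text
    return ''.join(parts)     # one pass over all the text at the end


def build_by_concat(pieces):
    """Repeated concatenation onto an immutable string."""
    text = ''
    for piece in pieces:
        text += piece         # may copy everything built so far
    return text


# Lots of ASCII followed by a single euro sign, as in the quoted scenario
pieces = ['a'] * 1000 + ['\u20ac']
assert build_by_join(pieces) == build_by_concat(pieces)
```

Both give the same string; the difference is only in how much text gets
copied along the way, which is the same reason the class above keeps a
list of chunks instead of one big string.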
ChrisA