yEnc implementation in Python, bit slow
Bengt Richter
bokr at oz.net
Wed Aug 6 04:34:39 EDT 2003
On 5 Aug 2003 14:37:55 +1000, Freddie <oinkfreddie at oinkshlick.oinknet> wrote:
>Freddie <oinkfreddie at oinkshlick.oinknet> wrote in
>news:Xns93CE8D81747C5freddiethescaryeleph at 218.100.3.9:
>
>Arr. There's an error here, the [n+i+256+1] shouldn't have a 1. I always get
>that wrong :) The posted files actually decode now, and the yEncode()
>overhead is a lot lower.
>
><snip>
>
>> encoded = []
>> n = 0
>> for i in range(0, len(translated), 256):
>> chunk = translated[n+i:n+i+256]
>> if chunk[-1] == '=':
>> chunk += translated[n+i+256] <<< this line
>> n += 1
>> encoded.append(chunk)
>> encoded.append('\n')
>
>--
>Remove the oinks!
Ok, I was't going to get too involved, but I thought first to try something different,
using a unicode translation table in the attempt to get all the translations in one sweep,
including the = escapes. It works (after hacking a while re encoding stuff ;-/ ), but it's
not that fast. But for small input strings it is faster than yours above, due to some things
which you can see I changed for yEncode2x below. That generally was fastest. I noticed big
differences according to input string length, so I decided to parameterize the testing,
to control the number of loops and the input string length by a steppable factor times
the 256-byte string you defined. Kind of got carried away with histograms to show throughput
comparisons, and ratios etc.
As you might expect, with large inputs, most of the time is spent in the translate routines,
and the setup and other overhead gets asymptoted out (verbing weirds language). The unicode
version's apparent throughput settles relatively early to something fairly constant. I thought
it could possibly beat the other versions because of the multiple replace passes, but not so.
Not sure where the major slowdown is.
On my old 300mhz P2 w/320MB ram and NT4 and Python 2.3, for your 256*1562 test strings,
I get:
[ 1:22] C:\pywk\clp>testyenc.py 10
Ratios of total test times over version 2x:
yEncode2: 1.888619 ratio/2x: 1.008296
yEncode3: 4.692256 ratio/2x: 2.505102
yEncode2x: 1.873080 ratio/2x: 1.000000
I.e., the average is practically identical for 2 and 2x, whereas the unicode (3)
is 2.5 times slower. Now for a minimal input (1 block of 256 bytes input):
[ 1:25] C:\pywk\clp>testyenc.py 10 1
Ratios of total test times over version 2x:
yEncode2: 0.019924 ratio/2x: 11.664868
yEncode3: 0.003890 ratio/2x: 2.277723
yEncode2x: 0.001708 ratio/2x: 1.000000
2x is over ten times faster than 2! And not quite as much faster than the unicode.
Comparing thoughput as ratios of how many times faster x2 is than 2 or 3
for 1-15 x 256 bytes input, 10 loops:
[ 1:25] C:\pywk\clp>testyenc.py -trh 10 1 15
1 13.254171 x/2: 222222222222222222222222222222222222222222222222222222222222222222
1 2.377331 x/3: 33333333333
2 8.490966 x/2: 222222222222222222222222222222222222222222
2 2.816887 x/3: 33333333333333
3 6.828756 x/2: 2222222222222222222222222222222222
3 2.932418 x/3: 33333333333333
4 5.663449 x/2: 2222222222222222222222222222
4 2.985414 x/3: 33333333333333
5 4.859705 x/2: 222222222222222222222222
5 3.008702 x/3: 333333333333333
6 4.408558 x/2: 2222222222222222222222
6 3.136186 x/3: 333333333333333
7 4.052319 x/2: 22222222222222222222
7 3.181253 x/3: 333333333333333
8 3.694041 x/2: 222222222222222222
8 3.220882 x/3: 3333333333333333
9 3.462013 x/2: 22222222222222222
9 3.271515 x/3: 3333333333333333
10 3.263672 x/2: 2222222222222222
10 3.283939 x/3: 3333333333333333
11 3.082912 x/2: 222222222222222
11 3.298241 x/3: 3333333333333333
12 2.951584 x/2: 22222222222222
12 3.319393 x/3: 3333333333333333
13 2.799223 x/2: 2222222222222
13 3.486121 x/3: 33333333333333333
14 2.653915 x/2: 2222222222222
14 3.383072 x/3: 3333333333333333
Now just kb/sec throughput, for input sizes of factor*256, steppin factor from 1 to 1562 by 100:
Oops, has to hand-adjust for 4 digits in factor (format had 3) :-/
I guess I'd recommend version yEncode2x ;-)
[ 1:30] C:\pywk\clp>testyenc.py -tkh 10 1 1562 100
1 1466.546 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1 124.638 : 22
1 640.256 : 333333333333
101 2775.813 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
101 2142.211 : 222222222222222222222222222222222222222222
101 913.269 : 333333333333333333
201 2699.580 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
201 2307.571 : 2222222222222222222222222222222222222222222222
201 901.330 : 333333333333333333
301 2719.819 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
301 2417.963 : 222222222222222222222222222222222222222222222222
301 898.533 : 33333333333333333
401 2469.232 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
401 2459.443 : 2222222222222222222222222222222222222222222222222
401 899.486 : 33333333333333333
501 2493.177 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
501 2326.776 : 2222222222222222222222222222222222222222222222
501 824.358 : 3333333333333333
601 2519.179 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
601 2177.450 : 2222222222222222222222222222222222222222222
601 867.528 : 33333333333333333
701 2485.202 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
701 2188.600 : 2222222222222222222222222222222222222222222
701 840.751 : 3333333333333333
801 2383.141 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
801 2283.640 : 222222222222222222222222222222222222222222222
801 853.356 : 33333333333333333
901 2462.079 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
901 2140.306 : 222222222222222222222222222222222222222222
901 837.844 : 3333333333333333
1001 2326.644 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1001 2167.332 : 2222222222222222222222222222222222222222222
1001 846.462 : 3333333333333333
1101 2154.616 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1101 2031.622 : 2222222222222222222222222222222222222222
1101 841.033 : 3333333333333333
1201 2015.904 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1201 2008.703 : 2222222222222222222222222222222222222222
1201 830.437 : 3333333333333333
1301 2102.667 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1301 2048.542 : 2222222222222222222222222222222222222222
1301 830.532 : 3333333333333333
1401 2117.665 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1401 2036.031 : 2222222222222222222222222222222222222222
1401 831.369 : 3333333333333333
1501 2106.134 : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1501 2030.204 : 2222222222222222222222222222222222222222
1501 823.828 : 3333333333333333
====< testyenc.py >==================================================================
## testyenc.py
## Orig as posted by Freddie + his correction:
## except as noted, and tabs changed to 4 spaces
## Subject: yEnc implementation in Python, bit slow
## From: Freddie <oinkfreddie at oinkshlick.oinknet>
## Message-ID: <Xns93CE3ABF6418freddiethescaryeleph at 218.100.3.9>
## Date: 5 Aug 2003 00:50:58 +1000
##
## followed by an improved version, followed by an experiment using a unicode translation table
##
def yEncode2(data):
trans = ''
for i in range(256):
trans += chr((i+42)%256)
translated = data.translate(trans)
# escape =, NUL, LF, CR
for i in (61, 0, 10, 13):
j = '=%c' % (i + 64)
translated = translated.replace(chr(i), j)
encoded = []
n = 0
for i in range(0, len(translated), 256):
chunk = translated[n+i:n+i+256]
if chunk[-1] == '=':
chunk += translated[n+i+256] # <<< this line changed per Freddie's post
# chunk += translated[n+i+256+1] <--was
n += 1
encoded.append(chunk)
encoded.append('\n')
result = ''.join(encoded)
#XXX# print len(result),
return result
#---------------------------------------------------
# above version somewhat optimized
def yEncode2x(data,
trans = ''.join([chr((i+42)%256) for i in xrange(256)]),
escapes = [(chr(i), '=%c'%(i+64)) for i in (61, 0, 10, 13)]
):
translated = data.translate(trans)
# escape =, NUL, LF, CR
for cold, cnew in escapes:
translated = translated.replace(cold, cnew)
encoded = []
start = 0; end = len(translated)
while start<end:
stop = start+256
if translated[stop-1:stop]=='=': stop+=1
encoded.append(translated[start:stop])
start = stop
encoded.append('')
return '\n'.join(encoded)
#---------------------------------------------------
def yEncode3(data,
# set up translation tables at def-time (expensive, but only once on import)
utrans = u''.join([
# >256 chars don't matter
(u>=256 and u'\x00') or
# escape if translated by +42 mod256 is in (=, NUL, LF, CR)
(((u+42)%256) in (61, 0, 10, 13) and unicode('=%c'%chr((u+42+64)%256), 'utf-16')) or
# translate others +42 mod 256
unicode('%c\x00'%chr((u+42)%256),'utf-16') for u in xrange(256*256)]),
identity = ''.join(map(chr,xrange(256)))
):
utranslated = unicode(data,'latin-1').translate(utrans)
no_lf = utranslated.encode('utf-16').translate(identity, '\x00')[2:]
ret = []; i = 0; eod = len(no_lf)
while i < eod:
top = i+256
if no_lf[top-1:top]=='=': top+=1
ret.append(no_lf[i:top])
i = top
ret.append('')
return '\n'.join(ret)
def test(nloops, factor=1562):
from time import clock
test = [chr(i) for i in xrange(256)]
teststr = ''.join(test*factor)
# sanity check
assert yEncode2(teststr) == yEncode3(teststr) == yEncode2x(teststr)
assert yEncode2(teststr[:-1]) == yEncode3(teststr[:-1]) == yEncode2x(teststr[:-1])
assert yEncode2(teststr+'?') == yEncode3(teststr+'?') == yEncode2x(teststr+'?')
t1 = clock()
for i in xrange(nloops): dummy = yEncode2(teststr)
t2 = clock()
for i in xrange(nloops): dummy = yEncode3(teststr)
t3 = clock()
for i in xrange(nloops): dummy = yEncode2x(teststr)
t4 = clock()
tenc2 = t2-t1
tenc3 = t3-t2
tenc2x = t4-t3
return tenc2, tenc3, tenc2x
def test2(opt,nloops, factorstart=1562, factorstop=1563, factorstep=1):
for factor in xrange(factorstart,factorstop,factorstep):
kb = nloops*256.0*factor/1024.
tenc2, tenc3, tenc2x = test(nloops, factor)
tenc2 = kb/tenc2; tenc3 = kb/tenc3; tenc2x = kb/tenc2x
ratio2x = tenc2x/tenc2; ratio3x=tenc2x/tenc3
if opt=='-trh':
print '%3s %10.6f x/2: %s'%(factor, ratio2x,'2'*int(ratio2x*5))
print '%3s %10.6f x/3: %s'%(factor, ratio3x,'3'*int(ratio3x*5))
elif opt=='-tkb':
print '%3s: e2x kb/s = %10.3f, e2 kb/s = %10.3f, e3 kb/s = %10.3f' %(factor,tenc2x, tenc2, tenc3)
elif opt=='-tkh':
print '%3s %10.3f : %s'%(factor, tenc2x,'x'*int(tenc2x/50))
print '%3s %10.3f : %s'%(factor, tenc2,'2'*int(tenc2/50))
print '%3s %10.3f : %s'%(factor, tenc3,'3'*int(tenc3/50))
else: raise ValueError, 'test2 only undestands one of: -trh -tkb -tkg'
if __name__ == '__main__':
def boxit(s): hbar = '-'*len(s); return ' +-%s-+\n | %s |\n +-%s-+' % (hbar, s, hbar)
import sys
args = sys.argv[1:]
try:
if args and args[0].startswith('-t'):
test2(args[0], *map(int,args[1:]))
else:
tenc2, tenc3, tenc2x = test(*map(int,sys.argv[1:]))
print (
'Ratios of total test times over version 2x:\n'
'yEncode2: %f ratio/2x: %f\n'
'yEncode3: %f ratio/2x: %f\n'
'yEncode2x: %f ratio/2x: %f' % (
tenc2, tenc2/tenc2x, tenc3, tenc3/tenc2x, tenc2x, tenc2x/tenc2x)
)
except Exception, e:
box = boxit('%s: %s'%(e.__class__.__name__, e))
print """
%(box)s
Usage: testyenc.py nloops factor <1562>
for nloops with factor*<256-byte-test-strings>,
outputting total times and ratios, or
testyenc.py opt nloops [start <1562> [stop <1563> [step <1>]]]
for testing the same with a range of factors, and
where <> is for defaults and opt:
-trh for ratio histogram of enc2x kb/sec over enc2 and enc3 rates
-tkb for kb/sec numbers for the three versions
-tkh for a kb/sec histogram
""" % vars()
=====================================================================================
Regards,
Bengt Richter
More information about the Python-list
mailing list