[Python-bugs-list] [ python-Bugs-549731 ] Unicode encoders appears to leak references

noreply@sourceforge.net noreply@sourceforge.net
Thu, 18 Jul 2002 16:07:01 -0700


Bugs item #549731, was opened at 2002-04-28 19:17
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=549731&group_id=5470

Category: Unicode
Group: Python 2.2.1 candidate
Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Unicode encoders appears to leak references

Initial Comment:
Note the following Debug Python session:

>>> s=u"anything"
[8189 refs]
>>> v=s.encode("utf8")
[10967 refs]
>>> v=s.encode("utf8")
[10968 refs]
>>> v=s.encode("utf8")
[10969 refs]
>>> v=s.encode("utf8")
[10970 refs]

Each call to encode is losing a reference.  Attaching a
test program that demonstrates this in more detail. 
The output from my test program is:

After 10000 iterations, lost 12850 references
[15227 refs]

and for 100000:
After 100000 iterations, lost 102850 references
[105227 refs]

etc.

As far as I can tell, this appears in all Python 2.x
versions.
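
The attached test program is not reproduced in this message. A minimal
sketch of the same measurement, assuming a debug build of Python (only
debug builds provide sys.gettotalrefcount()), might look like the
following. The function name count_lost_refs and the warm-up call are
illustrative choices, not taken from the original attachment.

```python
import sys


def count_lost_refs(iterations, encoding="utf8"):
    """Return how much the interpreter's total reference count grew over
    `iterations` calls to encode(), or None on a non-debug build."""
    # sys.gettotalrefcount() exists only in debug builds of Python.
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None
    s = u"anything"
    # Warm-up (an assumption of this sketch): the first call imports and
    # caches the codec, which legitimately creates new references.
    v = s.encode(encoding)
    before = get_total()
    for _ in range(iterations):
        v = s.encode(encoding)
    return get_total() - before


if __name__ == "__main__":
    n = 10000
    lost = count_lost_refs(n)
    if lost is None:
        print("sys.gettotalrefcount() unavailable; a debug build is needed")
    else:
        print("After %d iterations, lost %d references" % (n, lost))
```

On an interpreter without the leak, the reported growth should stay
small and should not scale with the iteration count.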

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2002-07-19 09:07

Message:
Logged In: YES 
user_id=14198

Checking in codecs.c;
/cvsroot/python/python/dist/src/Python/codecs.c,v  <--  codecs.c
new revision: 2.14; previous revision: 2.13
done


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-07-18 23:34

Message:
Logged In: YES 
user_id=38388

Perfect. I've marked it as Python 2.2.1 candidate. Please
also mention this in the checkin message.

Thanks. (And sorry for not getting back earlier -- my days
are indeed *very* long ;-)

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2002-07-18 09:07

Message:
Logged In: YES 
user_id=14198

A tickle for Marc, assuming his days aren't quite *that*
long <wink>.  Just give the OK and I will check it in.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-06-05 03:25

Message:
Logged In: YES 
user_id=33168

Basically, the code in the report would be fine as a test.
Purify *should* catch anything that causes the leak.
So:
    s = u'anything'
    assert s.encode('utf-8') == s.encode('utf-8')

should work.  Perhaps there is already a test for this,
and Purify didn't report leaks?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-06-04 17:20

Message:
Logged In: YES 
user_id=38388

I'll have a look later today.

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2002-06-04 14:42

Message:
Logged In: YES 
user_id=14198

Damn SourceForge - it went to the trouble of asking for my email
address when I submitted without being logged in, but it
doesn't seem to have done anything with it - so that was me,
just in case you weren't sure :)

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-06-04 14:39

Message:
Logged In: NO 

I'm not sure what sort of test you are suggesting I add.  I
think the patch is pretty obvious and reasonable, so MAL
should just check it in or assign it back to me <wink>. 
The earlier the better, really.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-06-04 11:35

Message:
Logged In: YES 
user_id=33168

Patch makes sense to me.
If you add a test, I may be able to catch the problem with Purify
the next time I run it (if Purify works).

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2002-04-28 20:39

Message:
Logged In: YES 
user_id=14198

Oops - too quick.  All calls to _PyCodec_Lookup() leak.
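
_PyCodec_Lookup() is a C-level function and cannot be called directly
from Python. A rough way to exercise the lookup path on its own,
assuming codecs.lookup() goes through that function and assuming a
debug build for sys.gettotalrefcount(), is sketched below. The helper
name lookup_ref_growth is hypothetical.

```python
import codecs
import sys


def lookup_ref_growth(encoding="utf8", iterations=1000):
    """Return the growth in total references over repeated codec lookups,
    or None when sys.gettotalrefcount() is unavailable."""
    # Only debug builds of Python expose sys.gettotalrefcount().
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None
    # Warm-up so the codec is already in the search cache before measuring.
    codecs.lookup(encoding)
    before = get_total()
    for _ in range(iterations):
        codecs.lookup(encoding)
    return get_total() - before


if __name__ == "__main__":
    print(lookup_ref_growth())
```

Growth that scales with the iteration count would point at the lookup
itself rather than at any particular encoder.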

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2002-04-28 20:05

Message:
Logged In: YES 
user_id=14198

Found it :)  Attaching patch.

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2002-04-28 19:26

Message:
Logged In: YES 
user_id=14198

s/decode/encode/ :)  I also meant to mention that the problem is not
restricted to UTF8 - changing the encoding in the test file
to anything other than 'ascii' seems to leak in the same way.

----------------------------------------------------------------------
