<p>Your approach (doing the right thing for both Python and C, new API to avoid the C performance problem) sounds good to me. </p>

<p>--<br>

Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)</p>

<div class="gmail_quote">On Nov 4, 2011 7:58 AM, Martin v. Löwis &lt;<a href="mailto:martin@v.loewis.de">martin@v.loewis.de</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

&gt; I started such a hack for the UTF-8 codec... It is really tricky, we should not<br>

&gt; do that!<br>

<br>

With the proper encapsulation, it&#39;s not that tricky. I have written<br>

functions PyUnicode_IndexToWCharIndex and PyUnicode_WCharIndexToIndex,<br>

and PyUnicodeEncodeError_GetStart and friends would use those functions.<br>

I&#39;d also need new functions such as PyUnicodeEncodeError_GetStartIndex to access<br>

the &quot;true&quot; start field.<br>

<br>

&gt;&gt; That would be expensive to compute<br>

&gt;<br>

&gt; Yeah, O(n) should be avoided when it is possible.<br>

<br>

Ok. I&#39;ll wait half a day or so for people to reconsider (now knowing<br>

that it&#39;s actually feasible to be fully backwards compatible); if nobody<br>

speaks up, I&#39;ll go ahead and accept the breakage.<br>

<br>

Regards,<br>

Martin<br>


</blockquote></div>
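<p>For readers following along, here is a rough Python sketch of the kind of index mapping the quoted message describes: translating between a code point index and a UTF-16 (wchar) code unit index. The function names and signatures below are hypothetical illustrations, not the C functions Martin wrote, and the sketch assumes a Python where each string element is a full code point (3.3 or later). It also shows why the conversion is O(n): every character before the index has to be inspected.</p>

```python
def index_to_wchar_index(text, index):
    """Map a code point index to a UTF-16 code unit index (hypothetical helper)."""
    # Each code point above U+FFFF is a surrogate pair in UTF-16,
    # so it contributes one extra code unit. This scan is O(n).
    extra = sum(1 for ch in text[:index] if ord(ch) > 0xFFFF)
    return index + extra


def wchar_index_to_index(text, wchar_index):
    """Map a UTF-16 code unit index back to a code point index (hypothetical helper)."""
    units = 0
    for i, ch in enumerate(text):
        if units >= wchar_index:
            return i
        units += 2 if ord(ch) > 0xFFFF else 1
    return len(text)


if __name__ == "__main__":
    sample = "a\U0001F600b"  # one non-BMP character between two ASCII letters
    print(index_to_wchar_index(sample, 2))
    print(wchar_index_to_index(sample, 3))
```

<p>In this sketch, a wchar index that falls in the middle of a surrogate pair is rounded up to the following code point. That is just one possible choice and not necessarily what the real functions do.</p>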