2008/8/22 Fredrik Lundh <<a href="mailto:fredrik@pythonware.com">fredrik@pythonware.com</a>>:<br>> On Fri, Aug 22, 2008 at 4:59 PM, Guido van Rossum <<a href="mailto:guido@python.org">guido@python.org</a>> wrote:<br>
><br>>>> (how's the 3.2/4.1 dual support implemented? do we have two distinct<br>>>> datasets, or are the differences encoded in some clever way? would it<br>>>> make sense to split the unicodedata module into three separate<br>
>>> modules, one for each major Unicode version?)<br>>><br>>> The current API looks fine to me: unicodedata is the latest version<br>>> whereas unicodedata.ucd_3_2_0 is the older version. The APIs are the<br>
>> same; there's a tiny bit of code in the generated _db.h file that<br>>> expresses the differences:<br>>><br>>> static const change_record* get_change_3_2_0(Py_UCS4 n)<br>>> {<br>>> int index;<br>
>> if (n >= 0x110000) index = 0;<br>>> else {<br>>> index = changes_3_2_0_index[n>>7];<br>>> index = changes_3_2_0_data[(index<<7)+(n & 127)];<br>
>> }<br>>> return change_records_3_2_0+index;<br>>> }<br>><br>> there's a bunch of data tables as well, but they don't seem to be very<br>> large. looks like Martin did a thorough job here.<br>
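For readers following the quoted C function: it is a two-stage table lookup. The high bits of the code point select a 128-entry block through an index table, and the low 7 bits select an entry inside that block. Below is a minimal Python sketch of the same idea, not the generator script's actual code. The names `build_tables`, `lookup`, and `sample` are hypothetical, and record index 0 is assumed to mean "no change".

```python
# Sketch of a two-stage (index + data) lookup table with 128-entry blocks.
# All names here are illustrative; they do not come from the CPython sources.
SHIFT = 7                 # block size is 1 << 7 == 128 code points
BLOCK = 1 << SHIFT
LIMIT = 0x110000          # one past the last Unicode code point


def build_tables(sparse):
    """Split a sparse {code point: record index} map into two stages.

    Identical blocks are stored only once, so the long runs of
    "no change" (record index 0) all share a single data block.
    """
    index, data, seen = [], [], {}
    for start in range(0, LIMIT, BLOCK):
        block = tuple(sparse.get(cp, 0) for cp in range(start, start + BLOCK))
        if block not in seen:
            seen[block] = len(data) >> SHIFT   # number of the new block
            data.extend(block)
        index.append(seen[block])
    return index, data


def lookup(index, data, n):
    """Same shape as the quoted C function: out of range maps to record 0."""
    if n >= LIMIT:
        return 0
    block = index[n >> SHIFT]
    return data[(block << SHIFT) + (n & (BLOCK - 1))]


# Hypothetical sample: two code points whose records differ between versions.
sample = {0x0041: 1, 0x10FFFF: 2}
index, data = build_tables(sample)
```

With this sample, only three distinct blocks are stored in `data`, while `index` still has one entry per 128 code points. That sharing is what keeps sparse difference tables small.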
><br>> ... digging digging digging ...<br>><br>> yes, the generator script produces difference tables between the main<br>> version and a list of older versions. I'd say it's worth running the<br>> script on the 5.1.0 tables, and if it doesn't choke, compare the<br>
> resulting table with the corresponding table for 4.1.0 (a simple loop<br>> fetching the main properties for all code points). if the differences<br>> look reasonably small, switch to 5.1.0 and keep the others.<br>
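The "simple loop fetching the main properties for all code points" could look roughly like the sketch below. It is written for a current Python 3 and compares the module-level functions against `unicodedata.ucd_3_2_0`, since those are the two datasets that ship today. The helper names `property_diffs` and `count_by_property` and the choice of which properties count as "main" are assumptions, not something taken from the thread.

```python
import sys
import unicodedata
from collections import Counter

# Properties treated as "main" here; this selection is an assumption.
PROPERTIES = ("category", "bidirectional", "combining",
              "east_asian_width", "mirrored", "decomposition")


def property_diffs(new, old, limit=sys.maxunicode + 1):
    """Yield (code point, property, old value, new value) for each mismatch.

    `new` and `old` are any two objects exposing the unicodedata lookup
    functions, for example the module itself and unicodedata.ucd_3_2_0.
    """
    for cp in range(limit):
        ch = chr(cp)
        for name in PROPERTIES:
            new_value = getattr(new, name)(ch)
            old_value = getattr(old, name)(ch)
            if new_value != old_value:
                yield cp, name, old_value, new_value


def count_by_property(new, old, limit=sys.maxunicode + 1):
    """Summarize how many code points differ for each property."""
    return Counter(name for _, name, _, _ in property_diffs(new, old, limit))
```

Calling `count_by_property(unicodedata, unicodedata.ucd_3_2_0)` gives a rough feel for how far two versions have drifted apart. Comparing two generated tables for 4.1.0 and 5.1.0 would follow the same pattern once both are available as objects with this interface.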
<br>Right, that's my hope as well. I believe the changes between 3.2 and 4.1 were much larger than more recent changes. (Yay convergence! :-)<br><br>> I can tinker a little with this over the weekend, unless Martin tells<br>
> me not to ;-)<br><br>That would be great!<br><br>-- <br>--Guido van Rossum (home page: <a href="http://www.python.org/~guido/">http://www.python.org/~guido/</a>)<br><br>