[Python-Dev] FW: Unicode character name hashing
Favas, Mark (EM, Floreat)
Mark.Favas@per.dem.csiro.au
Fri, 14 Jul 2000 13:18:42 +0800
Forgot to copy Python-Dev...
-----Original Message-----
From: Mark Favas [mailto:m.favas@per.dem.csiro.au]
Sent: Friday, 14 July 2000 7:00 AM
To: Bill Tutt
Subject: Re: Unicode character name hashing
Just tried it, and got the same message:
test_ucn
test test_ucn crashed -- exceptions.UnicodeError : Unicode-Escape
decoding error: Invalid Unicode Character Name
Cheers,
Mark
Bill Tutt wrote:
>
> Does this patch happen to fix it?
> I'm afraid my skills relating to signed overflow are a bit rusty... :(
>
> Bill
>
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Modules/ucnhash.c,v
> retrieving revision 1.2
> diff -u -r1.2 ucnhash.c
> --- ucnhash.c 2000/06/29 00:06:39 1.2
> +++ ucnhash.c 2000/07/13 21:41:07
> @@ -30,12 +30,12 @@
>
> len = cch;
> p = (unsigned char *) key;
> - x = 1694245428;
> + x = (long)0x64fc2234;
> while (--len >= 0)
> - x = (1000003*x) ^ toupper(*(p++));
> + x = ((0xf4243 * x) & 0xFFFFFFFF) ^ toupper(*(p++));
> x ^= cch + 10;
> - if (x == -1)
> - x = -2;
> + if (x == (long)0xFFFFFFFF)
> + x = (long)0xfffffffe;
> x %= k_cHashElements;
> /* ensure the returned value is positive so we mimic Python's %
> operator */
> if (x < 0)
> @@ -52,12 +52,12 @@
>
> len = cch;
> p = (unsigned char *) key;
> - x = -1917331657;
> + x = (long)0x8db7d737;
> while (--len >= 0)
> - x = (1000003*x) ^ toupper(*(p++));
> + x = ((0xf4243 * x) & 0xFFFFFFFF) ^ toupper(*(p++));
> x ^= cch + 10;
> - if (x == -1)
> - x = -2;
> + if (x == (long)0xFFFFFFFF)
> + x = (long)0xfffffffe;
> x %= k_cHashElements;
> /* ensure the returned value is positive so we mimic Python's %
> operator */
> if (x < 0)
>
> -----Original Message-----
> From: Mark Favas [mailto:m.favas@per.dem.csiro.au]
> Sent: Thursday, July 13, 2000 1:16 PM
> To: python-dev@python.org; Bill Tutt
> Subject: Unicode character name hashing
>
> [Bill has epiphany]
> >I just had a rather unhappy epiphany this morning.
> >f1 and f2 in ucnhash.c might not work on machines where sizeof(long) != 32 bits.
>
> I get the following from test_ucn on an Alpha running Tru64 Unix:
>
> python Lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character
> Name
>
> This is with the current CVS - and it's been failing this test for some
> time now. I'm happy to test any fixes...
>
> --
> Email - m.favas@per.dem.csiro.au Mark C Favas
> Phone - +61 8 9333 6268, 0418 926 074 CSIRO Exploration & Mining
> Fax - +61 8 9383 9891 Private Bag No 5, Wembley
> WGS84 - 31.95 S, 115.80 E Western Australia 6913
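
A possible reading of why the patch quoted above makes no difference on a machine with a 64-bit long, such as the Alpha: there, (long)0x8db7d737 is the positive value 2377635639, not the -1917331657 the original code started from, and masking with 0xFFFFFFFF also leaves a non-negative value, so the sign that a 32-bit long would carry into the % step is lost. The sketch below is only an illustration of that idea, not the fix that went into ucnhash.c. It runs the same recurrence in uint32_t and then reinterprets the result as a signed 32-bit quantity before the modulo. The names as_signed32 and hash32, the seed and modulus parameters, and the "add the modulus back" adjustment are assumptions made for the example; the diff does not show the real value of k_cHashElements or the line that follows "if (x < 0)".

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as a signed
   two's-complement value without relying on implementation-defined
   unsigned-to-signed conversion. */
static int64_t as_signed32(uint32_t u)
{
    return (u & 0x80000000u) ? (int64_t)u - 4294967296LL : (int64_t)u;
}

/* Hypothetical helper: the recurrence from the diff, computed in exactly
   32 bits regardless of sizeof(long).  `seed` stands in for the constants
   0x64fc2234 / 0x8db7d737, and `modulus` for k_cHashElements. */
static long hash32(const char *key, int cch, uint32_t seed, long modulus)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t x = seed;
    int len = cch;
    int64_t sx;
    long r;

    while (--len >= 0)
        x = (uint32_t)(1000003u * x) ^ (uint32_t)toupper(*p++);
    x ^= (uint32_t)(cch + 10);
    if (x == 0xFFFFFFFFu)           /* same bit pattern as -1 in 32 bits */
        x = 0xFFFFFFFEu;            /* same bit pattern as -2 in 32 bits */

    /* Recover the sign a 32-bit long would have had, then reduce. */
    sx = as_signed32(x);
    r = (long)(sx % modulus);
    if (r < 0)                      /* assumed adjustment: mimic Python's % */
        r += modulus;
    return r;
}
```

Called as, say, hash32("SPACE", 5, 0x64fc2234u, 1000), this gives the same result on 32-bit and 64-bit platforms, because every intermediate value is confined to 32 bits before the sign is interpreted. Note that as_signed32(0x8db7d737u) yields -1917331657, which is the original seed from revision 1.2.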