Re: [pypy-dev] [pypy-commit] pypy py3.3: Fix unicode.capitalize() test to pass with CPython3.3,
Those things are probably why pypy3 is so much slower than pypy2.

On Mon, Feb 1, 2016 at 12:42 AM, amauryfa <pypy.commits@gmail.com> wrote:
Author: Amaury Forgeot d'Arc <amauryfa@gmail.com>
Branch: py3.3
Changeset: r82021:44aa48e4d16a
Date: 2016-02-01 00:31 +0100
http://bitbucket.org/pypy/pypy/changeset/44aa48e4d16a/

Log: Fix unicode.capitalize() test to pass with CPython3.3, and implement it for PyPy. Probably not the fastest implementation...

diff --git a/pypy/objspace/std/test/test_unicodeobject.py b/pypy/objspace/std/test/test_unicodeobject.py
--- a/pypy/objspace/std/test/test_unicodeobject.py
+++ b/pypy/objspace/std/test/test_unicodeobject.py
@@ -217,7 +217,7 @@
         # check that titlecased chars are lowered correctly
         # \u1ffc is the titlecased char
         assert ('\u1ff3\u1ff3\u1ffc\u1ffc'.capitalize() ==
-                '\u1ffc\u1ff3\u1ff3\u1ff3')
+                '\u03a9\u0399\u1ff3\u1ff3\u1ff3')
         # check with cased non-letter chars
         assert ('\u24c5\u24ce\u24c9\u24bd\u24c4\u24c3'.capitalize() ==
                 '\u24c5\u24e8\u24e3\u24d7\u24de\u24dd')
diff --git a/pypy/objspace/std/unicodeobject.py b/pypy/objspace/std/unicodeobject.py
--- a/pypy/objspace/std/unicodeobject.py
+++ b/pypy/objspace/std/unicodeobject.py
@@ -155,13 +155,16 @@
         return unicodedb.islinebreak(ord(ch))

     def _upper(self, ch):
-        return unichr(unicodedb.toupper(ord(ch)))
+        return u''.join([unichr(x) for x in
+                         unicodedb.toupper_full(ord(ch))])

     def _lower(self, ch):
-        return unichr(unicodedb.tolower(ord(ch)))
+        return u''.join([unichr(x) for x in
+                         unicodedb.tolower_full(ord(ch))])

     def _title(self, ch):
-        return unichr(unicodedb.totitle(ord(ch)))
+        return u''.join([unichr(x) for x in
+                         unicodedb.totitle_full(ord(ch))])

     def _newlist_unwrapped(self, space, lst):
         return space.newlist_unicode(lst)
_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
https://mail.python.org/mailman/listinfo/pypy-commit

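For readers following the diff, the changed test expectation can be reproduced with a short sketch on a stock CPython 3 interpreter (3.3 or later, where the str methods apply full case mappings). It does not use PyPy's unicodedb module; the variable names are illustrative only. The expected value is built from upper() and lower() rather than by calling capitalize() directly, because capitalize() titlecases the first character since CPython 3.8 and would give a different answer there.

```python
# Python 3 sketch: simple (one-to-one) versus full (one-to-many) case mapping.
# Only built-in str methods are used; nothing here is PyPy's unicodedb API.

ch = '\u1ff3'  # GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI

# Full upper-casing expands this single character into two code points.
full_upper = ch.upper()

# Title-casing maps it to the single precomposed title-case character.
full_title = ch.title()

# Lower-casing the title-case character gives back the original one.
lowered_title = '\u1ffc'.lower()

# The value expected by the updated test: upper-case the first character
# with the full mapping, lower-case the rest.
s = '\u1ff3\u1ff3\u1ffc\u1ffc'
expected = s[0].upper() + s[1:].lower()

print([hex(ord(c)) for c in full_upper])
print(len(s), len(expected))
```

The result is one code point longer than the input, which is why a helper that returns exactly one character per input character can no longer express the mapping.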
2016-02-01 9:59 GMT+01:00 Maciej Fijalkowski <fijall@gmail.com>:
Those things are probably why pypy3 is so much slower than pypy2
Agreed. I prefer to have things correct first, and work on performance later. For this particular case though, do you have an idea on how to improve this? Pass the UnicodeBuilder to unicodedb.toupper_full()?
pypy-dev mailing list pypy-dev@python.org https://mail.python.org/mailman/listinfo/pypy-dev
-- Amaury Forgeot d'Arc
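One way to read the suggestion above (passing the builder into the case-mapping function) is sketched below in plain Python 3. Everything in it is hypothetical: SketchBuilder stands in for a builder with append() and build() methods, SPECIAL_UPPER is a two-entry excerpt rather than a real database, and toupper_into is an invented name, not an existing unicodedb function. The point is only that the mapping function appends straight into the builder, so no per-character list and no join() are needed.

```python
# Hypothetical sketch: the case-mapping function appends into a builder
# instead of returning a list of code points that the caller must join.

class SketchBuilder(object):
    """Stand-in for a string builder exposing append() and build()."""

    def __init__(self):
        self._parts = []

    def append(self, s):
        self._parts.append(s)

    def build(self):
        return ''.join(self._parts)


# Tiny illustrative excerpt of a one-to-many upper-case table
# (code point -> tuple of code points). Not a real database.
SPECIAL_UPPER = {
    0x00DF: (0x0053, 0x0053),  # LATIN SMALL LETTER SHARP S -> "SS"
    0x1FF3: (0x03A9, 0x0399),  # the character from the test in the diff
}


def toupper_into(builder, code):
    """Append the upper-case form of `code` directly to `builder`."""
    special = SPECIAL_UPPER.get(code)
    if special is not None:
        for cp in special:
            builder.append(chr(cp))
    else:
        # Stand-in for the common one-to-one table lookup.
        builder.append(chr(code).upper())


def upper(s):
    builder = SketchBuilder()
    for ch in s:
        toupper_into(builder, ord(ch))
    return builder.build()


print(upper('stra\u00dfe'))
```

Whether this actually helps in PyPy would depend on how the translated code and the JIT treat the list version, so it would need measuring rather than assuming.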
The problem is usually that "performance later" is never done, because no one looks at those places.

On Mon, Feb 1, 2016 at 11:06 AM, Amaury Forgeot d'Arc <amauryfa@gmail.com> wrote:
2016-02-01 9:59 GMT+01:00 Maciej Fijalkowski <fijall@gmail.com>:
Those things are probably why pypy3 is so much slower than pypy2
Agreed. I prefer to have things correct first, and work on performance later. For this particular case though, do you have an idea on how to improve this? Pass the UnicodeBuilder to unicodedb.toupper_full()?