[New-bugs-announce] [issue5793] Rationalize isdigit / isalpha / tolower / ... uses throughout Python source
Mark Dickinson
report at bugs.python.org
Sun Apr 19 14:57:47 CEST 2009
New submission from Mark Dickinson <dickinsm at gmail.com>:
Problem: the standard C character handling functions from ctype.h
(isalpha, isdigit, isxdigit, isspace, toupper, tolower, etc.) are
locale-aware, but for almost all uses CPython needs locale-unaware
versions of these.
There are various solutions in the current source:
- there's a file Include/bytes_methods.h which provides suitable
ISDIGIT/ISALPHA/... macros, but also undefines the standard functions.
As it is, it can't be included in Python.h, since that would break
third-party code that includes Python.h and also uses isdigit.
- some files have their own solution: Python/pystrtod.c defines
its own (probably inefficient) ISDIGIT and ISSPACE macros.
- in some places the standard C functions are just used directly (and
possibly incorrectly). A gotcha here is that one has to remember to apply
Py_CHARMASK to the argument first: passing a negative char value to these
functions is undefined behaviour and causes errors on some platforms.
(See issue 3633 for an example.)
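
To illustrate the Py_CHARMASK gotcha from the last item, here is a minimal
sketch. SKETCH_CHARMASK and count_digits are hypothetical names made up for
this example; the macro is only a stand-in for Py_CHARMASK and is assumed to
do the same job of forcing the value into the 0..255 range before it reaches
a ctype.h function.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for Py_CHARMASK: map a possibly negative
   char value into the range 0..255. */
#define SKETCH_CHARMASK(c) ((unsigned char)((c) & 0xff))

/* Count the bytes in s that isdigit() accepts.  Without the mask,
   a byte such as 0xE9 would be passed as a negative int on platforms
   where plain char is signed, which is undefined behaviour. */
static size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isdigit(SKETCH_CHARMASK(*s)))
            n++;
    }
    return n;
}
```

Note that the mask only makes the call well defined; the result of the ctype.h
function still depends on the current locale.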
It would be nice to clean all this up, and have one central, efficient,
easy-to-use set of Py_ISDIGIT/Py_ISALPHA ... locale-independent macros (or
functions) that could be used safely throughout the Python source.
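
As a rough sketch of what such a central set might look like (not a proposed
patch): the sk_* names below are hypothetical placeholders rather than the
eventual Py_IS* API, and the code assumes an ASCII-compatible execution
character set. Static functions are used instead of macros so that the
argument is evaluated only once.

```c
/* Hypothetical mask, assumed equivalent in purpose to Py_CHARMASK. */
#define SK_CHARMASK(c) ((unsigned char)((c) & 0xff))

/* Locale-independent classification: only ASCII characters qualify,
   whatever the current locale is.  Assumes ASCII ordering. */
static int sk_isdigit(int c)
{
    c = SK_CHARMASK(c);
    return c >= '0' && c <= '9';
}

static int sk_isupper(int c)
{
    c = SK_CHARMASK(c);
    return c >= 'A' && c <= 'Z';
}

static int sk_islower(int c)
{
    c = SK_CHARMASK(c);
    return c >= 'a' && c <= 'z';
}

static int sk_isalpha(int c)
{
    return sk_isupper(c) || sk_islower(c);
}

/* Space, \t, \n, \v, \f and \r, as in the "C" locale. */
static int sk_isspace(int c)
{
    c = SK_CHARMASK(c);
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Lower-case ASCII letters only; every other byte is returned unchanged. */
static int sk_tolower(int c)
{
    c = SK_CHARMASK(c);
    return sk_isupper(c) ? c + ('a' - 'A') : c;
}
```

A table-driven version (one 256-entry array of flag bits) would probably be
faster than the range comparisons used here.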
----------
components: Interpreter Core
keywords: easy
messages: 86170
nosy: eric.smith, marketdickinson
priority: normal
severity: normal
stage: needs patch
status: open
title: Rationalize isdigit / isalpha / tolower / ... uses throughout Python source
type: feature request
versions: Python 2.7, Python 3.1
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5793>
_______________________________________