String to number conversion

I miss a function able to convert the string representation of a Python primitive number (int, long, float, complex) to an actual number, basically the reverse of repr for numbers. The only option I know of that doesn't involve inspecting the individual digits is eval, which doesn't cut it -- besides the obvious security implications, it tends to be slow when used en masse. What I have in mind is something along the lines of:
A sample implementation, a variation of which I use in an in-house project, follows. It does inspect the string to determine its type, but does so in a simple C loop which is quite fast for numeric input. It leaves the actual conversion to Python's API code, which knows what it's doing. Because of that it's simple and fast, but still (to my knowledge) correct.
    #include <Python.h>
    #include <ctype.h>

    PyObject *
    read_number(PyObject *str)
    {
        int isfloat = 0, iscomplex = 0;

        /* First, intuit the type based on characters that appear in the
           string. */
        const char *s = PyString_AS_STRING(str);
        const char *end = s + PyString_GET_SIZE(str);
        const char *q;
        for (q = s; q < end; q++) {
            /* Cast: passing a negative char to isdigit() is undefined. */
            unsigned char c = (unsigned char) *q;
            if (c == '.' || c == 'e')
                isfloat = 1;
            else if (c == 'j')
                iscomplex = 1;
            else if (c != '-' && c != '+' && !isdigit(c) && !isspace(c)) {
                PyErr_Format(PyExc_ValueError,
                             "invalid numeric value '%s'", s);
                return NULL;
            }
        }

        /* Now that the type is known, construct the number, leaving the
           actual error checking to the constructors. */
        if (iscomplex)
            /* The argument list must be terminated with NULL. */
            return PyObject_CallFunctionObjArgs((PyObject *) &PyComplex_Type,
                                                str, NULL);
        else if (isfloat)
            return PyFloat_FromString(str, NULL);
        else
            /* handles ints and longs */
            return PyInt_FromString((char *) s, NULL, 10);
    }
Would anyone else find this kind of function useful?
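
The type-detection step of read_number() above can be tried out without the Python headers. The following is only a sketch: the names num_kind and classify_number are hypothetical and not part of any Python API, and the rules are copied from the scan loop in the message, not extended.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of the Python C API. */
typedef enum { NUM_INVALID, NUM_INT, NUM_FLOAT, NUM_COMPLEX } num_kind;

/* Guess the numeric type of s[0..len) from the characters it contains,
   using the same rules as the scan loop in read_number(). */
num_kind
classify_number(const char *s, size_t len)
{
    int isfloat = 0, iscomplex = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) s[i];
        if (c == '.' || c == 'e')
            isfloat = 1;
        else if (c == 'j')
            iscomplex = 1;
        else if (c != '-' && c != '+' && !isdigit(c) && !isspace(c))
            return NUM_INVALID;
    }
    if (iscomplex)
        return NUM_COMPLEX;
    return isfloat ? NUM_FLOAT : NUM_INT;
}
```

With these rules, "42" is classified as an int, "-1.5e10" as a float and "1+2j" as a complex. Strings such as "inf", "1E5" or "123L" are rejected by the scan itself, before any constructor sees them, which may or may not be what a general-purpose function should do.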

On 2008-05-07 17:27, Hrvoje Niksic wrote:
Yes.
Perhaps as PyNumber_FromString()?!
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, May 07 2008)
:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

On 7 maj 2008, at 17.46, M.-A. Lemburg wrote:
I rarely wrap primitives in objects, as this increases memory
usage, complexity and overhead. (Sometimes you _might_ need
to, but I can't come up with a scenario.)
So, in most cases this would probably happen:
    PyObject *num = PyNumber_FromString(num_s);
    if (num == NULL || !PyLong_Check(num)) {
        // Error...
    } else {
        self->event_id = PyLong_AsLongLong(num);
    }
And you lose, because this would be simpler:
    errno = 0;  /* strtoll() only sets errno on failure */
    self->event_id = strtoll(PyString_AsString(num_s), (char **)NULL, 10);
    if (errno != 0) {
        // Error...
    }
In other words, you rarely need to take a string that may represent
any kind of number and return it as whatever native number type
matches. (In the examples above the output type is known: a signed
64-bit integer.)
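
For the case where the output type is known, a slightly more defensive version of the strtoll() approach might look like the sketch below. The helper name parse_int64 is hypothetical; the point is only that errno has to be cleared before the call and that the end pointer should be checked, since strtoll() reports overflow through ERANGE and otherwise stops silently at the first character it cannot use.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 integer from s into a long long.
   Returns 0 and stores the value in *out on success, -1 on failure. */
int
parse_int64(const char *s, long long *out)
{
    char *end;
    long long value;

    errno = 0;                      /* strtoll() only sets errno on failure */
    value = strtoll(s, &end, 10);
    if (errno == ERANGE)            /* value does not fit in a long long */
        return -1;
    if (end == s || *end != '\0')   /* no digits at all, or trailing junk */
        return -1;
    *out = value;
    return 0;
}
```

A caller would then write something like `if (parse_int64(text, &self->event_id) < 0) { /* Error... */ }`, where text and self->event_id stand in for the names used in the message above.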
Could you maybe give a real-world scenario where this would be needed?
Where neither the input nor the output number type is known.
-- Rasmus Andersson Spotify +46 733 117 326

On 2008-05-07 18:28, Rasmus Andersson wrote:
Not really. The point of the PyNumber_* API is to work on numbers without actually caring or knowing the specific number types (Include/abstract.h).
You'd only do the final conversion to a specific number type at the very end of the calculation using e.g. PyNumber_Int().
In some cases, not even that, since all you're interested in is converting some object to a number object and then passing that to e.g. marshal.
Parsing numeric data and converting it to some other format.
-- Marc-Andre Lemburg eGenix.com

On 7 maj 2008, at 18.40, M.-A. Lemburg wrote:
Yes I know, but I can't really see a big need for this kind of
functionality in practice, but please, enlighten me! (maybe with a
scenario)
By the way, great to see something happening on this list. It has been
awfully quiet! :)
-- Rasmus Andersson Spotify +46 733 117 326

On 2008-05-07 18:50, Rasmus Andersson wrote:
Didn't I just give you a few?
-- Marc-Andre Lemburg eGenix.com

On 2008-05-07 19:14, Rasmus Andersson wrote:
You asked for examples of using the PyNumber_* API. All of these can use strings as input for the processing chain:
* work with numbers regardless of type, do the final conversion at the end (you only care about the final type)
* work with numbers, creating Python objects that can be passed to other Python mechanisms such as marshal (you only care about the fact that you're dealing with a number, not the type)
* parsing numeric data and preparing it for conversion to some other format, e.g. using PyString_Format()
-- Marc-Andre Lemburg eGenix.com

Rasmus Andersson <rasmus@spotify.com> writes:
Could you maybe give a real-world scenario where this would be needed?
It is needed when writing Python numbers to a file and reading them back, reliably. In that case strtoll and friends won't work because, while they work on one data type, they fail on others (float, complex, and (Python) long in the case of strtoll).
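
To make the limitation concrete, the sketch below reports how much of a string strtoll() actually consumes. The helper name strtoll_consumed is hypothetical and the sample inputs are only illustrations of how floats, complex numbers and large integers tend to be written.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: return the number of leading characters of s that
   strtoll() consumes in base 10, and set *overflow to 1 if the value did
   not fit in a long long. */
size_t
strtoll_consumed(const char *s, int *overflow)
{
    char *end;

    errno = 0;
    (void) strtoll(s, &end, 10);
    *overflow = (errno == ERANGE);
    return (size_t) (end - s);
}
```

For "42" the whole string is consumed. For "3.14" and "1+2j" only the leading "3" and "1" are, so the rest of the number is lost unless the caller checks the end pointer. For a 30-digit integer all digits are consumed but the overflow flag is set, whereas a Python long would hold the value exactly.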
