08.08.21 07:08, Stephen J. Turnbull writes:
Serhiy Storchaka writes:
Python integers have arbitrary precision. For serialization and interoperation with other programs and libraries we need to represent them [...]. [In the case of non-standard precisions,] [t]here are private C API functions _PyLong_AsByteArray and _PyLong_FromByteArray, but they are for internal use only.
I am planning to add public analogs of these private functions, but more powerful and convenient.
PyObject *PyLong_FromBytes(const void *buf, Py_ssize_t size, int byteorder, int is_signed)
Py_ssize_t PyLong_AsBytes(PyObject *o, void *buf, Py_ssize_t n, int byteorder, int is_signed, int *overflow)
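A minimal usage sketch of the proposed PyLong_AsBytes(); the function does not exist yet, and the conventions assumed here (byteorder 1 meaning little-endian, is_signed 1 meaning two's complement, *overflow being set instead of an exception being raised) are placeholders for whatever is finally agreed:

    #include <Python.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical use of the *proposed* PyLong_AsBytes(); all
     * conventions here are assumptions, not settled API. */
    static int
    get_int32(PyObject *o, int32_t *result)
    {
        unsigned char buf[4];
        int overflow = 0;
        Py_ssize_t n = PyLong_AsBytes(o, buf, sizeof(buf),
                                      /*byteorder=*/1, /*is_signed=*/1,
                                      &overflow);
        if (n < 0) {
            return -1;                  /* error already set */
        }
        if (overflow) {
            PyErr_SetString(PyExc_OverflowError,
                            "Python int does not fit in int32_t");
            return -1;
        }
        memcpy(result, buf, sizeof(*result));  /* little-endian host assumed */
        return 0;
    }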
I don't understand why such a complex API is useful as a public facility.
There are several goals:

1. Support conversion to/from all C integer types (char, short, int, long, long long, intN_t, intptr_t, intmax_t, wchar_t, wint_t and the corresponding unsigned types), POSIX integer types (pid_t, uid_t, off_t, etc.) and other platform- or library-specific integer types (like Tcl_WideInt in libtcl). Currently the only supported types are long, unsigned long, long long, unsigned long long, ssize_t and size_t. For other types you have to choose the most appropriate supertype (long or long long, sometimes providing several variants) and handle overflow manually, as in the sketch after this list. There are requests for PyLong_AsShort(), PyLong_AsInt32(), PyLong_AsMaxInt(), etc. It is better to provide a single universal function than to extend the API by several dozen functions.

2. Support different options for overflow handling. Different options are present in PyLong_AsLong(), PyLong_AsLongAndOverflow(), PyLong_AsUnsignedLongMask() and PyNumber_AsSsize_t(), but not all options are available for all types. There is no *AndOverflow() variant for unsigned types, size_t or ssize_t, and saturation is only available for ssize_t.

3. Support serialization of arbitrary-precision integers. This is used in pickle and random, and can be used to support other binary data formats.

All these goals can be achieved by a few universal functions.
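To make point 1 concrete, here is the kind of boilerplate currently needed for a type without a direct converter. PyLong_AsLong() and the error-handling calls are real CPython API; the helper name as_short is mine:

    #include <Python.h>
    #include <limits.h>

    /* Status quo: go through the nearest supertype and check the
     * range by hand, separately for every such target type. */
    static int
    as_short(PyObject *o, short *result)
    {
        long value = PyLong_AsLong(o);
        if (value == -1 && PyErr_Occurred()) {
            return -1;
        }
        if (value < SHRT_MIN || value > SHRT_MAX) {
            PyErr_SetString(PyExc_OverflowError,
                            "Python int too large to convert to C short");
            return -1;
        }
        *result = (short)value;
        return 0;
    }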
So I might want PyLong_AsGMPInt and PyLong_AsGMPRatio as well as the corresponding functions for MP, and maybe even PyLong_AsGMPFloat. The obvious way to write those is <library constructor>(str(python_integer)), I think.
PyLong_AsGMPInt() cannot be added unless GMP is included in the Python interpreter, which is very unlikely. Converting via the decimal representation is a very inefficient way, especially for very long integers (it is roughly quadratic in the size of the integer). I think GMP supports more efficient conversions.
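For comparison, the decimal detour suggested above looks like this in C. PyObject_Str(), PyUnicode_AsUTF8() and GMP's mpz_set_str() are all real API; rop must already be initialized with mpz_init():

    #include <Python.h>
    #include <gmp.h>

    /* The <library constructor>(str(python_integer)) route: every
     * conversion pays for a base-2 -> base-10 -> base-2 round trip. */
    static int
    pylong_to_mpz_via_str(PyObject *o, mpz_t rop)
    {
        PyObject *s = PyObject_Str(o);
        if (s == NULL) {
            return -1;
        }
        const char *digits = PyUnicode_AsUTF8(s);
        if (digits == NULL) {
            Py_DECREF(s);
            return -1;
        }
        int rc = mpz_set_str(rop, digits, 10);  /* 0 on success */
        Py_DECREF(s);
        return rc == 0 ? 0 : -1;
    }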
In the unlikely event that an application needs to squeeze out that tiny bit of performance, I guess the library constructors all accept buffers of bytes, too, probably with a similarly complex API that can handle whatever the Python ABI throws at them.
To use library constructors that accept buffers of bytes, we first need buffers of bytes. The proposed functions would provide the only such interface for converting Python integers to/from a buffer of bytes (see the sketch below).
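A sketch of that bytes-based path into GMP, with the same assumed conventions as in the earlier sketch, plus one more assumption that is not settled API: that PyLong_AsBytes() returns the number of bytes required, so an undersized call can be used to size the buffer. mpz_import() and mpz_set_ui() are real GMP API, and rop must already be initialized:

    #include <Python.h>
    #include <gmp.h>
    #include <stdlib.h>

    static int
    pylong_to_mpz(PyObject *o, mpz_t rop)
    {
        int overflow = 0;
        /* Assumed convention: a call with n = 0 returns the number
         * of bytes required. */
        Py_ssize_t n = PyLong_AsBytes(o, NULL, 0, 1, 0, &overflow);
        if (n < 0) {
            return -1;
        }
        if (n == 0) {
            mpz_set_ui(rop, 0);
            return 0;
        }
        unsigned char *buf = malloc((size_t)n);
        if (buf == NULL) {
            PyErr_NoMemory();
            return -1;
        }
        if (PyLong_AsBytes(o, buf, n, 1, 0, &overflow) < 0) {
            free(buf);
            return -1;
        }
        /* order -1: least-significant byte first; word size 1, so the
         * endian argument does not matter; nails 0: use all 8 bits. */
        mpz_import(rop, (size_t)n, -1, 1, 0, 0, buf);
        free(buf);
        return 0;
    }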
In which case why not just expose the internal functions?
If you mean _PyLong_FromByteArray/_PyLong_AsByteArray, it is because we should polish them before exposing them. They currently do not provide different options for overflow handling, and I think a more convenient spelling may be possible for the common case of native byte order. The names of the functions and the number and order of their parameters can be discussed; it is for such discussion that I opened this thread. If you have alternative propositions, please show them.
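For reference, the private declarations being discussed look like this (as of CPython 3.10, from longobject.h):

    PyObject *_PyLong_FromByteArray(const unsigned char *bytes, size_t n,
                                    int little_endian, int is_signed);
    int _PyLong_AsByteArray(PyLongObject *v, unsigned char *bytes, size_t n,
                            int little_endian, int is_signed);

Neither takes an overflow parameter: _PyLong_AsByteArray simply sets OverflowError and returns -1 when the value does not fit in n bytes.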
Is it at all likely that that representation would ever change?
They do not rely on the internal representation. They are for an implementation-independent representation.