Re: Integer concatenation to byte string
We were no longer on the ideas list...
On 2 Mar 2021, at 13:04, Memz
wrote: There is no specific scenario it solves. The lack of efficiency of the timed code should speak for itself. Non-mutable bytes is a limit of python, since it's reliant on using function calls.
b"\x00\x00\x00\x00\x00"
bytearray(b"\x00\x00\x00\x00\x00")
struct.pack("iiiii", 0, 0, 0, 0, 0)
b"\x00\x00\x00\x00\x00" + bytes([1, 0, 0, 0, 0])
You mean the above is what you timed? It is not a realistic problem you are measuring. You also did not share how you measured the code. If you are not experienced in benchmarking, there are variables that you must control for to have meaningful results.
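For readers who want to reproduce a comparison like this, here is a minimal sketch of a controlled measurement using the standard timeit module. It is not the code Memz ran (that was never posted); the labels, the `benchmark` helper, and the loop counts are assumptions made for illustration.

```python
import struct
import timeit

# The four expressions quoted above, each timed under identical conditions.
CANDIDATES = {
    "bytes literal": r'b"\x00\x00\x00\x00\x00"',
    "bytearray()": r'bytearray(b"\x00\x00\x00\x00\x00")',
    "struct.pack()": 'struct.pack("iiiii", 0, 0, 0, 0, 0)',
    "literal + bytes()": r'b"\x00\x00\x00\x00\x00" + bytes([1, 0, 0, 0, 0])',
}


def benchmark(number=100_000, repeat=5):
    """Return the best time per loop, in nanoseconds, for each candidate."""
    results = {}
    for label, stmt in CANDIDATES.items():
        timer = timeit.Timer(stmt, globals={"struct": struct})
        # Take the minimum of several runs to reduce noise from other processes.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[label] = best / number * 1e9
    return results


if __name__ == "__main__":
    for label, nsec in benchmark().items():
        print(f"{label:20s} {nsec:8.1f} ns per loop")
```

Two caveats apply when reading the numbers. On CPython a bare literal used as a statement is discarded by the compiler, so the first row largely measures the timing loop itself. Also, struct.pack("iiiii", ...) builds 20 bytes on platforms where a C int is 4 bytes, so it is not producing the same 5-byte value as the other three expressions.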
All function calls take time and resources, it would be impossible to streamline a function call to make it faster than building it in. This goes for most languages, including C.
All python byte code is interpreted by calling functions. They take time and resources. Barry
On Tue, Mar 2, 2021 at 3:29 AM Barry Scott
mailto:barry@barrys-emacs.org> wrote: On 2 Mar 2021, at 02:03, Memz
mailto:mmax42852@gmail.com> wrote: When I needed to do the creation of bytes objects from a mix of types the struct.pack() method has been the obvious way to go.
What is the use case that leads to needing the above?
Barry
The use of my suggestion is to reduce the reliance on function calls for bytes and bytearrays, which currently can only be done through function calls, and to make it more efficient altogether. Here is a timed version of this:
b-strings: 5-5.4 s
bytearray() function call: 67.9 s
struct.pack(): 80.9 s
b-string + bytes([1,...]): 54.3 s
What code are you benchmarking? What problem does that code solve?
My experience with creating byte objects is that struct is the fastest way to get the job done. But that could be because of the problems that I need bytes for. For example, calling ioctl().
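As an illustration of the struct approach Barry describes, the sketch below packs a mix of integer types into a bytes object and unpacks it again. The record layout and the helper names `pack_header` and `unpack_header` are hypothetical, chosen only for this example; they are not taken from the thread or from any real ioctl() request.

```python
import struct

# Hypothetical record layout, for illustration only:
# 1-byte version, 2-byte flags, 4-byte payload length,
# little-endian ("<"), which also disables alignment padding.
HEADER = struct.Struct("<BHI")


def pack_header(version, flags, length):
    """Build the 7-byte header from three Python ints."""
    return HEADER.pack(version, flags, length)


def unpack_header(data):
    """Recover the three fields from the start of a bytes-like object."""
    return HEADER.unpack(data[:HEADER.size])


if __name__ == "__main__":
    raw = pack_header(1, 0x0203, 5)
    print(raw.hex())
    print(unpack_header(raw))
```

Precompiling the format with struct.Struct avoids re-parsing the format string on every call, which matters only if the packing happens in a hot loop.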
Moving bytearray to a non-function call would overhaul and optimize code that works with bytes, increase flexibility, and reduce reliance on imports, including struct.pack(). There is a lot of code that could, and should, be using bytearray but can't, because other methods, slower and more painful than they should be, are still more efficient owing to the function-call nature of bytearray.
You are assuming that the problem is the function calls. Surely it's the algorithms that lead to issues in most code that is slow?
Barry
Memz, Please keep your responses on the mailing list. On Tue, Mar 02, 2021 at 08:07:39PM +0000, Barry Scott wrote:
On 2 Mar 2021, at 13:04, Memz
wrote: There is no specific scenario it solves. The lack of efficiency of the timed code should speak for itself. Non-mutable bytes is a limit of python, since it's reliant on using function calls.
"Lack of efficiency" doesn't speak for itself. You haven't shown how you benchmarked this, so we don't know if it is a valid comparison or not, but generally speaking I will allow that there is some function call overhead in Python. In this case you have:

- create a bytes string object;
- look up the name bytearray, which requires two dict lookups (one in the global scope that fails, one in the builtins scope that succeeds);
- then call the function with the bytes string object as argument;
- and finally the bytes object is garbage collected.

So it's reasonable to assume that this has some overhead. The overhead might even be significant if, for example, you create a temporary 10 GB byte string so you can append one byte to the end. But we don't typically care about optimizing for such unusual and extreme cases.

If you are trying to squeeze out every last nanosecond of performance, you're probably using the wrong language. Or at least the wrong interpreter. You might like to try PyPy, or some of the other specialising interpreters. Or write your critical code in Cython, or use ctypes, or write it as a C extension.

But honestly, I expect that you are falling into the trap of premature optimization. I presume that once you have your mutable bytearray object, you're actually going to do some work with it. It is quite likely that for any real example, not made-up Mickey-Mouse toy code, the time it takes to initialise the byte array object will be a negligible fraction of the time it takes your application to actually process the byte array object.

Who cares if it takes 130 nanoseconds to initialise the byte array object, if you then go on to spend ten million nanoseconds working with it? We don't typically make large language changes for the sake of micro-benchmarks.
[steve ~]$ python3.9 -m timeit "bytearray(b'abcdefghijklmnop')"
2000000 loops, best of 5: 131 nsec per loop

It's not inconceivable that in a tight loop where you have to make many bytearrays but do very little with them, the initialisation cost is significant. But to justify adding literal syntax to the language we would need to see some strong justification that the function call overhead not only is significant, but it is *frequently* a bottleneck in the code.

[Barry]
All python byte code is interpreted by calling functions. They take time and resources.
That's not entirely correct. Literals such as text strings, ints and floats get compiled directly into the byte-code. Now of course there is some overhead while executing the byte-code, but that doesn't include the heavy cost of a Python function call. -- Steve
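The difference Steve points to can be inspected with the standard dis module. The following is a small sketch, not something posted in the thread: it compiles a bytes literal and a bytearray() call and compares what ends up in each code object. The helper name `opnames` is made up for this example, and exact opcode names vary between CPython versions.

```python
import dis

# Compile both expressions without executing them.
literal_code = compile('b"abc"', "<literal>", "eval")
call_code = compile('bytearray(b"abc")', "<call>", "eval")


def opnames(code):
    """List the opcode names in a code object, in order."""
    return [ins.opname for ins in dis.get_instructions(code)]


# The literal is stored as a constant; no names need to be looked up.
print(literal_code.co_consts, literal_code.co_names)
print(opnames(literal_code))

# The call must look up the name 'bytearray' at run time and then
# execute a call instruction on top of loading the same constant.
print(call_code.co_consts, call_code.co_names)
print(opnames(call_code))
```

The literal needs only a constant load, while the call adds a name lookup and a call opcode, which is the overhead listed step by step earlier in the thread.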
On 2 Mar 2021, at 23:49, Steven D'Aprano
wrote: [Barry]
All python byte code is interpreted by calling functions. They take time and resources.
That's not entirely correct. Literals such as text strings, ints and floats get compiled directly into the byte-code. Now of course there is some overhead while executing the byte-code, but that doesn't include the heavy cost of a Python function call.
I was thinking of the C functions that are executed in ceval.c to run the interpreter for any byte code. Barry
-- Steve
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7GTPEG...
Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, Mar 3, 2021, at 04:09, Barry Scott wrote:
I was thinking of the C functions that are executed in ceval.c to run the interpreter for any byte code.
In that case, it's not clear how your proposed syntax would not have the same overhead [especially your suggestion of a += operator]
participants (3)
- Barry Scott
- Random832
- Steven D'Aprano