A way to subscript a single integer from bytes

Hi all,

So I'm pretty sure everyone here is familiar with how the "bytes" object works in Python 3. It acts mostly like a string, with the exception that 0-dimensional subscripting (var[idx]) returns an integer, not a bytes object - the integer being the ordinal number of the corresponding character. However, 1-dimensional subscripting (var[idx1:idx2]) returns a bytes object. Example:

    >>> a = b'hovercraft'
    >>> a[0]
    104
    >>> a[4:8]
    b'rcra'

Though this isn't exactly unexpected behavior (it's not possible to accidentally do 1-dimensional subscripting and expect an integer; it's a different syntax), it's still a shame that it isn't possible to quickly and easily subscript an integer out of it. Following up from the previous example, the only way to get 493182234161465432041076 out of b'hovercraft' in a single expression is as follows:

    list(__import__('itertools').accumulate((i for i in a), lambda x, y: (x << 8) + y))[-1]

Now, I'm not proposing changing the 1-dimensional subscripting syntax to return an integer - that would be backwards incompatible, tsk tsk! No, instead, I'm simply suggesting a method of bytes objects, which would do something like this (assume the method is called "subint"):

    >>> a = b'hovercraft'
    >>> a.subint(0, -1)  # -1 is equivalent to len(a)
    493182234161465432041076

Much as I would think that such subscripting would deserve special syntax (perhaps bytes{idx1:idx2}), I don't think this special case is special enough to break the rules. So I'm sticking with the method idea.

What are your thoughts?

Sincerely,
Ken
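For readers following along: the int.from_bytes class method mentioned later in the thread, combined with ordinary slicing, appears to cover the behaviour requested above. The sketch below is only an illustration and not part of the original post. The free function `subint` borrows its name from the proposal and is hypothetical, and using `None` (rather than the proposed -1) to mean "up to the end" is an assumption that follows normal slice semantics.

```python
def subint(data, start=0, stop=None, byteorder="big"):
    """Hypothetical helper: read data[start:stop] as one unsigned integer."""
    # Slicing returns a bytes object; int.from_bytes turns it into an int.
    return int.from_bytes(data[start:stop], byteorder)


a = b"hovercraft"

whole = subint(a)            # equivalent to int.from_bytes(a, "big")
first_two = subint(a, 0, 2)  # b"ho" read as a 16-bit big-endian value

print(whole)
print(hex(first_two))
```

Because the helper delegates to slicing, negative indices keep their usual meaning, so `subint(a, 0, -1)` would drop the last byte instead of including it.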

On 1 May 2018 at 21:30, Antoine Pitrou <solipsis@pitrou.net> wrote:
It's also worth noting that if there's more than one integer of interest in the string, then using the struct module is often going to be better than using multiple slices and int.from_bytes calls:

    >>> import struct
    >>> data = b"hovercraft"
    >>> struct.unpack(">IIH", data)
    (1752135269, 1919119969, 26228)

(The struct module doesn't handle arbitrary length integers, but it handles 8, 16, 32, and 64 bit ones, which is enough for a lot of common use cases.)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
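To make the comparison concrete, here is a small sketch (an illustration, not from the original message) that decodes the same ten bytes both ways: once with a single struct.unpack call and once with the "multiple slices and int.from_bytes calls" approach. The variable names `fields` and `manual` are made up for the example.

```python
import struct

data = b"hovercraft"

# One call: two 32-bit and one 16-bit unsigned integers, big-endian (">").
fields = struct.unpack(">IIH", data)

# The same three values via explicit slices, one int.from_bytes call each.
manual = (
    int.from_bytes(data[0:4], "big"),
    int.from_bytes(data[4:8], "big"),
    int.from_bytes(data[8:10], "big"),
)

# With ">" there is no padding, so the format covers exactly 4 + 4 + 2 bytes.
size = struct.calcsize(">IIH")

print(fields)
print(manual)
print(size)
```

The format string has to match the length of the data exactly; struct.unpack raises struct.error otherwise, which is one reason the slicing approach can still be handy for odd-sized fields.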

On Tue, May 01, 2018 at 07:22:52PM +0800, Ken Hilton wrote:

> The only way to get 493182234161465432041076 out of b'hovercraft'

You seem to be using a bytes object as a base-256 number. Under what circumstances is this desirable?

> in a single expression is as follows:

What's so special about this operation that it needs to be a single expression?

The easy way to do this is:

    from functools import reduce  # not needed in Python 2
    reduce(lambda m, n: m*256 + n, b'hovercraft', 0)

Do the import once at the top of your module, and then you can call reduce as many times as you like. If you really, really, really want to make it a one-liner, perhaps to win a bet, or because the Enter key on your keyboard is broken, then you can use the __import__('functools') trick.

    __import__('functools').reduce(lambda m, n: m*256 + n, b'hovercraft', 0)

But don't do that.

> list(__import__('itertools').accumulate((i for i in a), lambda x, y: (x << 8) + y))[-1]

'(i for i in a)' is best written as 'iter(a)' if you must have an iterator, or just 'a' if you don't care what sort of iterable it is.

I don't even know why you would want to do it in the first place, let alone why you think it is special enough to dedicate syntax to doing it.

--
Steve
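As a sanity check on the expressions discussed in this message, the following sketch (an illustration, not part of the original reply) runs the reduce version, the accumulate version with the plain bytes object passed directly, and int.from_bytes side by side. The variable names are made up for the example.

```python
from functools import reduce
from itertools import accumulate

a = b"hovercraft"

# Iterating over a bytes object yields ints, so it can be fed to reduce as is.
via_reduce = reduce(lambda m, n: m * 256 + n, a, 0)

# accumulate accepts any iterable, so no generator expression or iter() is needed.
via_accumulate = list(accumulate(a, lambda x, y: (x << 8) + y))[-1]

# The built-in conversion, treating the bytes as a big-endian base-256 number.
via_from_bytes = int.from_bytes(a, "big")

# For empty input, reduce (with its initial value of 0) and int.from_bytes both
# give 0, while the accumulate version would fail on the [-1] lookup.
empty_reduce = reduce(lambda m, n: m * 256 + n, b"", 0)
empty_from_bytes = int.from_bytes(b"", "big")

print(via_reduce, via_accumulate, via_from_bytes)
```

All three agree on the example input; the difference on empty input is one practical argument for preferring reduce with an explicit initial value, or int.from_bytes, over the accumulate one-liner.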

participants (4)
- Antoine Pitrou
- Ken Hilton
- Nick Coghlan
- Steven D'Aprano