converting an array of chars to a string

Bengt Richter bokr at oz.net
Mon Jun 24 13:04:18 EDT 2002


On Mon, 24 Jun 2002 13:40:51 +0200, JB <jblazi at hotmail.com> wrote:

>Bengt Richter wrote:
>
>> Have a look at the array module. It can construct an array
>> of signed or unsigned character elements optionally
>> initialized with values from a list or string. 
>[...]
>
>Thank you very much. In principle, this was exactly what I
>was looking for. But most unfortunately, if I declare such
>an array a, then
>
>a[0]=210
>a[0] += 46
>
>produces an error instead of the correct result 0. Now I am 
>a bit desperate.
No cause for desperation ;-)

Are you saying that you want modulo 256 arithmetic on unsigned
8-bit data? No problem. But what does your data represent? And
what does adding a number to such a data item mean in the real world?

Let's look at the problem:

 >>> import array
 >>> a = array.array('B',[210])
 >>> a
 array('B', [210])
 >>> a[0]
 210
 >>> a[0] += 46
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 OverflowError: unsigned byte integer is greater than maximum

It's just saying you tried to store 256 in a byte. You can't
expect the array module to know what policy you want to use
in dealing with oversize numbers. For another person's application
it might be just as reasonable to clamp >255 at the max value of 255,
and clamp <0 at 0.
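
To make the "clamp" policy concrete, here is a minimal sketch of what that other person might write. The helper name add_clamped is made up for illustration; it is not part of the array module.

```python
import array

def add_clamped(arr, i, delta):
    """Add delta to arr[i], saturating at the 0..255 limits of an unsigned byte.

    Hypothetical helper, shown only to contrast with modulo-256 wrapping.
    """
    arr[i] = max(0, min(255, arr[i] + delta))

a = array.array('B', [210])
add_clamped(a, 0, 46)     # 256 does not fit, so the value sticks at 255
print(a[0])               # 255
add_clamped(a, 0, -300)   # a negative result sticks at 0
print(a[0])               # 0
```

Both policies are defensible, which is why the array module picks neither and raises an error instead.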

If you want to enforce modulo 256, you can:
 >>> a[0]
 210
 >>> a[0] = (a[0]+46)%256
 >>> a[0]
 0

Or do it with a bit mask and bitwise '&' operation:
 >>> a[0]=210
 >>> a[0]
 210
 >>> a[0] = (a[0]+46)&255
 >>> a[0]
 0
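
The two spellings above are interchangeable for Python integers, including negative intermediate results (e.g. after a subtraction). A quick check, with sample values chosen arbitrarily:

```python
samples = (-300, -1, 0, 255, 256, 511)
for value in samples:
    # Both forms keep only the low 8 bits of the result.
    assert value % 256 == value & 255

print((210 + 46) % 256, (210 + 46) & 255)   # 0 0
print((0 - 1) % 256, (0 - 1) & 255)         # 255 255
```

This relies on Python's rule that % with a positive divisor never returns a negative number; don't assume the same in other languages.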

Or you could write yourself a wrapper to hide that stuff just for
indexed single item access:

 >>> class A256:
 ...     def __init__(self, byte_arr): self.a = byte_arr
 ...     def __getitem__(self, i): return self.a[i]
 ...     def __setitem__(self, i, v):
 ...         self.a[i] = v&255
 ...     def __len__(self): return len(self.a)
 ...

Now make the array as usual
 >>> import array
 >>> a = array.array('B',[210,46,0,255])

Wrap it
 >>> w = A256(a)

Try some indexed sequence things
 >>> w[0]
 210
 >>> len(w)
 4
 >>> [x for x in w]
 [210, 46, 0, 255]

Note that we can see the original
array as an attribute of the wrapper
 >>> w.a
 array('B', [210, 46, 0, 255])

And the same thing directly, since we
kept a separate 'a' binding
 >>> a
 array('B', [210, 46, 0, 255])

Add some recognizable stuff using array methods
(that aren't available as methods of this wrapper)
 >>> w.a.fromstring('<ABC>')

Notice that 'a' refers to the same array
 >>> a
 array('B', [210, 46, 0, 255, 60, 65, 66, 67, 62])
 >>> a.tostring()
 '\xd2.\x00\xff<ABC>'
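
A note for anyone replaying this on Python 3: fromstring() and tostring() were removed from array objects in Python 3.9. The replacements are frombytes() and tobytes(), which take and return bytes objects. A sketch of the same step, assuming Python 3:

```python
import array

a = array.array('B', [210, 46, 0, 255])
a.frombytes(b'<ABC>')     # Python 3 spelling of the fromstring() call
print(a)                  # array('B', [210, 46, 0, 255, 60, 65, 66, 67, 62])
print(a.tobytes())        # b'\xd2.\x00\xff<ABC>'
```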

Now your complaint:
 >>> a[0] +=46
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 OverflowError: unsigned byte integer is greater than maximum

Which is what it was designed to do, but we now have a wrapper
 >>> w[0] +=46
 >>> w[0]
 0

Which gave the "correct" result, and is also visible through 'a'
 >>> a[0]
 0

Look at another item both ways
 >>> a[1],w[1]
 (46, 46)

Add with no expected overflow
 >>> w[1]+=209
 >>> a[1],w[1]
 (255, 255)

Now expect overflow and "correction"
 >>> w[1]+=1
 >>> a[1],w[1]
 (0, 0)

See it in the array both ways
 >>> a
 array('B', [0, 0, 0, 255, 60, 65, 66, 67, 62])
 >>> w.a
 array('B', [0, 0, 0, 255, 60, 65, 66, 67, 62])
 >>>
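
In case it isn't obvious why w[0] += 46 works when a[0] += 46 doesn't: an augmented assignment on an indexed item is carried out roughly as a read, an ordinary addition, and a store. The addition itself never overflows; only the store can fail, and the wrapper masks the value before the array sees it. A sketch that spells out the steps by hand (the class is repeated so the example stands alone):

```python
import array

class A256:
    """Same wrapper as in the session: masks every stored value to 8 bits."""
    def __init__(self, byte_arr): self.a = byte_arr
    def __getitem__(self, i): return self.a[i]
    def __setitem__(self, i, v): self.a[i] = v & 255
    def __len__(self): return len(self.a)

w = A256(array.array('B', [210]))

# w[0] += 46 is carried out roughly as these three steps:
old = w.__getitem__(0)    # read 210 from the underlying array
new = old + 46            # plain int arithmetic gives 256, no error yet
w.__setitem__(0, new)     # the mask is applied here, so 0 is stored
print(w[0])               # 0
```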

Wrapping like that will cost a little speed, so you'll
have to weigh that cost against the convenience. Unless
you're processing megabytes regularly, you probably
don't have to worry. If speed does become a problem,
there are ways to improve it.
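
One possible direction, if the same offset has to be applied to every byte: skip the per-item wrapper and do the whole array in one pass with a 256-entry lookup table. This is only a sketch assuming Python 3; add_mod256 is a made-up name, and whether it actually helps should be measured on real data.

```python
import array

def add_mod256(arr, delta):
    """Return a new 'B' array with delta added to every element, modulo 256.

    Hypothetical helper: builds a translation table once, then lets
    bytes.translate() apply it to the whole buffer.
    """
    table = bytes((b + delta) & 255 for b in range(256))
    return array.array('B', arr.tobytes().translate(table))

a = array.array('B', [210, 46, 0, 255])
b = add_mod256(a, 46)
print(b)                  # array('B', [0, 92, 46, 45])
```

Note that this returns a new array and leaves the original untouched, unlike the wrapper, which modifies the array in place.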
>
>This is like many things in the Python library: they look 
>good at first sight but when you take a closer look at 
>them, you see, that...
... you have a great basis for solving almost any problem,
and all you need is some imagination to tailor them
to your specific requirements? ;-)

Regards,
Bengt Richter


