[Python-Dev] Python3 "complexity"

Brett Cannon brett at python.org
Thu Jan 9 23:17:13 CET 2014


On Thu, Jan 9, 2014 at 5:00 PM, Chris Barker <chris.barker at noaa.gov> wrote:

> On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
>> > latin-1 guaranteed to work with any binary data, and round-trip
>> > accurately?
>>
>> Yes, it is.
>>
>> > and will surrogateescape work for arbitrary binary data?
>>
>> Yes, it will.
>>
>
> Then maybe this is really a documentation issue, after all.
>
> I know I learned something.
>
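
For anyone who wants to check the two answers above, here is a minimal sketch. It assumes Python 3 and uses every possible byte value as the sample data; the variable names are only for illustration.

```python
# Every possible byte value, so the check covers arbitrary binary data.
data = bytes(range(256))

# latin-1 maps each byte 0-255 to the code point with the same number,
# so decoding can never fail and encoding gives the same bytes back.
as_latin1 = data.decode('latin-1')
assert as_latin1.encode('latin-1') == data

# surrogateescape turns each undecodable byte into a lone surrogate
# (U+DC80-U+DCFF) and turns it back into the original byte on encode.
as_ascii = data.decode('ascii', errors='surrogateescape')
assert as_ascii.encode('ascii', errors='surrogateescape') == data
```

Note that the surrogateescape'd string only round-trips if you encode it with the same codec and error handler again.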

I think the other issue is that everyone is talking about keeping the data
from the file in a single object. If you slice it up into pieces and decode
the parts as necessary, this also solves the issue. So if you had an HTTP
header you could do::

  raw_header, body = data.split(b'\r\n\r\n', 1)
  header = raw_header.decode('ascii')  # Or whatever HTTP headers are encoded in.

Now that might not easily solve the issue of ASCII text interspersed with
binary data (such as Kristján's "phone number in the middle of stuff"
example), but it will deal with the problem. And if the numbers were
separated with clean markers then this would probably still work.
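
As a rough sketch of the "clean markers" case: the record layout and the MARKER delimiter below are made up for illustration and are not taken from any real format. It also assumes the marker never occurs inside the binary parts.

```python
# Hypothetical record: binary blob, marker, ASCII phone number, marker,
# binary blob. MARKER is an assumed delimiter, not from any real format.
MARKER = b'\x00\x00'
record = b'\xff\xfe\x01' + MARKER + b'555-1234' + MARKER + b'\x80\x81'

# Split at most twice so the trailing binary part stays in one piece.
prefix, raw_number, suffix = record.split(MARKER, 2)

# Only the piece known to be text is decoded; prefix and suffix stay bytes.
number = raw_number.decode('ascii')
```

Here only `number` ever becomes a str, so nothing has to guess an encoding for the binary pieces.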