<div dir="ltr">Yes on #1 -- making the low-level functions more usable for edge cases by supporting bytes seems fine (as long as the support for strings, where it exists, is not compromised).<br><br>The status of pathlib is a little unclear to me -- is there a plan to eventually support bytes or not?<br>


<br>For #2 I think you should probably just work with the others you have mentioned.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Aug 23, 2014 at 9:44 PM, Nick Coghlan <span dir="ltr">&lt;<a href="mailto:ncoghlan@gmail.com" target="_blank">ncoghlan@gmail.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">At Guido's request, splitting out two specific questions from Serhiy's<br>

thread where I believe we could do with an explicit "yes or no" from<br>

him.<br>

<br>

1. Should we accept patches adding support for the direct use of bytes<br>

paths in lower level filesystem manipulation APIs? (i.e. everything<br>

that isn't pathlib)<br>

<br>

This was Serhiy's original question (due to some open issues [1,2]). I<br>

think the answer is yes, as we already do so in some cases. The<br>

"pathlib doesn't support binary paths" design decision distinguishes a<br>

high-level, platform-independent API from low-level, potentially<br>

platform-dependent APIs; it is not about disallowing bytes paths in<br>

general.<br>

<br>

[1] <a href="http://bugs.python.org/issue19997" target="_blank">http://bugs.python.org/issue19997</a><br>

[2] <a href="http://bugs.python.org/issue20797" target="_blank">http://bugs.python.org/issue20797</a><br>
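<br>
For illustration, a minimal sketch of what direct use of bytes paths looks like with calls that already exist in the os module (os.fsencode, os.fsdecode, os.listdir). The helper name list_names_as_bytes and the file name are hypothetical, chosen only for this example:<br>
<pre>
```python
import os
import tempfile


def list_names_as_bytes(directory):
    """List a directory via a bytes path; os.listdir then returns bytes names."""
    return sorted(os.listdir(os.fsencode(directory)))


def demo():
    # Hypothetical example: create one file and look at its name both ways.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "example.txt"), "w") as f:
            f.write("data")
        as_bytes = list_names_as_bytes(tmp)
        # os.fsdecode converts the bytes names back to str.
        as_text = [os.fsdecode(name) for name in as_bytes]
        return as_bytes, as_text
```
</pre>
Calling demo() returns the same directory entry once as bytes and once as str.<br>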

<br>

2. Should we add some additional helpers to the string module for<br>

dealing with surrogate escaped bytes and other techniques for<br>

smuggling arbitrary binary data as text?<br>

<br>

My proposal [3] is to add:<br>

<br>

* string.escaped_surrogates (constant with the 128 escaped code points)<br>

* string.clean(s): replaces surrogates with '\ufffd' or another<br>

specified code point<br>

* string.redecode(s, encoding): encodes a string back to bytes and<br>

then decodes it again using the specified encoding (the old encoding<br>

defaults to 'latin-1' to match the assumptions in WSGI)<br>

<br>

"s != string.clean(s)" would then serve as a check for "does this<br>

string contain any surrogate escaped bytes?"<br>

<br>

[3] <a href="http://bugs.python.org/issue18814#msg225791" target="_blank">http://bugs.python.org/issue18814#msg225791</a><br>
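<br>
A rough pure-Python sketch of the three proposed helpers, to make the intended semantics concrete. None of these names exist in the string module; the keyword names, the use of the surrogateescape error handler in redecode, and limiting clean to the 128 escaped code points are assumptions made for this sketch, not part of the proposal text:<br>
<pre>
```python
# Hypothetical sketch only: these helpers are NOT part of the stdlib.

# The 128 lone surrogates that the "surrogateescape" handler produces
# for undecodable bytes 0x80..0xFF (U+DC80..U+DCFF).
ESCAPED_SURROGATES = "".join(chr(cp) for cp in range(0xDC80, 0xDD00))
_ESCAPED = frozenset(ESCAPED_SURROGATES)


def clean(s, replacement="\ufffd"):
    """Replace every escaped surrogate in s with the replacement character."""
    return "".join(replacement if ch in _ESCAPED else ch for ch in s)


def redecode(s, encoding, old_encoding="latin-1", errors="surrogateescape"):
    """Encode s back to bytes with old_encoding, then decode with encoding.

    Assumption: surrogateescape is used in both directions so that
    smuggled bytes survive the round trip.
    """
    return s.encode(old_encoding, errors).decode(encoding, errors)


def has_escaped_bytes(s):
    """The check from the proposal: does s contain surrogate escaped bytes?"""
    return s != clean(s)
```
</pre>
For example, text that was wrongly decoded as latin-1 can be passed through redecode(s, "utf-8") to recover the intended string, and has_escaped_bytes flags strings produced by decoding with surrogateescape.<br>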

<br>

Regards,<br>

Nick.<br>

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Nick Coghlan   |   <a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>   |   Brisbane, Australia<br>

_______________________________________________<br>

Python-Dev mailing list<br>

<a href="mailto:Python-Dev@python.org">Python-Dev@python.org</a><br>


</font></span></blockquote></div><br><br clear="all"><br>-- <br>--Guido van Rossum (<a href="http://python.org/~guido">python.org/~guido</a>)

</div>