<html><head><style type="text/css" media="screen">Body{font-family: Verdana;font-size:.75em;}h4{font-size:.9em;}a{color: #3a62a6;}.digest .toc {margin-bottom: 15px; padding-bottom:8px; border-bottom: 1px solid #ccc;}.digest .tocItem {margin-bottom: 15px;}.tocItem a{color:#000;text-decoration: none;}.tocItem a:hover{color: #3a62a6;text-decoration: underline;}.topic{padding-bottom: 8px;margin-bottom: 20px; border-bottom: 1px solid #ccc;}.topicHeader{margin-bottom:10px;}.topicTitle{font-weight: bold;}.replies p{margin:0;padding:0;}.replies hr{width: 15%;text-align: left;margin: 0 auto 5px 0;border: none 0;border-top: 1px solid #ccc;height: 1px;}.reply{margin-bottom: 6px;padding-bottom: 4px;}.anchorMarker{color: #3a62a6;}.footer{color: gray;}</style></head><body><div class="digest"><p>Hi ironpython,</p><p>Here's your Daily Digest of new issues for project "<a href="http://ironpython.codeplex.com/">IronPython</a>".</p><p>In today's digest:</p><h4>ISSUES</h4><div class="toc"><div class="tocItem"><a href="#toc_issue_1">1. <span class="tocTitle">[New issue] reading a file using codecs can fail</span> <span class="anchorMarker">↓</span></a></div><div class="tocItem"><a href="#toc_issue_2">2. <span class="tocTitle">[New comment] reading a file using codecs can fail</span> <span class="anchorMarker">↓</span></a></div></div><h4>ISSUES</h4><div class="topic"><a name="toc_issue_1"></a><div class="topicHeader"><span class="topicTitle">1. [New issue] reading a file using codecs can fail</span> <a href="http://ironpython.codeplex.com/workitem/34951">view online</a></div><p>User paweljasinski has proposed the issue:</p><p>"the following will fail given attached file:<br />
<pre><code>import codecs
lines = []
with codecs.open("text-utf8-with-bom.txt", encoding="utf-8-sig") as file_obj:
    for line in file_obj:  # fails here
        lines.append(line)</code></pre>
exception:<br />
<pre><code>Traceback (most recent call last):
File "C:\Program Files (x86)\IronPython 2.7\Lib\encodings\utf_8_sig.py", line 100, in decode
File "decode-bug.py", line 5, in &lt;module&gt;
File "C:\Program Files (x86)\IronPython 2.7\Lib\codecs.py", line 684, in next
File "C:\Program Files (x86)\IronPython 2.7\Lib\codecs.py", line 615, in next
File "C:\Program Files (x86)\IronPython 2.7\Lib\codecs.py", line 530, in readline
File "C:\Program Files (x86)\IronPython 2.7\Lib\codecs.py", line 477, in read
UnicodeEncodeError: ('unknown', '\x00', 0, 1, 'failed to decode bytes at index 65')</code></pre>
It does not fail on Linux/CPython.<br />
Removing one character from the first line of the text-utf8-with-bom.txt file makes it work."</p></div><div class="topic"><a name="toc_issue_2"></a><div class="topicHeader"><span class="topicTitle">2. [New comment] reading a file using codecs can fail</span> <a href="http://ironpython.codeplex.com/workitem/34951">view online</a></div><p>User paweljasinski has commented on the issue:</p><p>"a better test:<br />
<pre><code>import codecs

f=open("text-utf8-with-bom.txt", "rb")
b=f.read()
codecs.utf_8_decode(b)

# remove last character of the file (0x0d)
b=b[:-1]
print codecs.utf_8_decode(b)[1]

# remove last character of euro code (0x0c)
b=b[:-1]
print codecs.utf_8_decode(b)[1]</code></pre>
produces:<br />
<pre><code>70
Traceback (most recent call last):
  File "d2_test.py", line 13, in &lt;module&gt;
UnicodeEncodeError: ('unknown', '\x00', 0, 1, 'failed to decode bytes at index 65')</code></pre>
whereas on Linux/CPython it is:<br />
<pre><code>73
70</code></pre>"</p></div><div class="footer"><p>You are receiving this email because you subscribed to notifications on CodePlex.</p><p>To report a bug, request a feature, or add a comment, visit <a href="http://ironpython.codeplex.com/workitem/list/basic">IronPython Issue Tracker</a>. You can <a href="http://ironpython.codeplex.com/subscriptions/workitem/project/edit">unsubscribe or change your issue notification settings</a> on CodePlex.com.</p></div></div></body></html>