First two bytes of 'stdout' are lost

Thomas Passin list1 at tompassin.net
Thu Apr 11 09:19:09 EDT 2024


On 4/11/2024 8:42 AM, Olivier B. via Python-list wrote:
> I am trying to use StringIO to capture stdout, in code that looks like this:
> 
> import sys
> from io import StringIO
> old_stdout = sys.stdout
> sys.stdout = mystdout = StringIO()
> print( "patate")
> mystdout.seek(0)
> sys.stdout = old_stdout
> print(mystdout.read())
> 
> Well, it is not exactly like this, since this works properly.
> 
> This code is actually run from C++ using the C Python API.
> This worked quite well, so the code was right at some point. But now,
> two things changed:
>   - Now using python 3.11.7 instead of 3.7.12
>   - Now using only the python limited C API
> 
> And it seems that now, mystdout.read() always misses the first two
> characters that have been written to stdout.
> 
> My first idea was that a BOM was being improperly truncated at some
> point, but I am manipulating UTF-8, so the BOM would be 3 bytes,
> not 2.
> 
> I ruled out wrong C++ code to extract the string from the python
> variable, since running a python print of the content of mystdout in
> the real stdout also misses the two first characters.
> 
> Hopefully someone has a clue on what would have changed in Python for
> this to stop working compared to python 3.7?

I've not used the C API, so just for fun I asked ChatGPT about this and 
it suggested that a flush after writing to StringIO might do it.  It 
suggested using a custom class for this purpose:

from io import StringIO

class MyStringIO(StringIO):
    def write(self, s):
        # Override write() to flush after every write.
        count = super().write(s)
        self.flush()
        # write() is expected to return the number of characters written.
        return count

You would use it like this:

sys.stdout = mystdout = MyStringIO()

I haven't tested it but it seems reasonable, although I would have 
naively expected to lose bytes from the end, not the beginning.
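
For comparison, here is a minimal pure-Python sketch of the same capture using contextlib.redirect_stdout and getvalue(). getvalue() returns the whole buffer regardless of the current stream position, so no seek(0) is needed. The helper name capture_stdout is made up for illustration, and this does not go through the C API at all, so at best it helps confirm that the Python side on its own captures every character.

```python
import contextlib
import io


def capture_stdout(func, *args, **kwargs):
    """Run func and return everything it wrote to sys.stdout as a string."""
    buffer = io.StringIO()
    # redirect_stdout swaps sys.stdout for the buffer and restores it on exit,
    # even if func raises.
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    # getvalue() returns the full contents, independent of the stream position.
    return buffer.getvalue()


captured = capture_stdout(print, "patate")
print(repr(captured))
```

If the same text comes back complete here but truncated when driven from C++, that would point toward the embedding code rather than StringIO itself.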

