[Python-Dev] Reading Python source file

Serhiy Storchaka storchaka at gmail.com
Tue Nov 17 10:40:37 EST 2015


On 17.11.15 05:00, Guido van Rossum wrote:
> If you free the memory used for the source buffer before starting code
> generation you should be good.

Thank you. The buffer is freed right after AST generation finishes.
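
To make the idea concrete, here is a rough Python-level sketch of the approach from the quoted message below: read the whole source into memory, decode it once, build the AST, and drop the source buffer before code generation. It is only an illustration, not the actual C tokenizer code. The helper name compile_source_bytes and the default file name are hypothetical; tokenize.detect_encoding(), ast.parse() and compile() are the standard library calls.

```python
import ast
import io
import tokenize


def compile_source_bytes(raw, filename="<example>"):
    """Compile Python source held entirely in memory (hypothetical helper)."""
    # With the whole file available, a null byte can be rejected up front.
    if b"\0" in raw:
        raise ValueError("source contains null bytes")
    # detect_encoding() looks at the BOM and at a coding cookie on the
    # first two lines; it only needs a readline callable over the bytes.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    text = raw.decode(encoding)
    tree = ast.parse(text, filename=filename)
    # Drop this function's references to the source buffers; only the
    # AST is needed for code generation.
    del raw, text
    return compile(tree, filename, "exec")
```

Because both of the first two lines are in memory at once, the encoding declaration can be checked before anything is decoded, whichever of the two lines it is on.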

> On Mon, Nov 16, 2015 at 5:53 PM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>> I'm working on rewriting the Python tokenizer (in particular the part that
>> reads and decodes the Python source file). The code is complicated.
>> Currently it handles the following cases:
>>
>> * Reading from a string in memory.
>> * Interactive reading from a file.
>> * Reading from a file:
>>    - Raw reading, ignoring the encoding, in the parser generator.
>>    - Raw reading of a UTF-8 encoded file.
>>    - Reading and recoding to UTF-8.
>>
>> The file is read line by line. This makes it hard to check the correctness
>> of the first line if the encoding is specified in the second line. It also
>> causes very hard problems with null bytes and with desynchronization of
>> buffered C and Python files. All these problems can be easily solved by
>> reading the whole Python source file into memory and then parsing it as a
>> string. This would allow dropping a large, complex, and buggy part of the
>> code.
>>
>> Are there disadvantages to this solution? As for memory consumption, the
>> source text itself will consume only a small part of the memory consumed by
>> the AST and other structures. As for performance, reading and decoding the
>> whole file can be faster than reading it line by line.
>>
>> [1] http://bugs.python.org/issue25643



