[Python-Dev] Reading Python source file
M.-A. Lemburg
mal at egenix.com
Tue Nov 17 04:59:06 EST 2015
On 17.11.2015 02:53, Serhiy Storchaka wrote:
> I'm working on rewriting the Python tokenizer (in particular the part that reads and decodes a Python
> source file). The code is complicated. Currently it has to handle the following cases:
>
> * Reading from a string in memory.
> * Interactive reading from a file.
> * Reading from a file:
>   - Raw reading that ignores the encoding (in the parser generator).
>   - Raw reading of a UTF-8 encoded file.
>   - Reading and recoding to UTF-8.
>
> The file is read line by line. This makes it hard to check the correctness of the first line if the
> encoding is specified in the second line. It also causes very hard problems with null bytes and with
> buffered C and Python files getting out of sync. All these problems can be easily solved by reading the
> whole Python source file into memory and then parsing it as a string. This would allow dropping a large,
> complex and buggy part of the code.
>
> Are there disadvantages to this solution? As for memory consumption, the source text itself will
> consume only a small part of the memory consumed by the AST and other structures. As for performance,
> reading and decoding the whole file can be faster than doing so line by line.
A problem with this approach is that you can no
longer fail early and detect indentation errors et al. while
parsing the data (which may well come from a pipe).
Another related problem is that you have to wait for the full
input data before you can start compiling the code.
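
To illustrate the fail-early point, here is a rough sketch using the
pure-Python tokenize module (not the C tokenizer under discussion).
The helper tokenize_until_error() and the generator piped_source() are
made up for this example; the generator merely stands in for a slow pipe.
The tokenizer pulls lines on demand, so the bad dedent is reported long
before the tail of the input has been read:

```python
import tokenize


def tokenize_until_error(lines):
    """Tokenize lazily; return (lines_consumed, error_or_None)."""
    consumed = 0
    source = iter(lines)

    def readline():
        nonlocal consumed
        line = next(source, "")  # "" signals end of input to tokenize
        if line:
            consumed += 1
        return line

    try:
        for _ in tokenize.generate_tokens(readline):
            pass
    except (SyntaxError, tokenize.TokenError) as exc:
        # IndentationError is a subclass of SyntaxError
        return consumed, exc
    return consumed, None


def piped_source():
    """Hypothetical stand-in for a pipe: early error, long tail."""
    yield "if True:\n"
    yield "        x = 1\n"
    yield "    y = 2\n"  # dedent matches no outer indentation level
    for i in range(10_000):
        yield f"z{i} = {i}\n"


consumed, error = tokenize_until_error(piped_source())
print(consumed, type(error).__name__)
```

With the read-everything approach, the same error can only be reported
after all of the input has arrived.
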
I don't think these situations are all that common, though,
so reading in the full source code before compiling it
sounds like a reasonable approach.
We use the same simplification in eGenix PyRun's emulation of
the Python command line interface and it has so far not
caused any problems.
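
For reference, a minimal sketch of the read-everything approach at the
Python level. This is not the PyRun code and not the proposed C
implementation; load_source() is a hypothetical helper, and rejecting
null bytes up front is just one possible policy. It relies on
tokenize.detect_encoding(), which looks at the BOM and at most the first
two lines, so a coding cookie on line 2 is found before anything is
decoded:

```python
import io
import tokenize


def load_source(data: bytes) -> str:
    """Decode a complete Python source file that is already in memory."""
    if b"\x00" in data:
        # Assumed policy for this sketch: refuse null bytes outright.
        raise ValueError("source contains null bytes")
    # detect_encoding() reads at most two lines via the readline callable.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    return data.decode(encoding)


# Coding cookie on the second line, non-ASCII byte on the third.
raw = (
    "#!/usr/bin/env python\n"
    "# -*- coding: latin-1 -*-\n"
    "name = 'caf\xe9'\n"
).encode("latin-1")

code = compile(load_source(raw), "<memory>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["name"])
```

Since the whole buffer is available before decoding starts, there is no
buffered file position to keep in sync while the encoding is determined.
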
> [1] http://bugs.python.org/issue25643
--
Marc-Andre Lemburg
eGenix.com
http://www.malemburg.com/