[pypy-dev] binascii.py in pure python

Fri Nov 26 17:35:06 CET 2004

Christian Tismer wrote:

> Florian Bauer wrote:
>
>> Christian Tismer wrote:
>>
>>> holger krekel wrote:
>>>
>>>> Hi Florian,
>>>> [Florian Bauer Thu, Nov 25, 2004 at 04:53:33PM +0100]
>>>>
>>>>> Hi there,
>>>>>
>>>>> I checked the list of missing C modules on the wiki pages.
>>>>> I could contribute some code for binascii.py. 
>>>>
>>>> nice.
>>>>
>>>>> Some time ago I started porting the C module to Python.
>>>>> It's halfway done, but some functions are pretty usable and have 
>>>>> some test coverage.
>>>>> Right now, I don't have time to download pypy and play with it, so 
>>>>> I don't know if there are any issues with integrating my code. It 
>>>>> would be best if I could develop and test the module on CPython.
>>>>
>>>> that's just fine.  Actually it's faster <wink> to develop against
>>>> CPython and not let the current PyPy interpret your
>>>> application-level code.
>>>
>>> As an addition:
>>> I'm just busy figuring out how to make C modules which are implemented
>>> in Python easier to compile back to C.
>>> It turned out that there is only a little to change if your
>>> implementation does not use fancy features like generators.
>>> If you can assume that
>>> - all your globals are constant after initialization
>>> - you don't use ints or longs larger than machine words
>>> - methods are constants and not shadowed by instance vars
>>> - exceptions are raised only if you provide a try statement
>>> - no generators
>>> - no imports of modules which don't obey these rules
>>> - use __all__ to report the exports
>>>
>>> then this module is almost ready to become a builtin module.
>>> I just have to convert the exported objects in __all__
>>> and give them an application-python interface, again.
>>>
>>> ciao - chris
>>
>> This is pretty trivial in the case of binascii.py. The interface 
>> assumes strings, not iterables, so there's no need for fancy stuff.
>
> Good.
>
>> What I'm thinking about is whether I should use regular expressions 
>> or not. 
>
> I looked over binascii.c and found no real reason to use regexen.
> If I were to do it, I would probably take the C source and tweak it
> until it is Python.
>
That's what I'm doing :-)
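
A minimal sketch of what such a C-to-Python port might look like, written
to fit the restrictions Christian listed earlier (constant globals,
machine-word ints, no generators, exports in __all__). This is not the
actual binascii.py from this thread. The name HEX_DIGITS is an assumption
made for illustration, and the code targets Python 3 bytes, whereas code
from this period would have worked on str.

```python
# Hypothetical sketch, not the code discussed in this thread: a C-style
# port of b2a_hex, written for Python 3 bytes.

__all__ = ["b2a_hex"]          # exports are reported through __all__

# Module-level global that is never rebound after initialization.
HEX_DIGITS = b"0123456789abcdef"


def b2a_hex(data):
    """Return the hexadecimal representation of the bytes in data."""
    result = bytearray(len(data) * 2)   # preallocate the output buffer
    j = 0
    for i in range(len(data)):          # plain index loop, no generators
        c = data[i]                     # a byte value, fits a machine word
        result[j] = HEX_DIGITS[(c >> 4) & 0x0F]   # high nibble
        result[j + 1] = HEX_DIGITS[c & 0x0F]      # low nibble
        j += 2
    return bytes(result)
```

The loop mirrors the usual C structure (index in, index out, shift and
mask), so nothing in it depends on dynamic Python features.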

> Or did you plan to do a re implementation? :-))

Rather not.

>> I haven't played around with it yet, but I guess that at least for 
>> running under CPython re.sub would be faster than a state machine with 
>> the loop coded in Python. But in PyPy, maybe not. Any thoughts on 
>> this matter?
>
> Well, re.sub uses state machines as well, and re will eventually be
> implemented in Python as well. So there will not be that difference.
>
> If you use state machines, write simple code and just think you
> are coding in C. Your code will later be translated into C,
> and it will be simplified to use machine words as much as possible.

Ok. That's what my code looks like at the moment.
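
To make the two options concrete, here is a rough sketch that decodes
"=XX" hex escapes once with re.sub and once with a hand-coded, C-style
scanning loop. It is a simplified illustration only, far smaller than the
real a2b_qp, and the names decode_with_re, decode_with_loop and _hex_value
are hypothetical. It assumes Python 3 bytes.

```python
import re

# Hypothetical helper names; only "=XX" escapes are handled here.
_ESCAPE = re.compile(rb"=([0-9A-Fa-f]{2})")


def decode_with_re(data):
    """Variant 1: let re.sub find and replace every =XX escape."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


def _hex_value(c):
    """Return the value of the hex digit with byte code c, or -1."""
    if 48 <= c <= 57:        # '0'..'9'
        return c - 48
    if 65 <= c <= 70:        # 'A'..'F'
        return c - 55
    if 97 <= c <= 102:       # 'a'..'f'
        return c - 87
    return -1


def decode_with_loop(data):
    """Variant 2: explicit loop, written as one would write it in C."""
    out = bytearray()
    n = len(data)
    i = 0
    while i < n:
        c = data[i]
        if c == 61 and i + 2 < n:            # 61 is '='
            hi = _hex_value(data[i + 1])
            lo = _hex_value(data[i + 2])
            if hi >= 0 and lo >= 0:
                out.append(hi * 16 + lo)     # decoded byte
                i += 3
                continue
        out.append(c)                        # copy anything else unchanged
        i += 1
    return bytes(out)
```

Both variants give the same result on the same input, so which one is
faster depends only on how the interpreter runs them. The loop version
uses nothing beyond small ints and indexing, which is the style Christian
recommends for later translation to C.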

> Good luck with your thesis - chris
>
Thanks (also to Holger)!

Florian