Snowball to Python compiler

Stefan Behnel stefan_ml at behnel.de
Fri Apr 22 03:50:04 EDT 2011


Terry Reedy, 22.04.2011 05:48:
> On 4/21/2011 8:25 PM, Paul Rubin wrote:
>> Matt Chaput writes:
>>> I'm looking for some code that will take a Snowball program and
>>> compile it into a Python script. Or, less ideally, a Snowball
>>> interpreter written in Python.
>>>
>>> (http://snowball.tartarus.org/)
>>>
>>> Anyone heard of such a thing?
>>
>> I never saw snowball before, it looks kind of interesting, and it
>> looks like it already has a way to compile to C. If you're using
>> it for IR on any scale, you're surely much better off using the C
>> routines with a C API wrapper,
>
> If the C routines are in a shared library, you should be able to write the
> interface in Python with ctypes.

Since it appears that the code has to get compiled anyway, Cython is likely 
a better option, as it makes it easier to write a fast and Pythonic wrapper.
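
For comparison, the ctypes route mentioned above might look roughly like the sketch below. It assumes the compiled Snowball routines are exposed through the libstemmer C API (sb_stemmer_new, sb_stemmer_stem, sb_stemmer_length, sb_stemmer_delete) in a shared library that can be found under the name "stemmer". Neither assumption comes from this thread, and the helper names load_stemmer_lib and stem are made up for illustration.

```python
import ctypes
import ctypes.util


def load_stemmer_lib(name="stemmer"):
    """Load the shared library and declare the C signatures.

    Returns None if no library of that name can be found.
    The library name is an assumption, adjust it for your system.
    """
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    lib = ctypes.CDLL(path)
    # struct sb_stemmer *sb_stemmer_new(const char *algorithm, const char *charenc)
    lib.sb_stemmer_new.restype = ctypes.c_void_p
    lib.sb_stemmer_new.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    # const sb_symbol *sb_stemmer_stem(struct sb_stemmer *, const sb_symbol *, int)
    lib.sb_stemmer_stem.restype = ctypes.c_void_p
    lib.sb_stemmer_stem.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
    # int sb_stemmer_length(struct sb_stemmer *)
    lib.sb_stemmer_length.restype = ctypes.c_int
    lib.sb_stemmer_length.argtypes = [ctypes.c_void_p]
    # void sb_stemmer_delete(struct sb_stemmer *)
    lib.sb_stemmer_delete.restype = None
    lib.sb_stemmer_delete.argtypes = [ctypes.c_void_p]
    return lib


def stem(lib, word, algorithm="english"):
    """Stem a single word, going through an intermediate UTF-8 encoding step."""
    handle = lib.sb_stemmer_new(algorithm.encode("ascii"), b"UTF_8")
    if not handle:
        raise ValueError("unknown algorithm: %r" % (algorithm,))
    try:
        data = word.encode("utf-8")
        result = lib.sb_stemmer_stem(handle, data, len(data))
        if not result:
            raise MemoryError("sb_stemmer_stem failed")
        # The result buffer belongs to the stemmer, so copy it out
        # before the stemmer is deleted.
        length = lib.sb_stemmer_length(handle)
        return ctypes.string_at(result, length).decode("utf-8")
    finally:
        lib.sb_stemmer_delete(handle)
```

Every call pays for an encode and a decode plus the ctypes call overhead, which is the part a compiled Cython wrapper could avoid.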

From a quick look, Snowball also has a "-widechar" option that could allow
interfacing directly with Python's Unicode strings in 16-bit Unicode builds
(but not 32-bit builds!). That would provide for really fast wrappers that
do not even need an intermediate encoding step. And PEP 393 would
eventually make it possible to include both a UTF-8 and a 16-bit version of
the (prefixed) Snowball code and to choose between them depending on the
internal layout of the processed string, with the obvious fallback to UTF-8
encoding only for strings that really exceed the lower 16-bit Unicode range.
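
To make the dispatch idea concrete, here is a rough pure-Python sketch of the decision only. Python code cannot inspect the internal string layout directly, so the sketch approximates it with the highest code point in the string. The function name pick_variant and the three labels are invented for illustration. A real wrapper would make this choice in C or Cython, and would also have to decide what to do with Latin-1 strings, which are lumped in with the 16-bit case here.

```python
def pick_variant(text):
    """Say which (hypothetical) compiled Snowball variant would handle text.

    "utf8"         - pure ASCII, the stored bytes are already valid UTF-8
    "widechar"     - every character fits into one 16-bit unit
    "utf8-encoded" - characters beyond the 16-bit range, encode to UTF-8 first
    """
    if not text:
        return "utf8"
    highest = max(map(ord, text))
    if highest < 0x80:
        return "utf8"
    if highest <= 0xFFFF:
        # Simplification: Latin-1 strings would be stored in 8-bit units
        # and would need widening (or encoding) before use.
        return "widechar"
    return "utf8-encoded"
```

With this, pick_variant("running") selects the UTF-8 code, a Cyrillic word selects the 16-bit code, and only a string containing a character above U+FFFF falls back to the encoding step.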

That sounds like a really nice project.

Stefan