Refactor/Rewrite Perl code in Python

Mon Jul 25 11:00:54 EDT 2011

Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:

> On Sun, Jul 24, 2011 at 7:29 PM, Shashwat Anand <anand.shashwat at gmail.com>
> wrote:
>
>> How do I start ?
>> The idea is to rewrite module by module.
>> But how to make sure code doesn't break ?
>
> By testing it.
>
> Read up on "test driven development".
>
> At this point, you have this:
>
> Perl modules: A, B, C, D
> Python modules: none
> Python tests: none
>
> Now, before you rewrite each Perl module in Python, first write a good,
> comprehensive test suite in Python for that module. You need to have tests
> for each function and method. Test that they do the right thing for both
> good data and bad data. If you have functional requirements for the Perl
> modules, use that as your reference, otherwise use the Perl code as the
> reference.
>
> For example, this might be a basic test suite for the len() built-in
> function:
>
>
> for empty in ([], {}, (), set([]), ""):
>     if len(empty) != 0:
>         raise AssertionError('empty object gives non-zero len')
>
> for n in range(1, 5):
>     if len("x"*n) != n:
>         raise AssertionError('failure for string')
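>     for kind in (list, tuple, set):
>         # Use n distinct items: a set built from [None]*n has length 1.
>         obj = kind(range(n))
>         if len(obj) != n:
>             # Wrap obj in a 1-tuple so a tuple obj formats correctly.
>             raise AssertionError('failure for %r' % (obj,))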
>
> if len({'a': 1, 'b': None, 42: 'spam'}) != 3:
>     raise AssertionError('failure for dict')
>
> for bad_obj in (23, None, object(), 165.0, True):
>     try:
>         len(bad_obj)
>     except TypeError:
>         # Test passes!
>         pass
>     else:
>         # No exception means len() fails!
>         raise AssertionError('failed test')
>
>
>
> Multiply that by *every* function and method in the module, and you have a
> moderately good test suite for module A.
>
> (You may want to learn about the unittest module, which will help.)
>
> Now you have:
>
> Perl modules: A, B, C, D
> Python modules: none
> Python tests: test_A
>
> Now rewrite Perl module A in Python, and test it against the test suite. If it
> fails, fix the failures. Repeat until it passes, and you have:
>
> Perl modules: A, B, C, D
> Python modules: A
> Python tests: test_A
>
> Now you can be confident that Python A does everything that Perl A does.
> Possibly *better* than the Perl module, which probably has many hidden
> bugs if it never had a test suite of its own.
>
> Continue in this way with the rest of the modules. At the end, you will have
> a full test suite against the entire collection of modules.

A very sane approach.
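
For what it's worth, here is a rough sketch of how the len() checks quoted
above might look once moved into the unittest module. The class and method
names are my own choice, not anything from the original post, and the sized
containers are built from distinct items so that the set case has the
expected length.

```python
import unittest


class TestLen(unittest.TestCase):
    """The len() checks from the quoted example, as unittest test methods."""

    def test_empty_containers(self):
        for empty in ([], {}, (), set(), ""):
            self.assertEqual(len(empty), 0)

    def test_sized_containers(self):
        for n in range(1, 5):
            self.assertEqual(len("x" * n), n)
            for kind in (list, tuple, set):
                # range(n) yields n distinct items, so a set keeps all of them.
                self.assertEqual(len(kind(range(n))), n)

    def test_dict(self):
        self.assertEqual(len({'a': 1, 'b': None, 42: 'spam'}), 3)

    def test_unsized_objects_raise(self):
        for bad_obj in (23, None, object(), 165.0, True):
            # The test passes only if len() raises TypeError.
            with self.assertRaises(TypeError):
                len(bad_obj)
```

Saved as, say, test_len.py (a hypothetical file name), this can be run with
"python -m unittest test_len", which reports each failing check separately
instead of stopping at the first one.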

I currently support a large legacy application written in C++ by
extending it with Python.  We managed to do this by wrapping the legacy
application with a Python interface using the Boost::Python libraries.
It has allowed us to embrace and extend the legacy code and replace bits
of it one piece at a time while keeping the whole application running.
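
The piece-by-piece replacement can be sketched in plain Python, leaving
Boost::Python out of the picture entirely. This is only an illustration of
the idea, not our actual code; Facade, LegacyApp, parse and render are all
made-up names.

```python
class Facade:
    """Route each operation to the legacy or the rewritten implementation."""

    def __init__(self, legacy):
        self._legacy = legacy      # object exposing the old behaviour
        self._replacements = {}    # operation name -> new Python callable

    def replace(self, name, func):
        """Switch one operation over to its new implementation."""
        self._replacements[name] = func

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. for operations.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._replacements:
            return self._replacements[name]
        return getattr(self._legacy, name)


class LegacyApp:
    """Hypothetical stand-in for the wrapped legacy code."""

    def parse(self, text):
        return text.split(",")

    def render(self, items):
        return ";".join(items)


app = Facade(LegacyApp())
# Replace only render(); parse() is still served by the legacy object.
app.replace("render", lambda items: " | ".join(items))
print(app.render(app.parse("a,b,c")))
```

Callers only ever talk to the facade, so each operation can be switched
over (or switched back) without touching the rest of the application.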

Now I am not aware of any Python bindings to the Perl interpreter, but
if there is some FFI library that makes it possible, this approach might
be viable for your project.
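
Short of a real binding, a cruder route is to treat the legacy program as a
black box: run it as a subprocess and compare its output with the new
Python code. Note that this is a different technique from the in-process
wrapping described above. The sketch below assumes the legacy code can be
driven from the command line through stdin and stdout. The function names
are invented, and a small Python one-liner stands in for the Perl script so
that the example runs anywhere; in practice argv would be something like
["perl", "A.pl"] (a hypothetical script name).

```python
import subprocess
import sys


def run_reference(argv, input_text=""):
    """Run the legacy command and return its standard output as text."""
    result = subprocess.run(
        argv,
        input=input_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


def check_against_reference(argv, new_func, inputs):
    """Return (input, expected, actual) for every input where they differ."""
    mismatches = []
    for text in inputs:
        expected = run_reference(argv, text)
        actual = new_func(text)
        if actual != expected:
            mismatches.append((text, expected, actual))
    return mismatches


# Stand-in for the legacy script: upper-cases whatever arrives on stdin.
STAND_IN = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

# An empty list means the new function agrees with the reference.
print(check_against_reference(STAND_IN, str.upper, ["spam", "Eggs\n"]))
```

Starting a process per input is slow, so this suits a modest set of
recorded inputs better than a large test suite.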

The trade-off of this approach is the extra complexity of maintaining
another interface in your code.  The benefit is that you can avoid
re-writing a large amount of code up front.  This works particularly
well when the legacy system has a rather large user base and you don't
have a robust test suite for the legacy code.

One book I found particularly useful is Michael Feathers' "Working
Effectively with Legacy Code":

http://www.amazon.ca/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

hth