[Tutor] code review please

Wed Dec 28 09:05:38 CET 2005

Eakin, W said unto the world upon 27/12/05 09:59 AM:
> Hello,
>     Although I've been coding in PHP and ASP and JavaScript for a couple of
> years now, I'm relatively new to Python. For learning exercises, I'm writing
> small Python programs that do limited things, but hopefully do them well.
> 
> The following program takes a text file, reads through it, and any word
> longer than four characters will have the internal letters scrambled, but
> the first and last letters of the word will remain unchanged. Here's what
> happened when I ran the program on a file called example.txt.
> 
> Before:
> This is a sample of text that has been scrambled, before and after.
> 
> After:
>  Tihs is a sapmle of txet taht has been sblrmcead, broefe and aetfr.
> 
> The code follows, so any comments and/or suggestions as to what I did right
> or wrong, or what could be done better will be appreciated.
> 
> thanks,
> William

Hi William,

I coded up an approach; no guarantees that it is anywhere near optimal :-)

I didn't bother with the file I/O portions. The code respects internal
punctuation in compound words and the like. It does not preserve extra
white-space: "A cat" and "A  cat" produce identical output.
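
To illustrate the white-space point, here is a minimal sketch, separate
from the program itself. The variable names are made up for the example.
str.split() with no argument treats any run of whitespace as a single
separator, so the original spacing is gone once the words are re-joined.

```python
# str.split() with no argument collapses runs of whitespace,
# so both inputs give the same list of words.
one_space = "A cat".split()
two_spaces = "A  cat".split()

# Re-joining with a single space therefore loses the extra space.
rejoined_one = ' '.join(one_space)
rejoined_two = ' '.join(two_spaces)
```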

Best,

Brian vdB

import random
from string import punctuation

tstring = '''
This is my test string for the scramble_words function. It contains lots
of punctuation marks like ')', and '?'--well, not lots: instead, enough!
Here's what happens when one occurs mid-word: punct#uation.'''

def scramble_words(a_string):
     '''returns a_string with all internal substrings of words randomized

     The scramble_words function respects punctuation in that a word is a
     maximal string with neither whitespace nor characters from punctuation.
     Each word is scrambled in the sense that the characters excepting the
     first and last character are randomized.
     '''
     output = []
     for sequence in a_string.split():
         chunks = punctuation_split(sequence)
         # appending the joined chunks prevents spurious spaces
         # around punctuation marks
         output.append(''.join([_scramble_word(x) for x in chunks]))
     output = ' '.join(output)
     return output

def punctuation_split(sequence):
     '''returns sequence split into chunks, each punctuation mark its own chunk

     Adjacent or trailing punctuation marks leave empty strings in the list.
     '''
     for mark in punctuation:
         sequence = sequence.replace(mark, ' %s ' % mark)
     return sequence.split(' ')

def _scramble_word(word):
     '''returns word with its internal substring randomized'''
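     # fewer than four characters means at most one internal character,
     # so there is nothing to shuffle
     if len(word) < 4:
         return word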
     middle = list(word[1:-1])
     random.shuffle(middle)
     return ''.join((word[0], ''.join(middle), word[-1]))

a = scramble_words(tstring)
print(a)
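
If preserving the original white-space matters, one possible variant (a
sketch only, not what the code above does) is to substitute on a regular
expression rather than split and re-join. The names WORD,
scramble_keep_whitespace and rng are made up for this example; a word is
still taken to be a maximal run of characters that are neither whitespace
nor in string.punctuation.

```python
import random
import re
from string import punctuation

# Hypothetical pattern, not part of the code above: a word is a maximal run
# of characters that are neither whitespace nor punctuation.
WORD = re.compile(r'[^\s%s]+' % re.escape(punctuation))

def scramble_keep_whitespace(text, rng=random):
    '''returns text with the interior of each word shuffled

    Whitespace and punctuation stay exactly where they were, because only
    the matched words are replaced. rng is anything with a shuffle method;
    it defaults to the random module.
    '''
    def _scramble(match):
        word = match.group(0)
        if len(word) < 4:
            return word
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + ''.join(middle) + word[-1]
    return WORD.sub(_scramble, text)
```

Passing a seeded random.Random instance as rng makes the output
repeatable, which is handy when testing.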