[Tutor] Help! Character conversion from a rtf file.

bob gailer bgailer at gmail.com
Fri Jun 20 20:27:34 CEST 2008


Chien Nguyen wrote:
> Hi All,
> I am a newbie to Python. I just did some reading on the web
> and got some basic understanding of the language. I'd like
> to learn the language by writing some simple programs rather than
> keep reading books. My first program will convert certain Unicode
> characters (let's say UTF-8 encoded) in an RTF file, based on a mapping
> in another RTF file that is called an "RTF Control file". On each line
> of the Control file, there are 2 tokens separated by a TAB or a space.
> The first token contains the character to be converted from,
> and the second token contains the character to be converted to.
>
> The program will write to a new file that contains the new set of mapped
> characters.
> If a character from the original file is not found in the Control
> file, then the program
> just writes the same character to the new file.
> For example, the RTF Control file may contain the following lines.
>
> â     í
> ơ     ă
> ư     ổ
>
> The original RTF file may have something like
> tâc    mơm     thư    
>
> and will be converted to a new RTF file as follows.
> tíc     măm     thổ
>
> Before I start to go into the coding, I would like to get some advice
> from experienced users/mentors about a quick way to do it.

Quick - do you mean time to code, or execution time?

For each line in the control file, add an item to a dictionary, with the
old value as key and the new value as value.
For each line in the data file:
  For each character in the line:
    if it is in the dictionary, replace it with the corresponding value
  write the line to the output.
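
The outline above could be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions that are not in the original question: both files are treated as plain UTF-8 text (real RTF usually stores non-ASCII characters as escape sequences, which this does not handle), each token in the control file is a single character, and the function names load_mapping, convert_line and convert_file are made up for illustration.

```python
def load_mapping(control_lines):
    """Build {old_char: new_char} from lines holding two whitespace-separated tokens."""
    mapping = {}
    for line in control_lines:
        tokens = line.split()  # splits on TABs and spaces alike
        if len(tokens) != 2:
            continue  # skip blank or malformed lines
        old, new = tokens
        mapping[old] = new
    return mapping


def convert_line(line, mapping):
    """Replace each character found in the mapping; keep all others unchanged."""
    # Assumption: keys are single characters, so a per-character lookup is enough.
    return "".join(mapping.get(ch, ch) for ch in line)


def convert_file(control_path, source_path, output_path, encoding="utf-8"):
    """Read the control file, then copy source to output, mapping characters."""
    # Assumption: all three files are plain text in the given encoding.
    with open(control_path, encoding=encoding) as control:
        mapping = load_mapping(control)
    with open(source_path, encoding=encoding) as src, \
         open(output_path, "w", encoding=encoding) as dst:
        for line in src:
            dst.write(convert_line(line, mapping))
```

With the sample data from the question, convert_line("tâc", load_mapping(["â í"])) would give "tíc". Because each character is looked up once in the original text, a replacement character is never converted a second time, which matters if a character appears both as a "from" and as a "to" value.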

-- 
Bob Gailer
919-636-4239 Chapel Hill, NC


