newbie raw text question

Ian Sparks Ian.Sparks at
Tue Feb 4 15:44:26 CET 2003

Thanks for the reply, Dennis. Your breakdown of the meaning of the RTF codes is pretty much spot on. However, I'm still not "getting it". You say:

What escaped characters? The \ is a tag introducer (for lack of a 
better word) and is part of the actual data. "\rtf1" is NOT <cr>tf1. 

So here's a simple command-line test :

>>> print "\rtf1"

>>> print r"\rtf1"

Looks to me like "\rtf1" *is* <cr>tf1 unless you define the string as a raw string, in which case it can contain the "\" character.
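A minimal sketch of the distinction, written with print() so that it runs on Python 3 (the session above is Python 2). The names `cooked` and `raw` are made up for illustration. The point it tries to show is that escape processing happens only when the interpreter parses a string literal in source code, so the two literals produce two different string objects:

```python
# In a normal literal, the two characters \r are parsed as ONE
# carriage-return character.
cooked = "\rtf1"

# The r prefix tells the parser to keep the backslash as ordinary data.
raw = r"\rtf1"

print(len(cooked))       # 4 characters: CR, t, f, 1
print(len(raw))          # 5 characters: \, r, t, f, 1
print(repr(cooked))      # '\rtf1'   (repr shows the CR as an escape)
print(repr(raw))         # '\\rtf1'  (repr doubles the real backslash)
print(raw == "\\rtf1")   # True: the same string without the r prefix
```

Once the literal has been parsed there is no "raw" flag left on the object; both are plain strings that simply hold different characters.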

This is all very well for strings you define at the command line, but what if a variable "x" contains "\rtf1" (NOT a raw string)? How can you deal with it then?

>>> print x

>>> print rx   #attempt to turn x into a raw string for printing.
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
NameError: name 'rx' is not defined

How can I print x as though it were a raw string? Like I said, it's probably pretty obvious, I just don't "get it".
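A hedged sketch of what probably happens with data that does not come from a literal. Here `x` is a hypothetical stand-in for the value fetched from the database, built with chr(92) (the backslash) so that no literal escape rules are involved. The regex pattern is an assumption about what a simple RTF control word looks like, not a complete RTF parser:

```python
import re

# Hypothetical stand-in for the database value: assembled at runtime,
# so every backslash in it is a real backslash character.
backslash = chr(92)
x = backslash + "rtf1" + backslash + "ansi Some text"

print(x)         # \rtf1\ansi Some text
print(repr(x))   # '\\rtf1\\ansi Some text'

# Only the PATTERN is written as a raw literal; x needs no conversion.
# Assumed shape of a control word: backslash, lowercase letters,
# optional numeric parameter, optional single delimiting space.
control_word = re.compile(r"\\[a-z]+-?\d* ?")
stripped = control_word.sub("", x)
print(stripped)  # Some text
```

If instead the value was typed as a non-raw literal, the carriage return is already in the string and repr(x) is the way to see it; there is nothing to convert back in general.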

-----Original Message-----
From: Dennis Lee Bieber [mailto:wlfraed at]
Sent: Monday, February 03, 2003 11:33 PM
To: python-list at
Subject: Re: newbie raw text question

Ian Sparks fed this fish to the penguins on Monday 03 February 2003 
12:11 pm:

> I'm confused about this one. I'm reading some RTF formatted data from
> a database. The resulting string is :
> {\rtf1\ansi\ansicpg1252\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans
> {Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\fswiss Arial;}{\f3\fswiss
> {Arial;}} \colortbl\red0\green0\blue0;}
> \deflang1033\pard\plain\f3\fs16 Some text
> }
> obviously this is chock-full of escaped characters. I need to strip
> the RTF codes and all my regular expressions are expecting raw strings
> but I don't see a way of converting an escaped string to a raw string
> to use in the regex.
        What escaped characters? The \ is a tag introducer (for lack of a 
better word) and is part of the actual data. "\rtf1" is NOT <cr>tf1. 
What I see in your sample (and I've not studied RTF) is:

RTF version 1 (hypothetical this)
Codepage 1252
define font 0 (guessing) define tab 720 decipoints (1inch)(guessing, 
might be centipoints/0.1inch)
        font table
                font 0 "swiss" font (sans serif) is MS Sans Serif
                font 1 "roman" font (serif) is character set 2 Symbol
                font 2 "swiss" font is Arial
                font 3 "swiss" font is Arial
        color table
                red 0
                green 0
                blue 0
define language 1033
plain (not bold or italic)
use font 3
font size 16
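The breakdown above can be reproduced mechanically. The following is only a sketch: the pattern and the names `header` and `tokens` are illustrative assumptions, it handles plain control words only (no groups, control symbols or hex escapes), and it attaches no meaning to the words it finds:

```python
import re

# First part of the sample, written as a raw literal so the backslashes
# stay in the data exactly as they would arrive from the database.
header = r"{\rtf1\ansi\ansicpg1252\deff0\deftab720"

# Assumed shape: backslash, lowercase letters, optional numeric parameter.
pairs = re.findall(r"\\([a-z]+)(-?\d+)?", header)

# findall yields '' for a missing parameter; turn that into None.
tokens = [(word, int(param) if param else None) for word, param in pairs]

for word, param in tokens:
    print(word, param)
# rtf 1
# ansi None
# ansicpg 1252
# deff 0
# deftab 720
```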

> There must be some way out of here...

