Raw strings as input from File?
Dave Angel
davea at ieee.org
Wed Dec 2 00:39:50 EST 2009
rzed wrote:
> utabintarbo <utabintarbo at gmail.com> wrote in
> news:adc6c455-5616-471a-8b39-d7fdad2179e4 at m33g2000vbi.googlegroups.com:
>
>
>> I have a log file with full Windows paths on a line. eg:
>> K:\A\B\C\10xx\somerandomfilename.ext->/a1/b1/c1/10xx
>> \somerandomfilename.ext ; t9999xx; 11/23/2009 15:00:16 ;
>> 1259006416
>>
>> As I try to pull in the line and process it, python changes the
>> "\10" to a "\x08". This is before I can do anything with it. Is
>> there a way to specify that incoming lines (say, when using
>> .readlines() ) should be treated as raw strings?
>>
>> TIA
>>
>
> Despite all the ragging you're getting, it is a pretty flakey thing
>
The OP blamed readlines(), which does *not* behave this way, so he
probably deserved what you call "ragging." Backslash escapes are
processed only in string literals, which live in source code, not in
data read from files.
In any case, there's a big difference between surprising (to you) and
flakey.
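To check the claim about reading from files, here is a minimal sketch. It assumes Python 3 and uses a temporary file as a stand-in for the OP's log; the path in it is only a shortened version of the one in the original post.

```python
import os
import tempfile

# A line shaped like the one in the log file. The raw literal keeps the
# backslashes exactly as typed, which is what the file on disk contains.
line = r"K:\A\B\C\10xx\somerandomfilename.ext" + "\n"

# Write it to a temporary file (a stand-in for the real log), then read
# it back with readlines().
fd, path = tempfile.mkstemp(suffix=".log")
try:
    with os.fdopen(fd, "w") as f:
        f.write(line)
    with open(path) as f:
        lines = f.readlines()
finally:
    os.remove(path)

# The backslash, '1' and '0' come back as three separate characters;
# no escape processing happened on the way in.
assert lines[0] == line
assert "\\10" in lines[0]
assert "\x08" not in lines[0]
print(repr(lines[0]))
```

If the "\x08" shows up anyway, the likely culprit is a path pasted into the source as an ordinary (non-raw) string literal somewhere else in the program.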
> that Python does in this context:
> (from a python shell)
>
> >>> x = '\1'
> >>> x
> '\x01'
>
> >>> x = '\10'
> >>> x
> '\x08'
>
> If you are pasting your string as a literal, then maybe it does the
> same. It still seems weird to me. I can accept that '\1' means x01,
> but \10 seems to be expanded to \010 and then translated from octal
> to get to x08. That's just strange. I'm sure it's documented
> somewhere, but it's not easy to search for.
>
>
Check the help for "escape Strings". It's documented (in version 2.6,
anyway) in a nice chart: a backslash followed by up to three octal
digits is interpreted as an octal character code. So '\10' is not
expanded to anything first; it is simply octal 10, which is 8. I don't
like it much either, but it's inherited from C, which has worked that
way for 30+ years.
Online, see
http://www.python.org/doc/2.6.4/reference/lexical_analysis.html, and
look in section 2.4.1 for the chart.
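A short sketch of the rule from that chart, written for Python 3 (the octal escapes behave the same way in 2.6). The variable name raw is just for illustration.

```python
# Octal escapes in ordinary string literals take one to three octal digits.
assert '\1' == '\x01'
assert '\10' == '\x08'      # octal 10 is decimal 8
assert '\010' == '\10'      # a leading zero changes nothing
assert '\70' == '8'         # octal 70 is decimal 56, the code for '8'

# In a path-like literal the escape swallows the digits:
# one control character followed by 'xx'.
assert len('\10xx') == 3

# A raw literal turns escape processing off, so the backslash survives.
raw = r'\10xx'
assert len(raw) == 5
assert raw == '\\10xx'
```

Raw literals only matter for text typed into source code; strings that come from a file never go through this step.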
> Oh, and this:
>
> >>> '\7'
> '\x07'
>
> >>> '\70'
> '8'
> ... is really odd.
>
>
Octal 70 is hex 38 (or decimal 56), which is the character '8'.
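The arithmetic can be checked from the interpreter; a small sketch, assuming Python 3 (the name code is only for illustration):

```python
# Parse '70' as a base-8 number: 7 * 8 + 0 = 56.
code = int('70', 8)
assert code == 56 == 0x38

# Character code 56 is the digit '8', which is what the literal '\70' yields.
assert chr(code) == '8'
assert '\70' == chr(code)

print(oct(code), hex(code), chr(code))
```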
DaveA