Looking for Form Feeds

John Machin sjmachin at lexicon.net
Tue Jan 25 01:37:39 CET 2005


Greg Lindstrom wrote:
> Hello-
> I have a file generated by an HP-9000 running Unix containing form feeds
> signified by ^M^L.  I am trying to scan for the linefeed to signal
> certain processing to be performed but can not get the regex to "see"
> it.  Suppose I read my input line into a variable named "input"
>
> The following does not seem to work...
> input = input_file.readline()

You are shadowing a builtin (input).

> if re.match('\f', input): print 'Found a formfeed!'
> else: print 'No linefeed!'

formfeed == not not linefeed????

>
> I also tried to create a ^M^L (typed in as <ctrl>Q M <ctrlQ> L) but that
> gives me a syntax error when I try to run the program (re does not like
> the control characters, I guess).  Is it possible for me to pull out the
> formfeeds in a straightforward manner?
>

For a start, resolve your confusion between formfeed and linefeed.

Formfeed makes your printer skip to the top of a new page (form),
without changing the column position. FF, '\f', ctrl-L, 0x0C.
Linefeed makes the printer skip to a new line, without changing the
column position. LF, '\n', ctrl-J, 0x0A.
There is also carriage return, which makes your typewriter return to
column 1, without moving to the next line. CR, '\r', ctrl-M, 0x0D.
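
If in doubt about which escape maps to which code, the interpreter will tell you. A minimal sketch, assuming Python 3 syntax; the dictionary names are made up for illustration:

```python
# Map each control character's usual abbreviation to its Python escape.
names = {"FF": "\f", "LF": "\n", "CR": "\r"}

# ord() gives the character code for each escape.
codes = {name: ord(ch) for name, ch in names.items()}

for name, code in codes.items():
    # Adding 64 to a control code gives the letter used in ctrl-X notation.
    print("%s  0x%02X  ctrl-%s" % (name, code, chr(code + 64)))
```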

Now you can probably guess why the writer of your report file is
emitting "\r\f". What we can't guess for you is where in your file
these "\r\f" occurrences are in relation to the newlines (i.e. '\n')
which Python is interpreting as line breaks. As others have pointed
out, (1) re.match works on the start of the string and (2) you probably
don't need to use re anyway. The solution may be as simple as: if
input_line[:2] == "\r\f":
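
To make points (1) and (2) concrete, here is a rough sketch in Python 3 syntax. The sample lines and helper names are invented for illustration, since we don't know where the "\r\f" pairs actually sit in your file:

```python
import re

def starts_with_cr_ff(line):
    """True if the line begins with carriage return + form feed."""
    return line[:2] == "\r\f"

def contains_ff(line):
    """True if a form feed appears anywhere in the line."""
    return "\f" in line

# Hypothetical lines: no FF, CR+FF at the start, CR+FF at the end.
sample = ["header\n", "\r\fPage 2\n", "total\r\f\n"]

page_starts = [starts_with_cr_ff(s) for s in sample]
anywhere = [contains_ff(s) for s in sample]

# re.match only tries position 0, which holds "\r" (or "h", "t") here,
# so it never sees the form feed; re.search scans the whole string.
match_hits = [re.match("\f", s) is not None for s in sample]
search_hits = [re.search("\f", s) is not None for s in sample]
```

One caveat if you run this against a real file under Python 3: text mode translates a lone "\r" into "\n" by default, so open the file with newline="\n" (or in binary mode) to keep the "\r\f" pairs intact.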

BTW, have you checked that there are no other control characters
embedded in the file, e.g. ESC (introducing an escape sequence), SI/SO
(change character set), BEL * 100 (Hey, Fred, the printout's finished),
HT, VT, BS (yeah, probably lots of that, but I mean BackSpace)?
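
A quick way to check is to count every control byte in the file. A minimal sketch, assuming Python 3 and data read in binary mode; the function name and the sample bytes are made up:

```python
from collections import Counter

def control_char_counts(data):
    """Count control bytes (below 0x20, plus DEL) in data, ignoring LF."""
    return Counter(
        b for b in data
        if (b < 0x20 or b == 0x7F) and b != 0x0A
    )

# Hypothetical sample: CR, FF, an ESC sequence and two BELs.
sample = b"line one\r\n\x0cpage two\x1b[1mbold\x07\x07\n"
counts = control_char_counts(sample)

for code, n in sorted(counts.items()):
    print("0x%02X x %d" % (code, n))
```

For a real file, something like control_char_counts(open(path, "rb").read()) would do, where path is whatever your report file is called.
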
HTH,
John



