Overprinting

Fredrik Lundh effbot at telia.com
Sat Feb 5 18:14:47 EST 2000


Thierry Lalinne <thierry.lalinne at chello.be> wrote:
> I use an LPD daemon on my NT 4 box to receive spool jobs coming from an IBM
> AS/400.  I need to parse the incoming file and generate a report in RTF.
> Unfortunately, the AS/400 uses overprinting on some lines, I get CR without
> LF. What I would like to do is to use a little Python to get rid of the CR
> and merge the lines together.
>
> Example:
> Bla Bla Bla..........Bla Bla Bla<CR>............12345..........<CR><LF>
> Which after processing would give me:
> Bla Bla Bla...12345.....Bla Bla Bla<CR><LF>

here's one way to do it.  I can't think of anything more
efficient right now, but I'm sure Tim or Christian will come
up with something...

import array, string

# characters that may be overwritten
BLANKS = " ."

def fixup(line):
    # merge overlapping pieces
    pieces = string.split(line, "\r")
    if not pieces:
        return ""
    a = array.array("c", pieces[0])
    for s in pieces[1:]:
        n = len(a)
        if len(s) > n:
            a.fromstring(s[n:]) # tail; slice assignment would need an array, not a string
        # handle overlapping part
        n = min(n, len(s))
        for i in range(n):
            if s[i] not in BLANKS:
                a[i] = s[i]
    return a.tostring()

#
# try it out

text = "Bla Bla Bla..........Bla Bla Bla\r............12345..........\r\n"

for line in string.split(text, "\n"):
    print repr(fixup(line))
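
for anyone on a current Python 3, where the string module functions and
the "c" array typecode no longer exist, the same merge can be sketched
with a plain list of characters.  this is only a sketch under the same
assumption as above (blanks and dots never overwrite anything); the name
fixup3 is hypothetical and not part of the original post.

```python
# characters that may be overwritten (same assumption as BLANKS above)
BLANKS = " ."

def fixup3(line):
    # hypothetical Python 3 variant of fixup(); merges overprinted pieces
    pieces = line.split("\r")
    merged = list(pieces[0])
    for s in pieces[1:]:
        n = len(merged)
        # overlapping part: non-blank characters replace what is there
        for i in range(min(n, len(s))):
            if s[i] not in BLANKS:
                merged[i] = s[i]
        # tail: anything past the current length is appended unchanged
        merged.extend(s[n:])
    return "".join(merged)

# try it out
text = "Bla Bla Bla..........Bla Bla Bla\r............12345..........\r\n"

for line in text.split("\n"):
    print(repr(fixup3(line)))
```

a list is used here because Python 3 strings are immutable and the
per-character overwrite needs something mutable; a bytearray would be
a reasonable alternative if the spool file is read in binary mode.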

</F>




