Problem with uudecode

Wed May 26 03:40:30 EDT 2004

Juho Saarikko <sorry at but.no.spam> wrote:

>I made a Python script which takes Usenet message bodies from a database,
>decodes uuencoded contents and inserts them as Large Objects into a
>PostgreSQL database. However, it appears that the last few bytes
>of uudecoded data are always mangled. Take a look at this hexdump output:
>
>Originals (decoded with Pan, each line is from a different file):
>000c2c0 e1bf 00ff 2541 a9e4 a724 d9ff
>0011a10 ff54 00d9
>00093e0 fb4f a80d ffd9 c200 ffef 00d9
>
>Decoded by the script:
>000c2c0 e1bf 00ff 2541 a9e4 a724 d0ff
>0011a10 ff54 00d8
>00093e0 fb4f a80d ffd9 c200 ffef 00d8
>
>As you can see, one of the last two bytes gets altered in all cases.

As others have pointed out, it's really the last byte that is getting
altered.

>      for k in range(n+1, message.ntuples):
>#        print "Decoding row " + str(k)
>        s = message.getvalue(k, 0)
>        if s[:3] == "end":
>          n = k + 1
>          break
>        try:
>          body.append(binascii.a2b_uu(s))
>        except:
>          try:
>            bytes = (((ord(s[0])-32) & 63) * 4 + 3) / 3
>            body.append(binascii.a2b_uu(s[:bytes]))
>          except:
>            print "Broken attachment in message " + str(id)
>            conn.query("ROLLBACK")
>            return

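Your computation of the number of bytes in the uuencoded string comes up
one short whenever the decoded length of the line is not a multiple of 3:
you're not accounting for the length byte.  Full 45-byte lines are not
affected, so normally only the last line of a file is hit.  That will have
exactly the effect you describe.  You lose the last encoded character, and
the low bits of the file's final byte decode as zero, which is how d9
turns into d8 or d0 in your dumps.  Change it to this: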

            bytes = (((ord(s[0])-32) & 63) * 4 + 3) / 3 + 1
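
To make the arithmetic concrete, here is a minimal sketch in Python 3
syntax (so // replaces the integer / of the script above).  The function
names are made up for illustration.  minimal_length is my own assumption,
not something from the script: it counts one length character plus
ceil(4n/3) data characters, and it differs from the corrected formula only
when n is a multiple of 3, where the corrected formula takes one character
more than strictly needed.

```python
import binascii

def decoded_count(line):
    # The first character of a uuencoded line carries the decoded byte count.
    return (line[0] - 32) & 63

def original_length(line):
    # Formula from the quoted script: no allowance for the length character.
    return (decoded_count(line) * 4 + 3) // 3

def corrected_length(line):
    # The same formula plus one for the length character.
    return original_length(line) + 1

def minimal_length(line):
    # Assumption for illustration: 1 length character + ceil(4n/3) data characters.
    return 1 + (decoded_count(line) * 4 + 2) // 3

# Three made-up payloads ending in ff d9, with 3, 2 and 1 bytes on the line.
for payload in (b"\x00\xff\xd9", b"\xff\xd9", b"\xd9"):
    line = binascii.b2a_uu(payload)
    short = binascii.a2b_uu(line[:original_length(line)])
    fixed = binascii.a2b_uu(line[:corrected_length(line)])
    print(len(payload), short.hex(), fixed.hex())
```

With the original formula the two-byte and one-byte payloads come back as
ffd0 and d8, the same d9 to d0 and d9 to d8 changes seen in the hexdumps,
while the three-byte payload survives intact.  a2b_uu does not complain
about the short line; it fills in the missing characters as zero bits.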

However, you should not need to wrap the first binascii.a2b_uu call with
try/except at all.  What is happening that causes the error in the first
place?  I suspect if you fix the root cause, you could eliminate the except
clause altogether.
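
One way to find the root cause is to catch only binascii.Error and look at
the message together with the offending line, instead of using a bare
except.  A rough sketch in Python 3 syntax follows; diagnose_uu_line is a
hypothetical helper, not part of your script, and the trailing "X" is just
a made-up example of junk at the end of a line.

```python
import binascii

def diagnose_uu_line(line):
    """Return (decoded, None) on success or (None, reason) on failure."""
    try:
        return binascii.a2b_uu(line), None
    except binascii.Error as exc:
        # Keep the decoder's message and the raw line so the cause is visible.
        return None, "%s: %r" % (exc, line)

good = binascii.b2a_uu(b"abc")
bad = good.rstrip(b"\n") + b"X"      # hypothetical junk after the data

print(diagnose_uu_line(good))
print(diagnose_uu_line(bad))
```

If the reason turns out to be junk after the encoded data, that would also
explain why truncating the line makes the second call succeed.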
-- 
- Tim Roberts, timr at probo.com
  Providenza & Boekelheide, Inc.