Problem with uudecode
Tim Roberts
timr at probo.com
Wed May 26 03:40:30 EDT 2004
Juho Saarikko <sorry at but.no.spam> wrote:
>I made a Python script which takes Usenet message bodies from a database,
>decodes uuencoded contents and inserts them as Large Objects into a
>PostgreSQL database. However, it appears that the last few bytes
>of uudecoded data are always mangled. Take a look at this hexdump output:
>
>Originals (decoded with Pan, each line is from a different file):
>000c2c0 e1bf 00ff 2541 a9e4 a724 d9ff
>0011a10 ff54 00d9
>00093e0 fb4f a80d ffd9 c200 ffef 00d9
>
>Decoded by the script:
>000c2c0 e1bf 00ff 2541 a9e4 a724 d0ff
>0011a10 ff54 00d8
>00093e0 fb4f a80d ffd9 c200 ffef 00d8
>
>As you can see, one of the last two bytes gets altered in all cases.

As others have pointed out, it's really the last byte that is getting
altered.

> for k in range(n+1, message.ntuples):
>#    print "Decoding row " + str(k)
>     s = message.getvalue(k, 0)
>     if s[:3] == "end":
>         n = k + 1
>         break
>     try:
>         body.append(binascii.a2b_uu(s))
>     except:
>         try:
>             bytes = (((ord(s[0])-32) & 63) * 4 + 3) / 3
>             body.append(binascii.a2b_uu(s[:bytes]))
>         except:
>             print "Broken attachment in message " + str(id)
>             conn.query("ROLLBACK")
>             return

Your computation of the number of characters to keep from the uuencoded
line will come up one short: you're not accounting for the length
character at s[0], which s[:bytes] also has to cover. That will have
exactly the effect you describe. You lose the last encoded character,
which means you'll miss the last few bits of the file (each encoded
character carries 6 bits). Change it to this:

    bytes = (((ord(s[0])-32) & 63) * 4 + 3) / 3 + 1

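To make the off-by-one concrete, here is a minimal sketch. It is written for Python 3 (where indexing a bytes object already yields an integer, so ord() is not needed and // replaces the integer /), and keep_length is a hypothetical helper, not part of the original script. It encodes a two-byte final line and decodes it with both versions of the slice length.

```python
import binascii

def keep_length(line, count_length_char=True):
    """How many leading characters of a uuencoded line to keep.

    Hypothetical helper for illustration only.
    """
    n = (line[0] - 32) & 63            # decoded byte count from the length character
    data_chars = (n * 4 + 3) // 3      # same arithmetic as the quoted script
    # The slice must also cover the length character itself.
    return data_chars + 1 if count_length_char else data_chars

# A short final line holding the two bytes ff d9.
line = binascii.b2a_uu(b"\xff\xd9")

short = binascii.a2b_uu(line[:keep_length(line, False)])  # original computation
fixed = binascii.a2b_uu(line[:keep_length(line, True)])   # with the + 1

print(short.hex())   # ffd0
print(fixed.hex())   # ffd9
```

With the shorter slice, a2b_uu fills in the missing character with zero bits, so only the low bits of the final byte change. That matches the d9 to d0 difference in the first hexdump line.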
However, you should not need to wrap the first binascii.a2b_uu call with
try/except at all. What is happening that causes the error in the first
place? I suspect if you fix the root cause, you could eliminate the except
clause altogether.
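
One way to find the root cause is to catch binascii.Error specifically and look at its message, rather than using a bare except. The following is only a sketch in Python 3: describe_uu_failure is a hypothetical helper, and the "bad" line with extra characters appended is an invented example, not data from the original messages.

```python
import binascii

def describe_uu_failure(line):
    """Return None if the line decodes cleanly, else the error message.

    Hypothetical helper for illustration only.
    """
    try:
        binascii.a2b_uu(line)
        return None
    except binascii.Error as exc:
        return str(exc)

good = binascii.b2a_uu(b"\xff\xd9")
# Invented example: the same line with stray characters after the data.
bad = good.rstrip(b"\n") + b"xyz\n"

print(describe_uu_failure(good))
print(describe_uu_failure(bad))
```

Logging the message together with the offending line should show whether the input has stray trailing characters, illegal characters, or something else entirely.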
--
- Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.