problems with base64

Karl Pech KarlPech at users.sf.net
Sat Jul 10 19:27:07 CEST 2004


Hi all,

I'm trying to write a program which can read in files in the following
format:
sos_encoded.txt:
---
begin-base64 644 sos.txt
UGxlYXNlLCBoZWxwIG1lIQ==
---

and decode them back to the original bytes ("clear byte code"). For example, if
you run my program on the file sos_encoded.txt, you should get the following:
sos.txt:
---
Please, help me!
---
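
To make the conversion concrete: every group of four base64 characters carries
four 6-bit values, which together form three 8-bit bytes. Below is a minimal
sketch of just that step for the example above. It is not part of my program;
the names ALPHABET and decode_quad are made up for illustration, and it assumes
the standard base64 alphabet.

```python
# Standard base64 alphabet: a character's index is its 6-bit value.
ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")

def decode_quad(quad):
    """Decode one 4-character base64 group into 1 to 3 bytes (illustrative helper)."""
    pad = quad.count("=")
    # '=' carries no data, so it contributes zero bits
    values = [0 if c == "=" else ALPHABET.index(c) for c in quad]
    # pack the four 6-bit values into one 24-bit integer, left to right
    box = (values[0] << 18) | (values[1] << 12) | (values[2] << 6) | values[3]
    # cut the 24 bits into three 8-bit values
    three = [(box >> 16) & 255, (box >> 8) & 255, box & 255]
    # each '=' means one byte less at the end of the group
    return bytearray(three[:3 - pad])

print(decode_quad("UGxl").decode("ascii"))  # Ple
print(decode_quad("IQ==").decode("ascii"))  # !
```

Here 'U', 'G', 'x', 'l' have the values 20, 6, 49, 37, which pack to the bytes
80, 108, 101, that is "Ple".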

Unfortunately, when I decode large files (> 1.5 MB) that were binary data rather
than "human readable text" before they were encoded, I get back corrupted
files.

This is the source of my program:
---
import string

def extract_base64(source):
  source_ = source
  all_chars = string.maketrans('','') # create 256-ASCII-char-table

  # delete all base64-chars from source-copy
  source_without_base64_signs = source_.translate(all_chars, string.letters+string.digits+"+/=")

  # delete all chars, which remained in the changed source-copy, from the first copy
  # and return this new copy
  # --> all base64-chars remain
  return source_.translate(all_chars, source_without_base64_signs)

def convert_to_8bits(source):
  base64_table = {'A' :  0, 'N' : 13, 'a' : 26, 'n' : 39, '0' : 52,
                  'B' :  1, 'O' : 14, 'b' : 27, 'o' : 40, '1' : 53,
                  'C' :  2, 'P' : 15, 'c' : 28, 'p' : 41, '2' : 54,
                  'D' :  3, 'Q' : 16, 'd' : 29, 'q' : 42, '3' : 55,
                  'E' :  4, 'R' : 17, 'e' : 30, 'r' : 43, '4' : 56,
                  'F' :  5, 'S' : 18, 'f' : 31, 's' : 44, '5' : 57,
                  'G' :  6, 'T' : 19, 'g' : 32, 't' : 45, '6' : 58,
                  'H' :  7, 'U' : 20, 'h' : 33, 'u' : 46, '7' : 59,
                  'I' :  8, 'V' : 21, 'i' : 34, 'v' : 47, '8' : 60,
                  'J' :  9, 'W' : 22, 'j' : 35, 'w' : 48, '9' : 61,
                  'K' : 10, 'X' : 23, 'k' : 36, 'x' : 49, '+' : 62,
                  'L' : 11, 'Y' : 24, 'l' : 37, 'y' : 50, '/' : 63,
                  'M' : 12, 'Z' : 25, 'm' : 38, 'z' : 51, '=' : 0}

  result_ = []

  # fill an integer with four 6-bit-blocks from left to right
  box_ =  int( (base64_table[source[0]] << 26)\
             + (base64_table[source[1]] << 20)\
             + (base64_table[source[2]] << 14)\
             + (base64_table[source[3]] <<  8) )

  # get 8-bit-blocks out of the integer starting with the first 6-bit-block we have
  # inserted plus the two highest bits from the second 6-bit-block
  result_ += chr((box_ >> 24) & 255) + chr((box_ >> 16) & 255) + chr((box_ >> 8) & 255)

  # drop the zero bytes that come from '=' padding (one byte per '=')
  del result_[len(result_)-source.count('='):len(result_)]

  return result_

# open source file in binary mode
fsource = open(raw_input("Please specify the source file that should be decoded: "), "rb")

# read in first line of the file and split it in 2+n "whitespace-blocks"
_1stline = fsource.readline().split()

# delete the first two blocks ("begin ..." and "644 ...")
del _1stline[0:2]

# join the other blocks to the target-filename
targetname = string.join(_1stline)
ftarget = open(targetname, "wb")

# read in the remainder of the file in 4-byte-blocks and write the results in 3-byte-blocks
# into the target file

while True:
  source = ''
  while len(source) < 4:
    source += fsource.read(4)
    if source == '':
      break

    # reduce byte-code to base64-chars
    source = extract_base64(source)

  if source == '':
    break

  # convert 6-bit-blocks to 8-bit-blocks
  clear_text = convert_to_8bits(source)

  ftarget.writelines(clear_text)

ftarget.close()
fsource.close()

print "file "+targetname+" has been written!"
---

Unfortunately I can't use Python's standard base64 module, since
this whole task is an exercise. :(
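
The module could still serve as a reference for checking results, though. A
minimal sketch of such a check, assuming Python 3 (not the Python 2 my program
is written in): decode_text is a hypothetical hand-written decoder, not my
program, and the test data is random binary data encoded over many lines with
base64.encodebytes.

```python
import base64
import os

ALPHABET = (b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            b"abcdefghijklmnopqrstuvwxyz"
            b"0123456789+/")
VALUE = {c: i for i, c in enumerate(ALPHABET)}  # byte value -> 6-bit value
PAD = ord("=")

def decode_text(encoded):
    """Hypothetical hand-written decoder for the encoded body (bytes in, bytes out)."""
    # keep only alphabet and padding characters, so that line breaks
    # cannot shift the 4-character groups
    chars = [c for c in encoded if c in VALUE or c == PAD]
    out = bytearray()
    for i in range(0, len(chars) - 3, 4):
        quad = chars[i:i + 4]
        pad = quad.count(PAD)
        box = 0
        for c in quad:
            box = (box << 6) | VALUE.get(c, 0)  # '=' contributes zero bits
        out += box.to_bytes(3, "big")[:3 - pad]
    return bytes(out)

# random binary data, long enough to span many encoded lines
data = os.urandom(5000)
encoded = base64.encodebytes(data)     # reference encoder, wraps its output lines
assert decode_text(encoded) == data    # the hand-written decoder must agree
print("round trip ok")
```

A failing comparison on small random inputs would at least give a short test
case to step through.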

And I don't see any logical problems in my code. I think I really
need a few more pairs of eyes on this, so you are my "last hope"! ;)
Perhaps you can give me a hint.

Thank you very much!

Regards
Karl



