lightweight encryption of text file
nobody at nowhere.com
Mon Jan 11 04:36:17 CET 2010
On Sun, 10 Jan 2010 08:54:51 -0800, Paul Rubin wrote:
> Nobody <nobody at nowhere.com> writes:
>> RC4 (aka ArcFour) is quite trivial to implement, and better than inventing
>> your own cipher or using a Vigenere: ...
> That's a cute implementation, but it has no authentication and doesn't
> include any randomness, which means if you use the same key for two
> inputs, there is a security failure (xor'ing the two ciphertexts reveals
> the xor of the plaintexts).
Right. RC4 is a cipher, not a cryptosystem.
But, yeah, the OP needs to be aware of the difference (and probably isn't,
yet). So to take that a step further ...
The key passed to arcfour.schedule() shouldn't be re-used. If you need to
encrypt multiple files, use a different key for each. If you want to
encrypt multiple files with the same "password", generate a unique key by
hashing a combination of the password and a random salt (e.g. from
/dev/random), and prepend the salt to the stream. To
decrypt, extract the salt from the stream to generate the key.
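
A rough sketch of the salt handling described above. The names (SALT_LEN,
derive_key, new_salt, prepend_salt, split_salt), the 16-byte salt and the
choice of SHA-256 are my own assumptions for illustration. They are not
part of the arcfour code, and os.urandom stands in for reading /dev/random
directly:

```python
import hashlib
import os

SALT_LEN = 16  # assumed salt size in bytes; not taken from the original code


def new_salt():
    # os.urandom reads from the operating system's random source.
    return os.urandom(SALT_LEN)


def derive_key(password, salt):
    # Hash the salt together with the password, so that the same password
    # yields a different cipher key for every file.
    return hashlib.sha256(salt + password).digest()


def prepend_salt(salt, ciphertext):
    # The salt is not secret; it travels in the clear at the start.
    return salt + ciphertext


def split_salt(stream):
    # Inverse of prepend_salt: recover the salt and the remaining ciphertext.
    return stream[:SALT_LEN], stream[SALT_LEN:]
```

To encrypt, call new_salt(), pass derive_key(password, salt) to the cipher,
and write prepend_salt(salt, ciphertext). To decrypt, call split_salt() and
derive the key the same way. A single hash is fast to brute-force for weak
passwords; hashlib.pbkdf2_hmac is a deliberately slower alternative in the
standard library.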
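If you need to verify the data, append a keyed hash (e.g. an HMAC) of the
ciphertext. An unkeyed hash can simply be recomputed by anyone who alters
the ciphertext, and a hash of the plaintext would allow an attacker to
confirm a guessed plaintext or to confirm that two files contain the same
plaintext. Verification matters because stream ciphers are vulnerable to
replacement attacks: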
(p1 xor r) xor (p1 xor p2) == (p2 xor r)
So if you can guess any part of the plaintext p1, you can replace it with
alternative plaintext p2 without needing to decrypt/encrypt or knowing
anything about the pad r.
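
To make the identity concrete, here is a small demonstration. It uses a
random pad r in place of the RC4 keystream (any stream cipher behaves the
same way under xor), and the messages, the helper names xor_bytes and tag,
and the use of HMAC-SHA256 as the check are assumptions made up for the
example:

```python
import hashlib
import hmac
import os


def xor_bytes(a, b):
    # Byte-wise xor of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))


def tag(mac_key, ciphertext):
    # Keyed hash over the ciphertext; the attacker does not know mac_key.
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


p1 = b"PAY 0100 TO BOB"       # plaintext the attacker has guessed
p2 = b"PAY 9999 TO EVE"       # replacement of the same length
r = os.urandom(len(p1))       # pad, unknown to the attacker
mac_key = os.urandom(32)      # separate secret used only for the tag

c1 = xor_bytes(p1, r)         # what the sender produces
t1 = tag(mac_key, c1)

# The attacker needs only c1, p1 and p2, never r.
forged = xor_bytes(c1, xor_bytes(p1, p2))

decrypted = xor_bytes(forged, r)  # the receiver's decryption yields p2
tampered = not hmac.compare_digest(tag(mac_key, forged), t1)
```

Here decrypted equals p2, so without the check the receiver would accept
the forged message. With the check, tampered is True because the tag no
longer matches the altered ciphertext.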
Also, if this is for something important, I'd be concerned about how to
protect the key. That's hard enough to do in C, let alone in Python.
> It also looks rather slow.
Any kind of bulk binary data processing in pure Python is slow. The code
was written mainly for simplicity, e.g. using generators means that you
don't have to deal with buffer sizes. Replacing " % 256" with " & 255"
might be worthwhile.
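
If you want to check that before changing anything, something like the
following would do. The function names and the operand values are arbitrary
choices for the sketch, and the timings will vary between Python versions
and machines, so measure rather than assume:

```python
import timeit


def same_result(limit=1 << 16):
    # For Python ints, i % 256 and i & 255 give the same value.
    return all((i % 256) == (i & 255) for i in range(limit))


def compare(n=1000000):
    # Variables are used instead of literals so that the compiler
    # cannot fold the whole expression into a constant.
    setup = "i, j = 200, 100"
    mod = timeit.timeit("(i + j) % 256", setup=setup, number=n)
    mask = timeit.timeit("(i + j) & 255", setup=setup, number=n)
    return mod, mask
```

compare() returns the two elapsed times in seconds, modulo first and mask
second.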
> I don't make
> any guarantees about p3.py, but it has been reviewed by several experts
> and appears to be reasonably sound for the type of casual use being
> discussed here, and it is tuned for speed (given the implementation
> constraints). For more demanding purposes, you should use a more
> serious library like one of the OpenSSL wrappers.
The OP specifically wanted to avoid third-party libraries.