Generating valid identifiers

Steven D'Aprano steve+comp.lang.python at
Thu Jul 26 17:30:09 CEST 2012

On Thu, 26 Jul 2012 14:26:16 +0200, Laszlo Nagy wrote:

> I do not want this program to generate very long identifiers. It would
> increase SQL parsing time,

Will that increase in SQL parsing time be more, or less, than the time it 
takes to generate CRC32 or SHA hashsums and append them to a truncated 
identifier?

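One way to get a feel for the hashing half of that comparison is to time it. The following is only a rough sketch: the identifier is invented for illustration, the helper names crc32_suffix and sha1_suffix are hypothetical, and the numbers will vary by machine. It says nothing about the SQL parsing cost, which would have to be measured against the actual database.

```python
import hashlib
import timeit
import zlib

# Hypothetical long identifier, made up for this sketch.
name = b"some_schema__some_rather_long_table_name__some_rather_long_field_name"

def crc32_suffix(data):
    # CRC32 rendered as 8 hex digits; the mask keeps the value unsigned.
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")

def sha1_suffix(data):
    # First 10 hex digits of the SHA-1 digest.
    return hashlib.sha1(data).hexdigest()[:10]

n = 100000
crc_time = timeit.timeit(lambda: crc32_suffix(name), number=n)
sha_time = timeit.timeit(lambda: sha1_suffix(name), number=n)

print("CRC32: %.3f s for %d calls" % (crc_time, n))
print("SHA-1: %.3f s for %d calls" % (sha_time, n))
```
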
> * Would it be a problem to use CRC32 instead of SHA? (Since security is
> not a problem, and CRC32 is faster.)

What happens if you get a collision?

That is, you have two different long identifiers:


which by bad luck both hash to the same value:


(or whatever).
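Whichever hash you pick, you can at least detect a collision instead of silently generating the same name twice. Here is a minimal sketch, assuming a 30-character limit and a "prefix, underscore, CRC32 in hex" naming scheme; the limit, the scheme, and the function names shorten and shorten_all are my own assumptions, not anything from the original program.

```python
import zlib

def shorten(name, max_len=30):
    """Return name unchanged if it fits, else a prefix plus a CRC32 suffix."""
    if len(name) <= max_len:
        return name
    # 8 hex digits of CRC32 computed over the full original name.
    suffix = format(zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF, "08x")
    return name[:max_len - len(suffix) - 1] + "_" + suffix

def shorten_all(names, max_len=30):
    """Map each name to its short form, refusing to map two names to one."""
    seen = {}    # short form -> original name that produced it
    result = {}  # original name -> short form
    for name in names:
        short = shorten(name, max_len)
        other = seen.get(short)
        if other is not None and other != name:
            raise ValueError("%r and %r both shorten to %r" % (other, name, short))
        seen[short] = name
        result[name] = short
    return result
```

What to do when the ValueError fires (add a counter, use more digest characters, rename by hand) is a separate decision; the point is only that the collision gets noticed.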

> * I'm truncating the digest value to 10 characters.  Is it safe enough?
> I don't want to use more than 10 characters, because then it wouldn't be
> possible to recognize the original name. 

> * Can somebody think of a
> better algorithm, that would give a bigger chance of recognizing the
> original identifier from the modified one?

Rather than truncating the most significant part of the identifier, the 
field name, you should truncate the least important part, the middle.


goes to:


or similar.
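A rough sketch of the idea, assuming a 30-character limit and a double underscore as the "something was cut here" marker (both are arbitrary choices of mine, as is the function name truncate_middle):

```python
def truncate_middle(name, max_len=30, marker="__"):
    """Keep the start and end of name, replacing the middle with marker."""
    if len(name) <= max_len:
        return name
    keep = max_len - len(marker)  # characters of the original that survive
    if keep < 2:
        raise ValueError("max_len is too small for the marker")
    head = (keep + 1) // 2        # the start gets the extra character, if any
    tail = keep - head
    return name[:head] + marker + name[len(name) - tail:]
```

Two names that differ only in the middle will still come out identical, so this makes the result more recognisable but does not remove the need to check for duplicates.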

