[Tutor] Possible Useless Python Challenge. [huffman encoding]
Tesla Coil
tescoil@irtc.net
Tue, 08 May 2001 10:20:01 -0500
On 7 May 2001, Daniel Yoo wrote:
> What Huffman realized is that we're not taking advantage
> of a certain curious property about human language: we=20
> use certain words over and over, certain letters over=20
> and over, and our language is fraught with repetitious
> repetition. What if we don't always represent a=20
> character by 8 bits, but by something that depends
> on how frequently the letter occurs in our language?
> Instead of having every letter be represented by=20
> eight bits, we can make a "variable length" encoding.
> That is, perhaps we can use the following code:
>
> 'h' =3D=3D=3D> 0000
> 'w' =3D=3D=3D> 0001
> 'd' =3D=3D=3D> 0010
> 'e' =3D=3D=3D> 0011
> 'o' =3D=3D=3D> 01
> 'r' =3D=3D=3D> 100
> ' ' =3D=3D=3D> 101
> 'l' =3D=3D=3D> 11
Incidentally, same principle used by an early
binary compression scheme known as Morse code:
'h' =3D=3D=3D> 0000
'w' =3D=3D=3D> 011
'd' =3D=3D=3D> 100
'e' =3D=3D=3D> 0
'o' =3D=3D=3D> 111
'r' =3D=3D=3D> 010
' ' =3D=3D=3D>
'l' =3D=3D=3D> 0100
"Plus =E7a change...," I guess.