[Tutor] Re:Base 207 compression algorithm
Jeff Shannon
jeff@ccvcorp.com
Thu Jun 26 20:40:02 2003
cino hilliard wrote:
> Hi Jeff,
> You are not understanding what my program does. Have you tried it?
> This base converter is my
> unique design allowing characters from ascii 48 - 255. So you will get
> ?? for 255 base 10 to base 16.
> It is a program by Declaration.
I understand very well what your program does. You are, however,
ascribing far more magic to unusual numerical bases than they actually
possess. Perhaps if you were to spend a bit of time studying assembly
language, you'd get a better feel for what's going on here. I don't
advocate actually using assembly language to program anything, but some
exposure to it will give you a much better idea of how your computer
actually works, even if you just look at 8086 assembler. For that
matter, a good exposition of C's variable types and how they work would
probably benefit you greatly. You seem to have no grasp of the
distinction between an integer and a character, and I've tried every
explanation I can think of with no effect.
> How do you get the size of s = 12345678987654321 in bytes?
> len(str(s)) = 17. Is that correct?
No, it's not. That's the length of the string of decimal digits that
represents s, which should be obvious since you explicitly convert s
into a string before taking its length. It is *not* the size of s in
bytes, because s is a (long) integer. I don't know the details of
Python longs well enough to calculate the number of bytes that that
particular number will require; I do know that every number up to
sys.maxint (2147483647, or 0x7fffffff -- the high bit is reserved as a
sign bit) is represented using a C long, i.e. four bytes. I suspect
that a Python long representing s, above, will require either 8 or 12 bytes.
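The difference between "number of digit characters" and "number of bytes in the value" can be checked directly. This is only a minimal sketch, and it assumes a modern Python 3 interpreter (where int and long are a single type), not the Python of this thread. It measures the raw binary magnitude only; the interpreter's in-memory object carries extra bookkeeping on top of that, which the sketch does not attempt to measure.

```python
s = 12345678987654321

# Number of decimal digit characters needed to *print* s.
decimal_digits = len(str(s))

# Number of bits in the binary value itself, and the smallest whole
# number of bytes that can hold that many bits.
value_bits = s.bit_length()
value_bytes = (value_bits + 7) // 8

# Pack the value into exactly that many bytes and read it back.
raw = s.to_bytes(value_bytes, "big")
restored = int.from_bytes(raw, "big")

print(decimal_digits, value_bits, value_bytes, restored == s)
```

The string needs one byte per digit character, while the value itself fits in far fewer bytes and survives the round trip unchanged.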
> how about the size of
> pi=3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067
> len(str(s)) = 17. Is that correct?
>
>> However, you're not going to have any luck in actually doing any math
>> with either of these strings.
>
> Sure you can. You convert back to decimal.
No, because your computer can't do math on a string of decimal digits.
It needs to convert that into a binary number somehow before it can do
math. And it can *store* it as a binary number a lot more efficiently
than it can store it as a string of digit characters, no matter what
encoding scheme you use for those characters.
Like I said, learn how your computer works at the level of registers,
and how the floating-point unit operates, and you'll understand this a
bit better.
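To make the storage point concrete for the 100 digits quoted above, here is a rough sketch, again assuming Python 3 (int.to_bytes and int.from_bytes did not exist when this thread was written). The variable names are purely illustrative. It stores the digits once as text and once as a plain binary integer, then converts back.

```python
# The 100 digits quoted above, as text: one byte per digit character.
digits = (
    "31415926535897932384626433832795028841971693993751"
    "05820974944592307816406286208998628034825342117067"
)

# The same digits as a binary integer. Arithmetic happens on this form,
# never on the string.
n = int(digits)

# Pack the integer into the smallest whole number of bytes that holds it.
packed = n.to_bytes((n.bit_length() + 7) // 8, "big")

# Reverse the process to recover the original decimal string.
unpacked = str(int.from_bytes(packed, "big"))

print(len(digits), len(packed), unpacked == digits)
```

No special radix is involved: plain binary is already about as dense as a reversible encoding of arbitrary digits can be, since each decimal digit carries only about 3.32 bits of information.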
>> -- you could probably get that much precision in less than a hundred
>> bytes (probably *much* less), compared to your 2100.
>
> Show me for just 100 digits.
> So you admit I have reduced the file size of 5000 digits of pi to 2100
> bytes of which I could
> read back into my program and convert back to 5000 digits decimal?
> (I'm not about to try to do the math to determine how many
> floating-point bits
>
> What are you talking about? Floating point goes to 16 digits or so
>
> >>> 355/113.
> 3.1415929203539825
Read up on the mathematics behind floating point numbers -- a decent C
compiler reference should have a fair bit in it about how the compiler
implements floats. You're *still* mistaking the representation that
Python is showing you for the number itself. I don't remember specifics
on the numbers of significant digits that are expressible with a
standard C float or double (I believe that Python floats are implemented
using C doubles), but that has *nothing* to do with how many digits
Python shows you.
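The gap between the stored number and the digits shown can be seen with a short sketch. It assumes Python 3 on a typical IEEE 754 platform, where a Python float is a C double; the figures reported by sys.float_info are platform properties, not something the print format decides.

```python
import struct
import sys

x = 355 / 113

# The stored value: a C double occupies 8 bytes.
stored = struct.pack("<d", x)

# Properties of the double format itself on this platform.
significand_bits = sys.float_info.mant_dig
reliable_decimal_digits = sys.float_info.dig

# Three different printed forms of the same stored value.
short_text = format(x, ".3f")
long_text = format(x, ".30f")
default_text = repr(x)

# Unpacking the 8 bytes gives back exactly the same float; only the
# display differed.
same_value = struct.unpack("<d", stored)[0] == x

print(len(stored), significand_bits, reliable_decimal_digits)
print(short_text, default_text, long_text, same_value)
```

Asking for 3 digits or 30 digits changes what is printed, not what is stored.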
>> If you really want to show compression, then take an arbitrary string
>> (say, the contents of your Windows' autoexec.bat or the contents of
>> your *nix /etc/passwd, or any generic logfile) and show how that can
>> be expressed in fewer
>
>
> Hello. Why can't I compress strings of numbers? In the world there
> are a lot of numbers.
> The latest record for Pi is 1.24 trillion digits. I will bet these
> digits are compressed and called from a decompressor.
Sure, you can "compress" strings of numbers, but if you want to do so in
a way that is reversible, you'll essentially have to encode each byte
(which is a number from 0-255) as a separate number, and there is *no
way* that a computer can represent an arbitrary byte in less than one byte.
Compression algorithms are tricky things -- they look for patterns in
the arrangement of bytes, and then describe those patterns. This is a
far more complex task than simply converting a number into a different
radix.
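As an illustration of pattern-based compression, here is a small sketch using Python's standard zlib module (an assumption on my part; nothing in this thread uses it). It compresses two inputs of the same length, one highly repetitive and one pseudo-random. The fixed seed is only there to make the run repeatable.

```python
import random
import zlib

# 3000 bytes with an obvious repeating pattern.
patterned = b"0123456789" * 300

# 3000 pseudo-random bytes with no pattern for the compressor to find.
rng = random.Random(0)
patternless = bytes(rng.getrandbits(8) for _ in range(len(patterned)))

patterned_packed = zlib.compress(patterned)
patternless_packed = zlib.compress(patternless)

# Compression is reversible: decompressing returns the exact input.
round_trip_ok = (
    zlib.decompress(patterned_packed) == patterned
    and zlib.decompress(patternless_packed) == patternless
)

print(len(patterned), len(patterned_packed))
print(len(patternless), len(patternless_packed))
print(round_trip_ok)
```

The repetitive input shrinks dramatically because the compressor can describe the repetition; the patternless input does not shrink in any meaningful way, whatever radix its bytes are written in.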
And I bet that calculations of Pi *don't* use compression, except
possibly to store the final result. But calculations of Pi are a rather
specialized thing, and I can't recall any program that I've written that
needed to do that for a practical reason. For almost all of those,
math.pi (3.14159265359) is close enough, and if it's not, then I need
far more precise mathematical capabilities than what I'll get using
standard Python (or C) math routines.
At this point, I see no reason to continue this discussion. I've tried
explaining, as clearly as I can without breaking out the technical
manuals, how your computer handles numbers. Obviously, my explanations
aren't getting through to you. I can do no more, so I will not be
replying further in this thread.
Jeff Shannon
Technician/Programmer
Credit International