How far can stack [LIFO] solve do automatic garbage collection and prevent memory leak ?

Anton Ertl anton at
Thu Aug 19 15:45:07 CEST 2010

John Nagle <nagle at> writes:
>    In the superscalar era, there's not much of an advantage to avoiding
>stack accesses.

Apart from 4stack, I am not aware of a superscalar stack machine (and
4stack is more of an LIW than a superscalar).

OTOH, if by stack accesses you mean memory accesses through the stack
pointer on a register machine, then evidence contradicts your claim.
E.g., if we can keep one or two more of Gforth's VM's registers in
real registers rather than on the stack of an IA32 CPU, we see
significant speedups (like a factor of 2).

>x86 superscalar machines have many registers not
>visible to the program, as the fastest level of cache.

They have a data cache for memory accesses (about 3 cycles load-to-use
latency on current CPUs for these architectures), and they have rename
registers (not visible to programmers) that don't cache memory.  They
also have a store buffer with store-to-load forwarding, but that still
has no better load-to-use latency.

>In practice,
>the top of the stack is usually in CPU registers.

Only if the Forth system is written that way.
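
To illustrate what "written that way" means, here is a minimal sketch of
top-of-stack caching in a toy stack VM.  The opcode names and both
functions are hypothetical and are not taken from Gforth.  In Python this
only shows the bookkeeping; in a native-code system the variable tos
would be kept in a machine register, which is where any speedup would
come from.

```python
# Toy stack VM, two ways. Opcodes ("lit", "add", "mul", "dup") are
# illustrative assumptions, not any real Forth system's instruction set.

def run_memory_stack(program):
    """All stack items, including the top one, live in the list `stack`."""
    stack = []
    for op, arg in program:
        if op == "lit":
            stack.append(arg)
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "dup":
            stack.append(stack[-1])
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


def run_tos_cached(program):
    """The top item is cached in `tos`; only deeper items are in `stack`."""
    stack = []   # holds everything below the top of stack
    tos = None   # cached top of stack (meaningful only when depth > 0)
    depth = 0    # logical number of items on the stack
    for op, arg in program:
        if op == "lit":
            if depth > 0:
                stack.append(tos)   # spill the old top to memory
            tos = arg
            depth += 1
        elif op == "add":
            tos = stack.pop() + tos  # one memory access instead of three
            depth -= 1
        elif op == "mul":
            tos = stack.pop() * tos
            depth -= 1
        elif op == "dup":
            stack.append(tos)
            depth += 1
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack + [tos] if depth > 0 else []


# (2 + 3), duplicated, then multiplied with itself.
PROGRAM = [("lit", 2), ("lit", 3), ("add", None), ("dup", None), ("mul", None)]

if __name__ == "__main__":
    print(run_memory_stack(PROGRAM))
    print(run_tos_cached(PROGRAM))
```

Both functions leave the same stack contents; the second one simply
touches the in-memory stack less often per operation.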

> The "huge number
>of programmer-visible register" machines like SPARCs turned out to be
>a dead end.

Really?  Architectures with 32 programmer-visible registers like SPARC
(but, unlike SPARC, without register windows) are quite successful in
embedded systems (e.g., MIPS, PowerPC).

>So did making all the instructions the same width; it
>makes the CPU simpler, but not faster, and it bulks up the program
>by 2x or so.

In the beginning it also made the CPU faster.  As for the bulk, here's
some data from <2007Dec11.202937 at>; it's the
text (code) size of /usr/bin/dpkg in a specific version of the dpkg
package for various architectures:

 98132 dpkg_1.14.12_hurd-i386.deb
230024 dpkg_1.14.12_m68k.deb
249572 dpkg_1.14.12_amd64.deb
254984 dpkg_1.14.12_arm.deb
263596 dpkg_1.14.12_i386.deb
271832 dpkg_1.14.12_s390.deb
277576 dpkg_1.14.12_sparc.deb
295124 dpkg_1.14.12_hppa.deb
320032 dpkg_1.14.12_powerpc.deb
351968 dpkg_1.14.12_alpha.deb
361872 dpkg_1.14.12_mipsel.deb
371584 dpkg_1.14.12_mips.deb
615200 dpkg_1.14.12_ia64.deb

Sticking with the Linux packages (i.e., not the Hurd one), the code
size relative to the i386 code ranges from a factor of 0.97 (ARM) to
1.41 (MIPS) for the classical architectures with fixed-size
instructions (RISCs).
Only the IA64 has a code size increase by a factor of 2.33.  Note that
code size is not everything that's in a program binary, and the rest
should be unaffected by whether the instructions are fixed-size or
variable-sized, so the overall effect on the binary will be smaller.
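
The factors quoted above can be rechecked from the table with a short
script.  This is only a sketch: the dictionary keys are shortened
architecture labels chosen here for readability, and the byte counts are
copied from the listing as given.

```python
# Text (code) sizes in bytes, copied from the listing above.
# The short architecture labels are an assumption made for this sketch.
TEXT_SIZES = {
    "hurd-i386": 98132,
    "m68k": 230024,
    "amd64": 249572,
    "arm": 254984,
    "i386": 263596,
    "s390": 271832,
    "sparc": 277576,
    "hppa": 295124,
    "powerpc": 320032,
    "alpha": 351968,
    "mipsel": 361872,
    "mips": 371584,
    "ia64": 615200,
}


def size_ratios(sizes, baseline="i386"):
    """Return each architecture's code size divided by the baseline's."""
    base = sizes[baseline]
    return {arch: size / base for arch, size in sizes.items()}


RATIOS = size_ratios(TEXT_SIZES)

if __name__ == "__main__":
    # Print architectures from smallest to largest relative code size.
    for arch, ratio in sorted(RATIOS.items(), key=lambda kv: kv[1]):
        print(f"{arch:10s} {TEXT_SIZES[arch]:7d} {ratio:5.2f}")
```

Rounded to two places, this gives 0.97 for ARM, 1.41 for MIPS and 2.33
for IA64, matching the figures in the text.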

- anton
M. Anton Ertl