Here are three hacks that Google turned up:
(1) Use static variables instead of dynamic (stack) variables
(2) Use in-line assembly code that explicitly aligns data
(3) In C code, use "malloc" to explicitly allocate variables
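A minimal sketch of (1), not taken from Intel: the work array lives in static storage rather than on the stack, so its address is fixed at link time. NPTS is given a placeholder value, and is_aligned8() and sum_squares() are hypothetical helpers. Whether the linker really places the array on an 8-byte boundary depends on the toolchain, so it is worth checking with something like is_aligned8().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPTS 1024                 /* placeholder size, an assumption */

static double work[NPTS];         /* static storage instead of a stack array */

/* Hypothetical helper: returns 1 if p lies on an 8-byte boundary. */
static int is_aligned8(const void *p)
{
    return ((uintptr_t)p & 7) == 0;
}

/* Hypothetical helper: fills work[0..n-1] with i*i and returns the sum. */
static double sum_squares(size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n <= NPTS);
    for (i = 0; i < n; i++) {
        work[i] = (double)i * (double)i;
        sum += work[i];
    }
    return sum;
}
```

The price is that sum_squares() is no longer reentrant, since every call shares the same array.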
Here is Intel's example of (2):
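; procedure prologue
push ebp            ; save the caller's frame pointer
mov ebp, esp        ; ebp = esp (Intel syntax: destination first)
and ebp, -8         ; round ebp down to an 8-byte boundary
sub esp, 12         ; 8 bytes for a double plus up to 4 lost to rounding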
; procedure epilogue
add esp, 12
pop ebp
ret
Intel's example of (3), slightly modified:
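#include <stdint.h>  /* uintptr_t */
#include <stdlib.h>  /* malloc */
double *p, *newp;
p = (double *)malloc((sizeof(double) * NPTS) + 4);
/* Round up in integer arithmetic; assumes p is at least 4-byte aligned. */
newp = (double *)(((uintptr_t)p + 4) & ~(uintptr_t)7);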
This ensures that newp is 8-byte aligned even if p is not, provided p
is at least 4-byte aligned. Pass p, not newp, to free(). However,
malloc() may already follow Intel's recommendation that data structures
of 32 bytes or more be aligned on a 32-byte boundary. In that case,
increasing the requested memory by 4 bytes and computing newp are
superfluous.
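If you don't want to rely on malloc() returning at least 4-byte alignment, the same trick generalizes. The following is only a sketch, not Intel's code: round_up() and alloc_doubles_aligned8() are hypothetical names, and it assumes uintptr_t is available (C99 <stdint.h>). It pads the request by 7 bytes, which covers the worst case for any starting address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round addr up to the next multiple of align.
   align must be a nonzero power of two. */
static uintptr_t round_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* Hypothetical helper: allocate room for n doubles starting on an
   8-byte boundary. *raw receives the pointer malloc() returned,
   which is the one that must later be passed to free(). */
static double *alloc_doubles_aligned8(size_t n, void **raw)
{
    /* 7 spare bytes are enough whatever alignment malloc() gives us. */
    *raw = malloc(n * sizeof(double) + 7);
    if (*raw == NULL)
        return NULL;
    return (double *)round_up((uintptr_t)*raw, 8);
}
```

Typical use would be: void *raw; double *v = alloc_doubles_aligned8(NPTS, &raw); then work with v[0] through v[NPTS-1], and finish with free(raw).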
Chuck