Dummy Decoder Example (was Re: Parallel decoding lesson for you.)

Skybuck Flying skybuck2000 at hotmail.com
Tue Sep 29 09:09:09 CEST 2015


Hello,

(It is now 29 september 2015)

As promised here is Skybuck's Parallel Universal Code demonstration program.

This posting contains a Delphi and C/C++ version for you to learn from.

(I was kinda thinking of adding some kind of case statement/array and out of 
order execution/randomization to proof that the lines can be executed in any 
order, though I didn't come around to that (yet) been playing World of 
Warships =D at 1 fps to 30 fps lol)
(If anybody doubt that these lines can be executed out of order I may 
eventually add such a feature as welll.. maybe I will do it even for the fun 
of it ! ;))
(I was also maybe thinking of trying to through it all into a nice 
Tprocessor class with an actual thread to show it off as well... didn't come 
around to that yet either ;):))
(However if you clearly examine the indexes used you will discover that 
there is no overlap, all indexes are uniquely read and such... no 
write-read-write-read-sequential stuff going on... it can all be read in 
parallel... except for building rowoffset array).

// ** Begin of Delphi Program ***

program TestProgram;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils;

{

Skybuck's Parallel Universal Code, Demonstration program

version 0.01 created on 19 september 2015

This demonstration program will illustrate how to decode fields of data in 
parallel.

For this demonstration program bit fiddling will be avoided by allowing 1 
bit per array index to illustrate the general idea.

Also lengths may be stored by 1 integer per array index. In a real world 
scenerio it would be Skybuck Universally Encoded.

Skybuck's Parallel Universal Code, Design Document:

version 1 created on 16 september 2015 (after seeing another mentioning of 
IBM's processor MIL which makes me sick, as if decoding fields in parallel 
is hard ! LOL ;) :))

(I also wrote a little introductionary question for it see other file for it 
and I even learned something from it: Parallel decoding lesson for you.txt)

All bits of the fields are split up in such a way that the first bit of each 
field is right next to each other, the second bit of each field is also next 
to each other, and so forth.

Conceptually view:

First currently situation

Field A consists out of a1a2a3a4      (4 bits)
Field B consists out of b1b2b3          (3 bits)
Field C consists out of c1                  (1 bit)
Field D consists out of d1d2d3d4d5d6  ( 6 bits)

These bits are stored as follows:

"First row":
a1b1c1d1

"Second row":
a2b2d2

"Third row":
a3b3d3

"Fourth row":
a4d4

"Fiveth row":
d5

"Sixth row":
d6

These rows will be stores sequentially as follows:
a1b1c1d1a2b2d2a3b3d4a4d4d5d6

Now the question is: How does a processor know where each row begins ? and 
how many bits of each field there is, the answers are given below:

The bit length of each row is stored preemptively/prefixed:

"First row": 4
"Second row": 3
"Third row": 3
"Fourth row": 2
"Fiveth row": 1
"Sixth row" : 1

Now for each row their offset can be computed:

First row starts at 0,
Second row starts at 0 + 4 = 4
Third row starts at 0 + 4 + 3 = 7
Fouth row starts at 0 + 4 + 3 + 3 = 10
Fiveth row starts at 0 + 4 + 3 + 3 + 2 = 12
Sixth row starts at 0 + 4+ 3 + 3+ 2 + 1 = 13

Let's check if this is true:

0   1   2   3   4   5   6   7   8   9   10 11 12 13
a1 b1 c1 d1 a2 b2 d2 a3 b3 d3 a4 d4 d5 d6

Bingo, all match.

The processor can compute the offsets of each row.

And thus the processor can reach each field in parallel as follows:

Read field A bit 0 at offset 0
Read field B bit 0 at offset 4
Read field C bit 0 at offset 7
Read field D bit 0 at offset 10

Now the question is how can the processor know when to stop reading ?

A marker/meta bit could be used like described in Skybuck's Universal Code 
version 1, which indicates if the field continues or stops.

These bits can be stored in the same way as the data bits above.

And thus for each data bit, a meta bit can be read as well.

This way the processor knows that C2 does not exist and can stop reading 
field C

For huffman codes that would not even be required, since the processor can 
see in the huffman tree when it reaches a leave/end node and then it will 
know the end of C was reached.

So marker/beta bits can be avoided by using Huffman codes ! Pretty neeto ! 
;) =D

However huffman has a drawback that some fields might get large, and may not 
be suited for rare information and modifications and so forth ! ;)

The last thing to do to make this usuable is to include another prefix field 
which indicates how many prefixes there are.

So final stream will look something like:

[Number of Fields][Set of Field Lengths][Set of Field Data Bits 
Intermixed/Parallel]

First field can be universally coded.
Second set of fields can be universally coded.
Third set of field contains data bits intermixed/in parallel.

Additional:

The problem can be solved by sorting the fields from largest to smallest 
field, so new stream looks like:
(sorting from smallest to largest would be possible too... but code below 
assumes first field is largest so it uses
largest to smallest sorting solution which is also applied to the stream A, 
so stream A below is sorted that way)

Stream A: 433211d1a1b1c1d2a2b2d3a3b3d4a4d5d6

Bye,
  Skybuck.


}

function Constrain( Para : integer ) : integer;
begin
    result := Para;
    if Para < 0 then Result := 0;
    if Para >= 1 then Result := 1;
end;

procedure Main;
// must put variables here otherwise won't show up in debugger.
const
    MaxProcessorCount = 4;
var
    // information stream, input
    Stream : array[0..20+(MaxProcessorCount-1)] of integer; // add max 
processor count to create a safe "padding" for reading so no out of 
bounds/range check errors with arrays.

    // bits representing fields of data
    a1,a2,a3,a4 : integer;
    b1,b2,b3 : integer;
    c1 : integer;
    d1,d2,d3,d4,d5,d6 : integer;

    // output
    RowIndex : integer;

    RowCount : integer;
    RowLength : array[0..5] of integer;
    RowOffset : array[0..5] of integer;
    FieldRowMultiplier : array[0..3,0..5] of integer;

    DataOffset : integer;

    FieldCount : integer;
    FieldLength : array[0..3] of integer;

    Processor : array[0..3] of integer;

    // debug fields
    FieldA : integer;
    FieldB : integer;
    FieldC : integer;
    FieldD : integer;
begin
    a1 := 1; a2 := 1; a3:= 1; a4 := 1;
    b1 := 1; b2 := 1; b3 := 1;
    c1 := 1;
    d1 := 1; d2 := 1; d3 := 1; d4 := 1; d5 := 1; d6 := 1;

    // compute input fields to compare it later with output fields
    FieldA := (a1) or (a2 shl 1) or (a3 shl 2) or (a4 shl 3);
    FieldB := (b1) or (b2 shl 1) or (b3 shl 2);
    FieldC := (c1);
    FieldD := (d1) or (d2 shl 1) or (d3 shl 2) or (d4 shl 3) or (d5 shl 4) 
or (d6 shl 5);

    // print field values
    writeln( 'FieldA: ', FieldA );
    writeln( 'FieldB: ', FieldB );
    writeln( 'FieldC: ', FieldC );
    writeln( 'FieldD: ', FieldD );
    writeln;

    // number of rows
    Stream[0] := 6;

    // row lengths
    Stream[1] := 4;
    Stream[2] := 3;
    Stream[3] := 3;
    Stream[4] := 2;
    Stream[5] := 1;
    Stream[6] := 1;

    // sorted information stream:
    // d1a1b1c1d2a2b2d3a3b3d4a4d5d6

    // data bits
    Stream[7] := d1;
    Stream[8] := a1;
    Stream[9] := b1;
    Stream[10] := c1;
    Stream[11] := d2;
    Stream[12] := a2;
    Stream[13] := b2;
    Stream[14] := d3;
    Stream[15] := a3;
    Stream[16] := b3;
    Stream[17] := d4;
    Stream[18] := a4;
    Stream[19] := d5;
    Stream[20] := d6;

    // now the decoding algorithm:

    // determine number of rows
    RowCount := Stream[0];

    // extract row lengths
    RowLength[0] := Stream[1];
    RowLength[1] := Stream[2];
    RowLength[2] := Stream[3];
    RowLength[3] := Stream[4];
    RowLength[4] := Stream[5];
    RowLength[5] := Stream[6];

    // determine field count
    FieldCount := RowLength[0]; // row[0] indicates number of fields.

    // let's assume first field is largest so it always consumes all row 
information.
    // should be 0 if negative, should be zero if zero, should be 1 if 
positive so and will do the trick to constaint it.
    // the -0, -1, -2, -3 represents subtracting the processor 
number/identify from it... to allow parallel processing ! ;)

    // contraint is wrong... hahga unnt.
    FieldRowMultiplier[0,0] := Constrain(RowLength[0]-0); // shoudl be set 
to one if larger.
    FieldRowMultiplier[0,1] := Constrain(RowLength[1]-0);
    FieldRowMultiplier[0,2] := Constrain(RowLength[2]-0);
    FieldRowMultiplier[0,3] := Constrain(RowLength[3]-0);
    FieldRowMultiplier[0,4] := Constrain(RowLength[4]-0);
    FieldRowMultiplier[0,5] := Constrain(RowLength[5]-0);

    // now second field may consume less if there are not enough bits.
    FieldRowMultiplier[1,0] := Constrain(RowLength[0]-1);
    FieldRowMultiplier[1,1] := Constrain(RowLength[1]-1);
    FieldRowMultiplier[1,2] := Constrain(RowLength[2]-1);
    FieldRowMultiplier[1,3] := Constrain(RowLength[3]-1);
    FieldRowMultiplier[1,4] := Constrain(RowLength[4]-1);
    FieldRowMultiplier[1,5] := Constrain(RowLength[5]-1);

    // now third field may consume less if there are not enough bits.
    FieldRowMultiplier[2,0] := Constrain(RowLength[0]-2);
    FieldRowMultiplier[2,1] := Constrain(RowLength[1]-2);
    FieldRowMultiplier[2,2] := Constrain(RowLength[2]-2);
    FieldRowMultiplier[2,3] := Constrain(RowLength[3]-2);
    FieldRowMultiplier[2,4] := Constrain(RowLength[4]-2);
    FieldRowMultiplier[2,5] := Constrain(RowLength[5]-2);

    // now fourth field may consume less if there are not enough bits.
    FieldRowMultiplier[3,0] := Constrain(RowLength[0]-3);
    FieldRowMultiplier[3,1] := Constrain(RowLength[1]-3);
    FieldRowMultiplier[3,2] := Constrain(RowLength[2]-3);
    FieldRowMultiplier[3,3] := Constrain(RowLength[3]-3);
    FieldRowMultiplier[3,4] := Constrain(RowLength[4]-3);
    FieldRowMultiplier[3,5] := Constrain(RowLength[5]-3);


    // now compute field lengths

    // not necessary to multiply anything just add them up ! ;) =D
    FieldLength[0] :=
        FieldRowMultiplier[0,0] +
        FieldRowMultiplier[0,1] +
        FieldRowMultiplier[0,2] +
        FieldRowMultiplier[0,3] +
        FieldRowMultiplier[0,4] +
        FieldRowMultiplier[0,5];

    FieldLength[1] :=
        FieldRowMultiplier[1,0] +
        FieldRowMultiplier[1,1] +
        FieldRowMultiplier[1,2] +
        FieldRowMultiplier[1,3] +
        FieldRowMultiplier[1,4] +
        FieldRowMultiplier[1,5];

    FieldLength[2] :=
        FieldRowMultiplier[2,0] +
        FieldRowMultiplier[2,1] +
        FieldRowMultiplier[2,2] +
        FieldRowMultiplier[2,3] +
        FieldRowMultiplier[2,4] +
        FieldRowMultiplier[2,5];

    FieldLength[3] :=
        FieldRowMultiplier[3,0] +
        FieldRowMultiplier[3,1] +
        FieldRowMultiplier[3,2] +
        FieldRowMultiplier[3,3] +
        FieldRowMultiplier[3,4] +
        FieldRowMultiplier[3,5];

    // though the field multipliers could come in handy later to read the 
bits ! nice ! ;) =D

    // for each row the offset must be calculated this can be done serially 
or by all processors for themselfes at the same time:
    // row zero starts after the number of rows which is indicated by the 
first stream value =D

    // first determine data offset properly ! ;) :) 1 for the row count + 
RowCount to skip over row lengths.
    DataOffset := 1 + RowCount;

    RowOffset[0] := DataOffset;
    RowOffset[1] := RowOffset[0] + RowLength[0];
    RowOffset[2] := RowOffset[1] + RowLength[1];
    RowOffset[3] := RowOffset[2] + RowLength[2];
    RowOffset[4] := RowOffset[3] + RowLength[3];
    RowOffset[5] := RowOffset[4] + RowLength[4];

    // now calculate and/ore detemrine length of each field

    // first determine number of fields


    // now that all row offsets are calculated it's possible to decode the 
stream into each processor, by each processor.

    // each processor knows it's own number/identitiy represented by the +0 
+1 +2 +3 down below:
    // the general idea is:
    // row[] here is RowOffset[]
{
    Processor[0] := 
Stream[Row[0]+0]+Stream[Row[1]+0]+Stream[Row[2]+0]+Stream[Row[3]+0]+Stream[Row[4]+0]+Stream[Row[5]+0];
    Processor[1] := 
Stream[Row[0]+1]+Stream[Row[1]+1]+Stream[Row[2]+1]+Stream[Row[3]+1]+Stream[Row[4]+1]+Stream[Row[5]+1];
    Processor[2] := 
Stream[Row[0]+2]+Stream[Row[1]+2]+Stream[Row[2]+2]+Stream[Row[3]+2]+Stream[Row[4]+2]+Stream[Row[5]+2];
    Processor[3] := 
Stream[Row[0]+3]+Stream[Row[1]+3]+Stream[Row[2]+3]+Stream[Row[3]+3]+Stream[Row[4]+3]+Stream[Row[5]+3];
}
    // however a processor should only include the bits of a stream if it's 
within the field's length for that we need
    // to compute each field length and here we have an oops ! ;) :) cannot 
compute field length if multiple fields
    // per row. or can we ?! ;) :)perhaps we can... we know that fields have 
at least 1 bit because of row 0
    // and we know which fields have 2 bits because of row 2 and so forth... 
and since all fields
    // are ditributed parallely... we don't actually need to know where 
their bits are.... cause they all packed lol...
    // we only need to know what max length is or something... though how 
can we dan be sure that a field ends up
    // wehere it needs to be... well we dont... a field can end up in any 
processor.

    // so how should it actually look like then... well as follows:


    // since the fields are stored as follows: A,B,C,D we know A ends up in 
processor 0, B in 1, C in 2, D in 3 and so forth.
    // thus... processor 0 can determine length of A by looking at row[0], 
row[1], row[2], row[3], row[4], row[5], row[6].
    // but how to know which count matches to who's field ?
    // is the length of row 5 for A or B or C or D ? we don't know do we ?! 
;)

    // now we should be able to solve the problem... we know the bits belong 
to the first few fields... cool ! ;) =D

    // now we should be able to solve it easily... by only including a bit 
from the stream if the multiplier is set to 1 ;) :)
    // and now only thing left to do is shifting the bits into proper 
position and or-ing them together ! ;)

    Processor[0] :=
        ((Stream[RowOffset[0]+0]*FieldRowMultiplier[0,0]) shl 0) or
        ((Stream[RowOffset[1]+0]*FieldRowMultiplier[0,1]) shl 1) or
        ((Stream[RowOffset[2]+0]*FieldRowMultiplier[0,2]) shl 2) or
        ((Stream[RowOffset[3]+0]*FieldRowMultiplier[0,3]) shl 3) or
        ((Stream[RowOffset[4]+0]*FieldRowMultiplier[0,4]) shl 4) or
        ((Stream[RowOffset[5]+0]*FieldRowMultiplier[0,5]) shl 5);

    Processor[1] :=
        ((Stream[RowOffset[0]+1]*FieldRowMultiplier[1,0]) shl 0) or
        ((Stream[RowOffset[1]+1]*FieldRowMultiplier[1,1]) shl 1) or
        ((Stream[RowOffset[2]+1]*FieldRowMultiplier[1,2]) shl 2) or
        ((Stream[RowOffset[3]+1]*FieldRowMultiplier[1,3]) shl 3) or
        ((Stream[RowOffset[4]+1]*FieldRowMultiplier[1,4]) shl 4) or
        ((Stream[RowOffset[5]+1]*FieldRowMultiplier[1,5]) shl 5);

    Processor[2] :=
        ((Stream[RowOffset[0]+2]*FieldRowMultiplier[2,0]) shl 0) or
        ((Stream[RowOffset[1]+2]*FieldRowMultiplier[2,1]) shl 1) or
        ((Stream[RowOffset[2]+2]*FieldRowMultiplier[2,2]) shl 2) or
        ((Stream[RowOffset[3]+2]*FieldRowMultiplier[2,3]) shl 3) or
        ((Stream[RowOffset[4]+2]*FieldRowMultiplier[2,4]) shl 4) or
        ((Stream[RowOffset[5]+2]*FieldRowMultiplier[2,5]) shl 5);

    Processor[3] :=
        ((Stream[RowOffset[0]+3]*FieldRowMultiplier[3,0]) shl 0) or
        ((Stream[RowOffset[1]+3]*FieldRowMultiplier[3,1]) shl 1) or
        ((Stream[RowOffset[2]+3]*FieldRowMultiplier[3,2]) shl 2) or
        ((Stream[RowOffset[3]+3]*FieldRowMultiplier[3,3]) shl 3) or
        ((Stream[RowOffset[4]+3]*FieldRowMultiplier[3,4]) shl 4) or
        ((Stream[RowOffset[5]+3]*FieldRowMultiplier[3,5]) shl 5);

    // *** STILL TO DO (solved by extending array): 
********************************
    // *** ^^^ may have to look into potential out of range problem ^^^ ****
    // *********************************************************************
    // for now it seems ok, also as long as stream array has 
+NumberOfProcessors scratch pad at end it may be ok ! ;) :)

    // one last problem might remain... the row offset may go out of 
range...
    // we could either use a scratch pad or solve this in another way.
    // I think it's best to leave it as as and perhaps make the stream a bit 
larger or omething let's see what happens.

    // print processor values.
    writeln( 'Processor[0]: ', Processor[0] );
    writeln( 'Processor[1]: ', Processor[1] );
    writeln( 'Processor[2]: ', Processor[2] );
    writeln( 'Processor[3]: ', Processor[3] );
    writeln;
end;

begin
    try
        Main;
    except
        on E: Exception do
            Writeln(E.ClassName, ': ', E.Message);
    end;
    ReadLn;
end.

// *** End of Delphi Program ***


// *** Begin of C/C++ Program ***


// ParallelDecodingCVersion.cpp : Defines the entry point for the console 
application.
//

// Full C version 0.01 created on 23 september 2015 by Skybuck Flying =D

#include "stdafx.h"

// Begin of Dummy Decoder Example

const int MaxProcessorCount = 4;

int Constrain( int Para )
{
    if (Para < 0) Para = 0;
    if (Para >= 1) Para = 1;
    return Para;
}

int _tmain(int argc, _TCHAR* argv[])
{
    // information stream, input
    int Stream[21+(MaxProcessorCount-1)]; // add max processor count to 
create a safe "padding" for reading so no out of bounds/range check errors 
with arrays.

    // bits representing fields of data
    int a1,a2,a3,a4;
    int b1,b2,b3;
    int c1;
    int d1,d2,d3,d4,d5,d6;

    // output
    int RowIndex;

    int RowCount;
    int RowLength[6];
    int RowOffset[6];
    int FieldRowMultiplier[4][6];

    int DataOffset;

    int FieldCount;
    int FieldLength[4];

    int Processor[4];

    // debug fields
    int FieldA;
    int FieldB;
    int FieldC;
    int FieldD;

    // input test 1
/*
    a1 = 1; a2 = 1; a3 = 1; a4 = 1;
    b1 = 1; b2 = 1; b3 = 1;
    c1 = 1;
    d1 = 1; d2 = 1; d3 = 1; d4 = 1; d5 = 1; d6 = 1;
*/

    // input test 2
    a1 = 1; a2 = 0; a3 = 0; a4 = 1;
    b1 = 1; b2 = 1; b3 = 0;
    c1 = 1;
    d1 = 1; d2 = 1; d3 = 0; d4 = 0; d5 = 1; d6 = 0;


    // compute input fields to compare it later with output fields
    FieldA = (a1) | (a2 << 1) | (a3 << 2) | (a4 << 3);
    FieldB = (b1) | (b2 << 1) | (b3 << 2);
    FieldC = (c1);
    FieldD = (d1) | (d2 << 1) | (d3 << 2) | (d4 << 3) | (d5 << 4) | (d6 << 
5);

    // print field values
    printf( "FieldD: %d \n", FieldD );
    printf( "FieldA: %d \n", FieldA );
    printf( "FieldB: %d \n", FieldB );
    printf( "FieldC: %d \n\n", FieldC );

    // number of rows
    Stream[0] = 6;

    // row lengths
    Stream[1] = 4;
    Stream[2] = 3;
    Stream[3] = 3;
    Stream[4] = 2;
    Stream[5] = 1;
    Stream[6] = 1;

    // sorted information stream:
    // d1a1b1c1d2a2b2d3a3b3d4a4d5d6

    // data bits
    Stream[7] = d1;
    Stream[8] = a1;
    Stream[9] = b1;
    Stream[10] = c1;
    Stream[11] = d2;
    Stream[12] = a2;
    Stream[13] = b2;
    Stream[14] = d3;
    Stream[15] = a3;
    Stream[16] = b3;
    Stream[17] = d4;
    Stream[18] = a4;
    Stream[19] = d5;
    Stream[20] = d6;

    // now the decoding algorithm:

    // determine number of rows
    RowCount = Stream[0];

    // extract row lengths
    RowLength[0] = Stream[1];
    RowLength[1] = Stream[2];
    RowLength[2] = Stream[3];
    RowLength[3] = Stream[4];
    RowLength[4] = Stream[5];
    RowLength[5] = Stream[6];

    // determine field count
    FieldCount = RowLength[0]; // row[0] indicates number of fields.

    // I will help out a bit... by leaving this code in ! ;) seems somewhat 
obvious ;)
    // first determine data offset properly ! ;) :) 1 for the row count +
    // RowCount to skip over row lengths.
    DataOffset = 1 + RowCount;

    RowOffset[0] = DataOffset;
    RowOffset[1] = RowOffset[0] + RowLength[0];
    RowOffset[2] = RowOffset[1] + RowLength[1];
    RowOffset[3] = RowOffset[2] + RowLength[2];
    RowOffset[4] = RowOffset[3] + RowLength[3];
    RowOffset[5] = RowOffset[4] + RowLength[4];

    // let's assume first field is largest so it always consumes all row 
information.
    // should be 0 if negative, should be zero if zero, should be 1 if 
positive so and will do the trick to constaint it.
    // the -0, -1, -2, -3 represents subtracting the processor 
number/identify from it... to allow parallel processing ! ;)

    // contraint is wrong... hahga unnt.
    FieldRowMultiplier[0][0] = Constrain(RowLength[0]-0); // shoudl be set 
to one if larger.
    FieldRowMultiplier[0][1] = Constrain(RowLength[1]-0);
    FieldRowMultiplier[0][2] = Constrain(RowLength[2]-0);
    FieldRowMultiplier[0][3] = Constrain(RowLength[3]-0);
    FieldRowMultiplier[0][4] = Constrain(RowLength[4]-0);
    FieldRowMultiplier[0][5] = Constrain(RowLength[5]-0);

    // now second field may consume less if there are not enough bits.
    FieldRowMultiplier[1][0] = Constrain(RowLength[0]-1);
    FieldRowMultiplier[1][1] = Constrain(RowLength[1]-1);
    FieldRowMultiplier[1][2] = Constrain(RowLength[2]-1);
    FieldRowMultiplier[1][3] = Constrain(RowLength[3]-1);
    FieldRowMultiplier[1][4] = Constrain(RowLength[4]-1);
    FieldRowMultiplier[1][5] = Constrain(RowLength[5]-1);

    // now third field may consume less if there are not enough bits.
    FieldRowMultiplier[2][0] = Constrain(RowLength[0]-2);
    FieldRowMultiplier[2][1] = Constrain(RowLength[1]-2);
    FieldRowMultiplier[2][2] = Constrain(RowLength[2]-2);
    FieldRowMultiplier[2][3] = Constrain(RowLength[3]-2);
    FieldRowMultiplier[2][4] = Constrain(RowLength[4]-2);
    FieldRowMultiplier[2][5] = Constrain(RowLength[5]-2);

    // now fourth field may consume less if there are not enough bits.
    FieldRowMultiplier[3][0] = Constrain(RowLength[0]-3);
    FieldRowMultiplier[3][1] = Constrain(RowLength[1]-3);
    FieldRowMultiplier[3][2] = Constrain(RowLength[2]-3);
    FieldRowMultiplier[3][3] = Constrain(RowLength[3]-3);
    FieldRowMultiplier[3][4] = Constrain(RowLength[4]-3);
    FieldRowMultiplier[3][5] = Constrain(RowLength[5]-3);


    // now compute field lengths

    // not necessary to multiply anything just add them up ! ;) =D
    FieldLength[0] =
        FieldRowMultiplier[0][0] +
        FieldRowMultiplier[0][1] +
        FieldRowMultiplier[0][2] +
        FieldRowMultiplier[0][3] +
        FieldRowMultiplier[0][4] +
        FieldRowMultiplier[0][5];

    FieldLength[1] =
        FieldRowMultiplier[1][0] +
        FieldRowMultiplier[1][1] +
        FieldRowMultiplier[1][2] +
        FieldRowMultiplier[1][3] +
        FieldRowMultiplier[1][4] +
        FieldRowMultiplier[1][5];

    FieldLength[2] =
        FieldRowMultiplier[2][0] +
        FieldRowMultiplier[2][1] +
        FieldRowMultiplier[2][2] +
        FieldRowMultiplier[2][3] +
        FieldRowMultiplier[2][4] +
        FieldRowMultiplier[2][5];

    FieldLength[3] =
        FieldRowMultiplier[3][0] +
        FieldRowMultiplier[3][1] +
        FieldRowMultiplier[3][2] +
        FieldRowMultiplier[3][3] +
        FieldRowMultiplier[3][4] +
        FieldRowMultiplier[3][5];

    // though the field multipliers could come in handy later to read the 
bits ! nice ! ;) =D

    // for each row the offset must be calculated this can be done serially 
or by all processors for themselfes at the same time:
    // row zero starts after the number of rows which is indicated by the 
first stream value =D

    // first determine data offset properly ! ;) :) 1 for the row count + 
RowCount to skip over row lengths.
    DataOffset = 1 + RowCount;

    RowOffset[0] = DataOffset;
    RowOffset[1] = RowOffset[0] + RowLength[0];
    RowOffset[2] = RowOffset[1] + RowLength[1];
    RowOffset[3] = RowOffset[2] + RowLength[2];
    RowOffset[4] = RowOffset[3] + RowLength[3];
    RowOffset[5] = RowOffset[4] + RowLength[4];

    // now calculate and/ore detemrine length of each field

    // first determine number of fields


    // now that all row offsets are calculated it's possible to decode the 
stream into each processor, by each processor.

    // each processor knows it's own number/identitiy represented by the +0 
+1 +2 +3 down below:
    // the general idea is:
    // row[] here is RowOffset[]
/*
    Processor[0] := 
Stream[Row[0]+0]+Stream[Row[1]+0]+Stream[Row[2]+0]+Stream[Row[3]+0]+Stream[Row[4]+0]+Stream[Row[5]+0];
    Processor[1] := 
Stream[Row[0]+1]+Stream[Row[1]+1]+Stream[Row[2]+1]+Stream[Row[3]+1]+Stream[Row[4]+1]+Stream[Row[5]+1];
    Processor[2] := 
Stream[Row[0]+2]+Stream[Row[1]+2]+Stream[Row[2]+2]+Stream[Row[3]+2]+Stream[Row[4]+2]+Stream[Row[5]+2];
    Processor[3] := 
Stream[Row[0]+3]+Stream[Row[1]+3]+Stream[Row[2]+3]+Stream[Row[3]+3]+Stream[Row[4]+3]+Stream[Row[5]+3];
*/
    // however a processor should only include the bits of a stream if it's 
within the field's length for that we need
    // to compute each field length and here we have an oops ! ;) :) cannot 
compute field length if multiple fields
    // per row. or can we ?! ;) :)perhaps we can... we know that fields have 
at least 1 bit because of row 0
    // and we know which fields have 2 bits because of row 2 and so forth... 
and since all fields
    // are ditributed parallely... we don't actually need to know where 
their bits are.... cause they all packed lol...
    // we only need to know what max length is or something... though how 
can we dan be sure that a field ends up
    // wehere it needs to be... well we dont... a field can end up in any 
processor.

    // so how should it actually look like then... well as follows:


    // since the fields are stored as follows: A,B,C,D we know A ends up in 
processor 0, B in 1, C in 2, D in 3 and so forth.
    // thus... processor 0 can determine length of A by looking at row[0], 
row[1], row[2], row[3], row[4], row[5], row[6].
    // but how to know which count matches to who's field ?
    // is the length of row 5 for A or B or C or D ? we don't know do we ?! 
;)

    // now we should be able to solve the problem... we know the bits belong 
to the first few fields... cool ! ;) =D

    // now we should be able to solve it easily... by only including a bit 
from the stream if the multiplier is set to 1 ;) :)
    // and now only thing left to do is shifting the bits into proper 
position and or-ing them together ! ;)

    Processor[0] =
        ((Stream[RowOffset[0]+0]*FieldRowMultiplier[0][0]) << 0) |
        ((Stream[RowOffset[1]+0]*FieldRowMultiplier[0][1]) << 1) |
        ((Stream[RowOffset[2]+0]*FieldRowMultiplier[0][2]) << 2) |
        ((Stream[RowOffset[3]+0]*FieldRowMultiplier[0][3]) << 3) |
        ((Stream[RowOffset[4]+0]*FieldRowMultiplier[0][4]) << 4) |
        ((Stream[RowOffset[5]+0]*FieldRowMultiplier[0][5]) << 5);

    Processor[1] =
        ((Stream[RowOffset[0]+1]*FieldRowMultiplier[1][0]) << 0) |
        ((Stream[RowOffset[1]+1]*FieldRowMultiplier[1][1]) << 1) |
        ((Stream[RowOffset[2]+1]*FieldRowMultiplier[1][2]) << 2) |
        ((Stream[RowOffset[3]+1]*FieldRowMultiplier[1][3]) << 3) |
        ((Stream[RowOffset[4]+1]*FieldRowMultiplier[1][4]) << 4) |
        ((Stream[RowOffset[5]+1]*FieldRowMultiplier[1][5]) << 5);

    Processor[2] =
        ((Stream[RowOffset[0]+2]*FieldRowMultiplier[2][0]) << 0) |
        ((Stream[RowOffset[1]+2]*FieldRowMultiplier[2][1]) << 1) |
        ((Stream[RowOffset[2]+2]*FieldRowMultiplier[2][2]) << 2) |
        ((Stream[RowOffset[3]+2]*FieldRowMultiplier[2][3]) << 3) |
        ((Stream[RowOffset[4]+2]*FieldRowMultiplier[2][4]) << 4) |
        ((Stream[RowOffset[5]+2]*FieldRowMultiplier[2][5]) << 5);

    Processor[3] =
        ((Stream[RowOffset[0]+3]*FieldRowMultiplier[3][0]) << 0) |
        ((Stream[RowOffset[1]+3]*FieldRowMultiplier[3][1]) << 1) |
        ((Stream[RowOffset[2]+3]*FieldRowMultiplier[3][2]) << 2) |
        ((Stream[RowOffset[3]+3]*FieldRowMultiplier[3][3]) << 3) |
        ((Stream[RowOffset[4]+3]*FieldRowMultiplier[3][4]) << 4) |
        ((Stream[RowOffset[5]+3]*FieldRowMultiplier[3][5]) << 5);

    // *** STILL TO DO (solved by extending array): 
********************************
    // *** ^^^ may have to look into potential out of range problem ^^^ **** 
// solved by adding padding for reading ;)
    // *********************************************************************
    // for now it seems ok, also as long as stream array has 
+NumberOfProcessors scratch pad at end it may be ok ! ;) :)


    // one last problem might remain... the row offset may go out of 
range...
    // we could either use a scratch pad|solve this in another way.
    // I think it's best to leave it as as and perhaps make the stream a bit 
larger or omething let's see what happens.

    // print processor values.
    printf( "Processor[0]: %d \n", Processor[0] );
    printf( "Processor[1]: %d \n", Processor[1] );
    printf( "Processor[2]: %d \n", Processor[2] );
    printf( "Processor[3]: %d \n\n", Processor[3] );

    return 0;
}

// *** Emd of C/C++ Program ***

Bye,
  Skybuck.



More information about the Python-list mailing list