[Tutor] fake defrag revisited

Sat Oct 1 13:48:21 CEST 2011

On 10/01/2011 02:06 AM, R. Alan Monroe wrote:
> I'm revisiting the fake defrag program I posted about a few months
> ago. The concept is basically a screensaver or light show where you
> can enjoy watching entropy being reversed as colored blocks order
> themselves visually.
>
> I set it aside for a while because it was too slow, but I finally came
> up with a better algorithm for the simulated file creation&  deletion.
> So I can throroughly scramble my imaginary files.
>
> Now I want to come up with a simulated defragging, but I wish it for
> cosmetic reasons to visit the various areas of the drive in a way that
> appears random, to make it less boring.
>
> The fake "drive" is 64 blocks (shown as 8x8 grid onscreen) for test
> purposes.
>
> The files as-is (fragged):
> freelist:[1, 3, 6, 7, 9, 10, 11, 13, 14, 15, 16, 18, 22, 23, 25, 26, 27, 28, 30, 31, 32, 33, 34, 36, 38, 39, 40, 41, 42, 43, 44, 46, 47, 49, 52, 57, 58, 59, 60, 61, 62, 63]
> 1000: [45, 0]
> 1002: [56, 19, 5, 35, 8]
> 1014: [21, 54, 2, 20, 53, 17, 12, 4, 37, 48]
> 1013: [50, 51, 55, 24, 29]
>
> I can predict the arrangement of the files in the ideal end state
> (defragged) by sorting the filenames in ascending order then assign
> them blocks based on an incrementing number and the files' known
> sizes:
> freelist: [22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
> 1000: [0, 1]
> 1002: [2, 3, 4, 5, 6]
> 1013: [7, 8, 9, 10, 11]
> 1014: [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
>
>
> Which gives this list of before and after blocks: i.e. block 0 is
> destined to live at 1 after defrag, block 1 is destined to live at 22,
> etc.
>
> [(0, 1), (1, 22), (2, 14), (3, 23), (4, 19), (5, 4), (6, 24), (7, 25),
> (8, 6), (9, 26), (10, 27), (11, 28), (12, 18), (13, 29), (14, 30),
> (15, 31), (16, 32), (17, 17), (18, 33), (19, 3), (20, 15), (21, 12),
> (22, 34), (23, 35), (24, 10), (25, 36), (26, 37), (27, 38 ), (28, 39),
> (29, 11), (30, 40), (31, 41), (32, 42), (33, 43), (34, 44), (35, 5),
> (36, 45), (37, 20), (38, 46), (39, 47), (40, 48), (41, 49), (42, 50),
> (43, 51), (44, 52), (45, 0), (46, 53), (47, 54), (48, 21), (49, 55),
> (50, 7), (51, 8), (52, 56), (53, 16), (54, 13), (55, 9), (56, 2), (57,
> 57), (58, 58), (59, 59), (60, 60), (61, 61), (62, 62), (63, 63)]
>
> I initially thought I could just do a random.shuffle on this list to
> achieve the cosmetic randomness, until I realized the real problem is
> magically determining the correct sequence in which to perform the
> moves without ruining a future move inadverently.
>
> If I move 0-to-1 first, I've now ruined the future 1-to-22 which ought
> to have taken place in advance.
>
Not true.  In most sorts, data is moved multiple times.  The only real 
constraint on the moves is the end point, which is a sorted form.
> Is there a deterministic-yet-seemingly-random algorithm out there
> whose name I wasn't aware of to be able to google it?
>
> Alan
>
I'm assuming the purpose is NOT to produce a good (ie. fast) algorithm 
for the actual moving of data on a disk drive, but rather to produce a 
pretty display. For one thing, you've pre-defined what order the free 
space blocks are going to be, even though that doesn't matter.

For a deterministic algorithm, simply sort that list of tuples, based on 
the second item.  The sort will swap two tuples each time, and they will 
gradually become more ordered.  Whether this will appear visually random 
depends partly on what your initial order is, and on which sort 
algorithm you implement.  If you use the builtin sort (by supplying your 
own compare function callback), you can add the visual swap each time 
your comparator will return true.  Or you could use a bubble sort, which 
won't generally look as random.

For another, non-deterministic approach, since you've already determined 
the final order, start by randomly choosing a cell.  If the cell is in 
the right place, continue picking randomly until you've found a cell 
that's in the wrong place.  Then swap it with its destination location, 
which clearly is in the wrong place.  Now if that one isn't in the right 
place (after swapping), do another swap to the location it belongs.  
Repeat until the swap makes both ends correct.  At that point, check the 
count of how many cells have ended up in the correct place, and if there 
are still more, loop back to the random function.

You could make that deterministic by replacing the random with a simple 
linear search.  In that case, you're done when that linear search hits 
the end of the list.  But of course it wouldn't look as random to the 
end user.

If you are willing to eliminate deterministic entirely, then randomly 
pick two cells, calculate the distance each is from its correct 
location, and swap them if the sum of the distances would improve with 
the swap.  On this one it'd be tricky to directly know when to quit, so 
you could do a linear check for correctness every N loops through the 
random code.  I think this'd make a pretty display, probably because 
it's not optimal.

-- 

DaveA