# [Tutor] fake defrag revisited

Dave Angel d at davea.name
Wed Oct 5 08:15:49 CEST 2011

```On 10/01/2011 03:53 PM, R. Alan Monroe wrote:
>
>> You missed the rest of the paragraph.  Don't wait for the sort to
>> finish, you do the swapping in the compare function.
>
> Either I'm too dumb to understand your answer, or explained my
> original problem too poorly.
>
> Have you ever played the Rush Hour game where you're moving little
> cars around a parking lot to let the goal car escape?
>
> http://www.thinkfun.com/sites/default/files/images/product_library/rushhour/new/RushH-5000-LoResSpill.jpg?1294204060
>
> http://www.thinkfun.com/sites/default/files/images/product_howtoplay/rush-hour.jpg?1290652077
>
> i.e. you can't move the goal car until another car is out of the way,
> but you can't move THAT car until a third car is out of the way, etc.
>
> That's the problem I'm facing, and I don't think a sort will solve it.
> I'm wondering now if it will have to be more of a recursive
> backtracking thing, that will reject a sequence of moves that would
> have overwritten an as-yet-unmoved piece.
>
> I'm using 2.x, and yeah, I have used cmp lambda functions before. It's
> how I got the sorted-by-second-element list in my followup email.
>
> (requires pygame)
> http://javajack.dynalias.net/defrag/defrag008.py
>
> Alan
>

I never saw this message, since you didn't CC it to the python-tutor group.

The problem with recursive backtracking is that you have to be sure the
maximum recursion depth is manageable.

Don't use a lambda function, use a real def function.  (Lambdas are just
expressions, so they can't contain real if statements.)  Your function
will be called by the sort to compare two items, and if they should be
swapped, the sort will swap them after you return.  So you do the swap
on your display to match what you know the sort is going to do.

Since all the moves are swaps, the sequence is guaranteed to converge
on the correct final order.  Will it use the minimum number of moves?
Definitely not.  But that wasn't a requirement, and if it were, you
wouldn't start by building that list of tuples.

I believe producing a fake cmp-function callback is the minimum code to
solve the problem, but not the minimum thinking.  But if you have
different constraints, tell us about them.  If you want speed, just
update the display to its final state.

If you're trying to realistically simulate an actual defrag program,
then design that program.  But such a program might not have enough ram
to build the table you're assuming, where you know before starting what
the final ordering will be.

If I were writing a real defragger, I'd have other constraints.  It'd
have to work with huge disks and relatively small memory.  And it
would have the constraint that if it crashed part way through, at
most one file or metafile would be in an invalid state, and that file
would be readily recovered by a utility that you run afterwards.  If you
think that's too restrictive, because you should do a backup first, then
my defragger is easy.  Backup, erase the entire drive, and restore.
Fast and easy.  But notice it required 50% free space, since the backup
media had to be big enough to hold it all.

I can come up with several other algorithms, depending on assumptions.
For example, you examine a pair of blocks, and if the sum of the
absolute distances they are from their targets would decrease by
swapping, then do so.  Otherwise, leave them alone and pick a
different pair.
Independently keep track of the sum of all those distances.  When that
sum reaches zero, you're done.  Now the only trick is systematically
picking pairs to process.

Another approach: start by moving all the files upward, until all the free space
is at the beginning.  Then sort the files by size, reversed, and move
them towards the beginning.  As long as there is enough free space to
store the biggest file, this will converge in one double pass.
Otherwise, it must be repeated.  The trick is that you define the disk as
just a little smaller next pass, starting wherever the first file didn't
fit.  So after this next pass, about twice as much of that first file
will be contiguous.  So the number of passes is no more than max file
size divided by free space.

--

DaveA
```
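The recursion-depth worry Dave raises can be made concrete. A small sketch, assuming modern CPython (the thread itself predates Python 3, where the exception is `RuntimeError` instead of `RecursionError`); the default frame limit is typically 1000, which a backtracking search over many blocks could plausibly exceed:

```python
import sys

# CPython caps recursion depth to protect the C stack; the default
# limit is usually 1000 frames. A backtracking defragger recursing
# once per pending block could hit this on a large "disk".
print(sys.getrecursionlimit())

def depth(n=0):
    """Measure how deep we can actually recurse from here."""
    try:
        return depth(n + 1)
    except RecursionError:
        return n

print("usable depth here:", depth())
```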
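The swap-in-the-compare idea Dave describes can be sketched as follows. Note that Python's built-in `list.sort` (Timsort) rearranges items in ways a cmp function cannot observe, so this sketch uses an explicit bubble sort, where "compare says swap" really does mean the two neighbours get exchanged; all names here are invented for illustration:

```python
# A hand-written sort whose compare callback can safely mirror each
# impending swap onto a display, because a positive return value
# guarantees exactly one neighbour swap happens next.

def bubble_sort_with_cmp(blocks, cmp):
    """Sort `blocks` in place; call cmp(i, a, b) on neighbours and
    swap them whenever it returns a positive number."""
    n = len(blocks)
    for limit in range(n - 1, 0, -1):
        for i in range(limit):
            if cmp(i, blocks[i], blocks[i + 1]) > 0:
                blocks[i], blocks[i + 1] = blocks[i + 1], blocks[i]

display = []                  # stand-in for the pygame surface

def my_cmp(i, a, b):
    if a > b:
        display.append(('swap', i, i + 1))   # animate before returning
        return 1
    return -1

disk = [3, 1, 4, 1, 5]
bubble_sort_with_cmp(disk, my_cmp)
print(disk)                   # [1, 1, 3, 4, 5]
```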
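The pair-swapping algorithm based on the summed distances can be sketched as below, under the simplifying assumption that each slot holds exactly one block and `blocks[i]` is the slot where the block currently at slot `i` belongs. A real defragger would need a fallback if no single swap strictly improves the sum; this naive scan just stops in that case:

```python
# Swap a pair only when it strictly reduces the total absolute
# displacement; when that sum reaches zero, every block is home.

def total_distance(blocks):
    return sum(abs(i - target) for i, target in enumerate(blocks))

def greedy_swap_defrag(blocks):
    swaps = 0
    improved = True
    while improved and total_distance(blocks) > 0:
        improved = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                before = abs(i - blocks[i]) + abs(j - blocks[j])
                after = abs(i - blocks[j]) + abs(j - blocks[i])
                if after < before:
                    blocks[i], blocks[j] = blocks[j], blocks[i]
                    swaps += 1
                    improved = True
    return swaps

layout = [4, 2, 0, 3, 1]      # block at slot 0 belongs in slot 4, etc.
greedy_swap_defrag(layout)
print(layout, total_distance(layout))   # [0, 1, 2, 3, 4] 0
```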
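The pass-count bound for the compact-then-place-largest-first approach can be checked back-of-envelope: each double pass makes roughly `free_space` more blocks of the biggest file contiguous, so the number of passes is at most the max file size divided by the free space, rounded up. The numbers below are made up for illustration:

```python
import math

def max_passes(biggest_file_blocks, free_blocks):
    """Upper bound on double passes, per the argument above."""
    return math.ceil(biggest_file_blocks / free_blocks)

print(max_passes(35, 10))   # 4 double passes at most
print(max_passes(8, 10))    # 1: the biggest file fits in free space
```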