[Mailman-Developers] Plea for Help

Larry McVoy lm@bitmover.com
Wed, 13 Oct 1999 19:15:12 -0700


On Wed, Oct 13, 1999 at 06:40:15PM -0700, J C Lawrence wrote:
> On Wed, 13 Oct 1999 18:40:53 -0400 (EDT) 
> Barry A Warsaw <bwarsaw@cnri.reston.va.us> wrote:
> 
> > Thanks for the info JC.  The BK web pages seem pretty out of date
> > though, and I didn't see anything about downloads.  Interestingly
> > enough I also didn't see any links to the supposedly public free
> > software change logs :)
> 
> BK is not released *yet*.  Access is currently restricted to people
> who sign the beta agreement etc.  BK should be released fairly soon
> tho (Larry: What's the current predict?)

We are trying to do one last beta starting around this Friday and ship 
1.0 on or close to Nov 1.  LODs will not be complete, but it is way past
the point where it is useful enough that it is more or less a crime to
keep it hidden.  Just my opinion :-)

More seriously, Zack Weinberg, one of the people who work here on BK, has been
pointing out that we should have shipped a month ago, around the time that
VA started saying "No more betas, this works".  The reason being that while
we are busy polishing, there are some bugs that just aren't going to be
exposed until we let it go out to the masses.  I'm starting to agree.

> > Everything else sounds interesting.  I'd like to come up with a
> > solution that would eventually translate to all the projects we've
> > got here (JPython and Python) so I want to investigate Aegis, and
> > get Guido's feedback on things as well.
> 
> Please note that I'm explicitly biased here.  I've used and don't
> like CVS.  I'm aware of Aegis only intellectually.  Aegis follows a
> somewhat similar design model to BK, but with a much more
> overweening and dictatorial (you will do things our way or you will
> not do anything so NYAH!) implementation.  Compared to the various
> SCM tools I've used at HP, IBM, SGI, etc I'm damn near in love with
> BK, so no brownie points for guessing which I'd recommend.

Aegis has a bunch of problems, which you'll run into when you try it. 
No NT support and as much as I despise NT, it is as much of a player as
Solaris, for example.  A bunch of other issues, most of which stem from 
one problem: Aegis is a layer on top of the revision system, it can sit
on SCCS or RCS or whatever.  That means that it can only do as well as 
the lowest common denominator of all of those systems.  Which isn't
a good place.

We rewrote all of SCCS from scratch and our implementation is
substantially better than RCS or the other SCCS implementations out there.
And we use all of the power of SCCS.  SCCS has a bad name, Tichy did
a great job badmouthing it when he did RCS, but the fact is that SCCS
is a far better file format than RCS.  It did have some problems, such
as no tag support, no permissions support, no pathname support, etc.,
but we fixed all of those (in a way which is backwards compat - if you
are a Sun machine, you can use Sun's SCCS to read/write our files).

There are some non-obvious things that SCCS can do that we are planning
to use extensively in BK for LOD support.  This is probably the feature
you will want the most over the long run.  Imagine multiple branches,
like the dev, stable, released branches.  Imagine being able to be
in a visual tool like http://www.bitkeeper.com/sccstool.html , and
being able to see all the changesets (a changeset is a lot like what
you might think of as a patch, just more formal).  Now imagine saying
"start up and show me the stable branch but hide the release branch,
and hide everything that is in the dev branch that I haven't explicitly
included or excluded into the stable branch".  The world just became a
less cluttered space.   Now imagine being able to browse each changeset
with http://www.bitkeeper.com/csettool.html and deciding which ones you
want and which ones you don't.   You can drag and drop the ones you want
onto the stable branch and they show up there.  And the next time you
run this setup, those changes are now hidden in the dev branch.

Does that make any sense?  It's sort of graphical branch management
which can be arbitrarily complex, with an arbitrary number of branches.
(well almost arbitrary, we currently limit you to 65535 branches :-)

The disk space required for all of those branches is proportional to the
patch size.  Another way to say this:  suppose you had 10 branches and you
had a 1GB change which lived in the outermost branch, then got included
into each inner branch, one at a time.  In RCS, you'd have a 10GB file.
In BK, you have a 1GB file plus about 2K of meta data.
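
The arithmetic above can be sketched roughly as follows.  This is only an
illustration of the two storage models: the function names and the figure of
about 200 bytes of metadata per branch (so roughly 2K over 10 branches) are
assumptions, not BK internals.

```python
GB = 1024 ** 3

def copy_per_branch_bytes(change_bytes, branches):
    """Storage if every branch keeps its own full copy of the change."""
    return change_bytes * branches

def store_once_bytes(change_bytes, branches, meta_per_branch=200):
    """Storage if the change is kept once and each branch only records a
    small 'included' marker (meta_per_branch is an assumed figure)."""
    return change_bytes + branches * meta_per_branch

copied = copy_per_branch_bytes(1 * GB, 10)
shared = store_once_bytes(1 * GB, 10)

print(copied // GB, "GB when the change is copied into each branch")
print(shared - 1 * GB, "bytes of metadata on top of 1 GB when stored once")
```

The point is that the second model grows with the number of branches only by
the size of the bookkeeping, not by the size of the change.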

It's cool stuff like that that we can do because we took the time to write
a good revision engine.  We looked hard at RCS and came to the conclusion
that it was just profoundly broken for the things that we wanted to do and
we couldn't fix it.

> Absolutely.  Some of the things I particularly like about BK:
> 
>   -- Cross platform.  It is identical under Linux, Solaris, IRIX,
> Windows NT, etc.  

Including ssh support - we got ssh server/client to work on NT.  That was
the last remaining issue.  Other than stuff like group/world permissions
and symlinks (which we support, you can have a file that was at different
times in its life a symlink, a regular file, a revision controlled MDBM
file), the NT world is identical to the Unix world.  Whoops - not true,
we also support integration with the Visual Studio GUI so the NT version
actually has that over the Unix stuff.

>   -- Resolves 95% of the three way merge problem.

Man you ain't seen nothing yet.  We figured out how to handle the two
branch, merge only one way (old -> new, but not the other way), and not
have the 3 way diff go to the ever more distant GCA.  We can treat the
last merge point as a legitimate GCA.  And we save enough information
that we can find this closer GCA.  I went through and tried it both
ways on all the merges in the BK source tree itself.  On average 9/10
of merge conflicts went away.  This is amazingly cool, you have to see
it to believe it.
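
A minimal sketch of the idea, assuming revisions are stored as a graph of
parent links (the node names and helper functions here are hypothetical, not
BK's actual data structures).  If a merge from the old branch into the new
branch is recorded as an extra parent link, the closest common ancestor of the
two tips becomes the last merge point instead of the original branch point.

```python
def ancestors(node, parents):
    """Return node plus everything reachable by following parent links."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, []))
    return seen

def closest_common_ancestor(a, b, parents):
    """Pick the common ancestor that has the most ancestors of its own,
    i.e. the one nearest to the two tips."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return max(common, key=lambda n: len(ancestors(n, parents)))

# Old branch: o1 -> o2 -> o3 -> o4.  New branch forks from o1: n1 -> n2 -> n3.
no_merge_info = {
    "o1": [], "o2": ["o1"], "o3": ["o2"], "o4": ["o3"],
    "n1": ["o1"], "n2": ["n1"], "n3": ["n2"],
}
# Same history, but n2 also records that o3 was merged into it.
with_merge_info = dict(no_merge_info, n2=["n1", "o3"])

print(closest_common_ancestor("o4", "n3", no_merge_info))    # o1
print(closest_common_ancestor("o4", "n3", with_merge_info))  # o3
```

With the merge recorded, a three-way diff only has to cover what changed since
o3 rather than everything since o1.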

> Some comments I wrote offlist comparing CVS and BK:
> 
>   -- BK removes 95% of the merge problem.  That's enough alone to
> sell it to me.  BK's merge tools are not perfect, but they're a
> sight better than anything else I've seen.

You only have the two-way merge stuff.  We have a somewhat broken three-way merge
tool that you'll get in the next beta.

>   -- BK can be run over SSH.  CVS can run pserver connections over
> an SSH port forward, but it's a pain (I've done it).  CVS over SSH
> performance also sucks.  BK over SSH is quite nippy.

Yeah it is.  I'm supporting the Linux/PPC folks and I have to resync to 
a machine in bum-f*ck New Mexico that has the lossiest T1 line I've ever 
had the misfortune to encounter.  BK rocks for that.  It transfers close
to the minimum that must be transferred.  I'm really tickled with this, it is
completely reasonable to have remote modem access to huge projects.  It 
works.  It never plows through the remote tree to do a resync because we
only resync changesets so all we have to look at is the ChangeSet log to 
see what needs to come across the wire.  And we are currently lazy about 
that and transfer more data than we really need to.  But since that
overhead is in the Kbyte area, we haven't gotten too excited about fixing it.
We know how and will some day.
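
The "only look at the ChangeSet log" step can be sketched like this.  It
assumes each changeset has some unique ID string and that both sides can list
the IDs they already hold; the function name and the ID values are made up for
illustration and do not model BK's real log format.

```python
def changesets_to_send(local_log, remote_log):
    """Return the changesets in local_log that remote_log lacks,
    keeping the local order."""
    have = set(remote_log)
    return [cs for cs in local_log if cs not in have]

local_log = ["cs1", "cs2", "cs3", "cs4"]
remote_log = ["cs1", "cs2"]

print(changesets_to_send(local_log, remote_log))  # ['cs3', 'cs4']
```

Only the two logs are compared; no file in either tree has to be opened to
decide what needs to cross the wire.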

>   -- BK handles binary files transparently.  CVS doesn't even
> pretend to.  While BK's handling is not great (it checks in the
> UUencoded copy), it does work.

Yeah, this sucks, in my opinion, in spite of your nice words.  I could at
the very least gzip and uuencode it (that is a supported mode, it just
isn't the default).
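
For illustration, a gzip-then-uuencode round trip can be sketched with the
Python standard library.  This is not BK's on-disk format, and the function
names are hypothetical; it only shows the general technique of compressing
first and then encoding the result as printable text.

```python
import binascii
import gzip

def gzip_uuencode(data: bytes) -> str:
    """Compress data, then uuencode it 45 bytes per line (the most that
    binascii.b2a_uu accepts in one call)."""
    compressed = gzip.compress(data)
    lines = [
        binascii.b2a_uu(compressed[i:i + 45]).decode("ascii")
        for i in range(0, len(compressed), 45)
    ]
    return "".join(lines)

def gunzip_uudecode(text: str) -> bytes:
    """Reverse gzip_uuencode: decode each line, then decompress."""
    compressed = b"".join(
        binascii.a2b_uu(line) for line in text.splitlines(keepends=True)
    )
    return gzip.decompress(compressed)

original = bytes(range(256)) * 4
encoded = gzip_uuencode(original)
assert gunzip_uudecode(encoded) == original
```

Compressing before encoding matters because uuencoding alone inflates the
data by roughly a third.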

But the real answer is to have binary specific "plugins" which know how to
break apart binaries so you can get meaningful diffs.  

Binaries are a pain and our support of them, while not horrible, is not anywhere
near acceptable in my opinion.  Another long term thing to fix.

>   -- BK handles branches fairly well, sorta.  I'm still waiting for
> Larry's promised LinesOfDevelopment code so I can get a handle on
> what he's trying to do there.  If it is what I understand it will be
> clean and fairly neat.  CVS OTOH is just painful once you have
> anything more than a single branch.

The current branch support in BK sucks dog doo.  It's nonexistent, just basic
branches with no way to manage them.  That's not branch support.  The LOD stuff
will make your heart go flip-flop though.  Yeah, it's all vaporware as far as
you are concerned (the first implementation just went into our dev tree), but
I swear to you that this will rock your world.   I have to get this to work
right or Linus won't even consider using BK and I want it for my own tree.
It's the #1 feature on the priority list.

Anyway, that's a lot of salesman blather.  Let me know if you have specific
questions.  I'll go check out mailman in the mean time.
-- 
---
Larry McVoy            	   lm@bitmover.com           http://www.bitmover.com/lm