83 What is the history of Revision Control and Configuration Management?

This article is from the Configuration Management Tools FAQ, by Dave Eaton dwe@arde.com with numerous contributions by others.

This subject would take more room than is possible in the FAQ. An
abbreviated, though still rather lengthy, summary of recollections
from many contributors on the newsgroup is provided here for
reference.

As soon as "software" began being created, there was a need to
change it. The first "configuration management" was done manually.
(Have you ever saved a patch-panel board for use and comparison
later?)

As binary computers and their software grew, tools began to be
created to help manage the software and the changes to it. On the
mainframes, revision control systems were used early on as update
systems which typically combined manual editing plus revision
control plus some CM. Another branch was the hardware CM systems,
basically fancy bill of materials systems. A third branch of CM
consisted of manual and semi-automated systems based on mil-specs. A fourth
branch of CM consists of the UNIX tool set utilities and their
clones.

In early cases, the source or binary of a program was typed on
typewriter-style machines and stored on physical media such as
punched paper tape and punched cards (yes, this was pre-video and
pre-magnetic media days - no file systems). Frequently there were
methods of punching leader or lead cards with patterns which could
be recognized and read by humans to identify the program and its
revision number or date by looking at the tape or card deck.
Complete copies of the paper tape or card decks were kept to enable
developers to return and maintain earlier versions. "Golden
releases" consisted of punching mylar tape rather than paper tape.
(Of course the mylar tape didn't get out of order if dropped and
wasn't erased by being placed near a magnet.)

As technology advanced, the physical media migrated to magnetic
media. Reels of tape were archived. The advent of smaller media
with larger capacity gave rise to the "floppy in the drawer" method
of version control, but version control was still manual in many
development shops.

The early software configuration management process was also
manual. The "checkout" process often consisted of writing the
developer's name on a paper or blackboard next to the module name.
"Checkin" was accomplished by erasing the name. A more "modern"
manual process used items such as colored map pins in a cork board.
Each developer was assigned a pin color and their pin was placed in
successive boxes beside each module's name to track who had
rights to edit, load, and test a particular software module through
its development cycle.

(Aren't we glad we have tools that can do these tasks for us
today?)

In the late 1960's and early 1970's, Professor Leon Presser at the
University of California, Santa Barbara did a thesis on change and
configuration control. This concept was a response to a contract he
was working on with a defense contractor who made aircraft engines
for the Navy. As you can guess, the Air Force also wanted to
purchase that "exact" same engine, plus or minus about 14 million
modifications.

This requirement eventually grew into a commercially available
product in 1975 called Change and Configuration Control (CCC) which
was sold by the SoftTool corporation.

The mainframe update systems, of which IBM's IEBUPDTE and CDC's
Update were the most important, accepted as input update decks (all
of these systems were card based) which were basically difference
sets, i.e., edit decks that said to insert code, delete code, and
replace code. (Line editors date back a ways but it wasn't until
the 1970's that they were integrated into the CM cycle.) A key
distinction between these systems and the SCCS/RCS style systems is
that the update sets always referenced insert and delete points in
terms of record identifiers (which did not change from version to
version) rather than line numbers as in file differencing systems.
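
As a rough illustration of the record-identifier idea, the sketch below
models a deck as (identifier, text) pairs and applies directives that
name their target by identifier. It is written in Python for
readability only. The function name apply_update, the directive tuples,
and identifiers such as A.1 and B.1 are hypothetical and do not
reproduce the control-card syntax of any real update system.

```python
# Hypothetical model of a card-deck update system. The directive names
# ("insert", "delete", "replace") and record ids are illustrative only.

def apply_update(deck, directives):
    """Apply directives to a deck of (record_id, text) pairs.

    Each directive is (operation, target_id, new_records). Targets are
    found by identifier, never by position, so one edit cannot shift
    the target of another.
    """
    result = list(deck)  # work on a copy; the base deck is preserved
    for op, target, new_records in directives:
        # Locate the target record by its identifier.
        index = next(i for i, (rid, _) in enumerate(result) if rid == target)
        if op == "insert":       # add new records after the target
            result[index + 1:index + 1] = new_records
        elif op == "delete":     # drop the target record
            del result[index]
        elif op == "replace":    # swap the target for new records
            result[index:index + 1] = new_records
        else:
            raise ValueError(f"unknown directive: {op}")
    return result


base = [("A.1", "      PROGRAM DEMO"),
        ("A.2", "      PRINT *, 'HELLO'"),
        ("A.3", "      END")]

update = [("insert", "A.1", [("B.1", "      INTEGER I"),
                             ("B.2", "      PRINT *, 'HI'")]),
          ("delete", "A.2", [])]

new_deck = apply_update(base, update)
print([rid for rid, _ in new_deck])  # ['A.1', 'B.1', 'B.2', 'A.3']
```

In this example the two directives give the same deck in either order,
because each one names the record it acts on. Edits addressed by line
number do not have that property.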

Similar change code schemes were used for other systems in the
1970's to regenerate paper tape sources based upon the line or
record number where the change was required. The new paper tape
would then be read into the assembler or compiler to create the
binary and saved as the next "version" in the cycle.

By 1970, CDC update was an advanced product. IBM UPDATE was much
more primitive. Columns 73-80 were used for holding sequence
numbers; you could only insert between sequence numbers. It appears
to have started as a deck patch system dating back before the 7090;
we are talking early 1960's or even late 1950's here. The later
versions had a hierarchy of control; a control deck could specify
which updates were to be applied to which decks. In turn control
decks could select other control decks.

IBM's system was fairly clearly derived from patching (the practice
later automated by tools such as the UNIX patch program), which was a
common thing to do in early years, both to source code and
(perversely) to object code.

The most sophisticated of these early systems was CDC's update
which combined revision control, change sets, preprocessor
directives, and build management into one package, albeit with a
heavy FORTRAN slant. (The system continued on for quite some time
and eventually incorporated file differencing for delta
generation.)

There have been quite a variety of build managers. The venerable
"make" dates back to the early 1970's. Concurrent with "make" were
a number of quasi-expert build managers that were more or less
tailored to specific operating systems. These systems tended to
rely on knowledge of system conventions rather than description
files and were much more convenient than "make". Thus in IBM's
VM/CMS and in TOPS-20 one could simply issue a link command (or
equivalent) and the linker could figure out which files had to be
compiled and linked. The general weak point of these systems was
their OS and environment dependence. A specific weak point is that
they preceded the spread of "include directives" which make the
build management problem more complex.

One of the functions of CM is version archiving. Such systems also
have a long history, both in the mainframe world and in the
minicomputer world. The mainframe products, e.g., Panvalet, tended
to be more sophisticated in the early years but by and large did
not keep up with the times.

The UNIX branch is the source of most of the current commercial CM
tools, most of which got started in the 1980's. A notable exception
is CCC which started out as a mainframe CM product. The predecessor
to TrueChange started out as a cross-platform minicomputer CM
product.

The free UNIX line of tools began in the mid 1970's and includes
SCCS [Roc75], RCS [Tic82], and CVS. SCCS and RCS are file versioning
systems; as such they are utilities in a CM system. At a minimum a
CM system has to manage collections of files. CVS was later
extended to include more of the functions required of a CM system,
though not all.

Basically SCCS interleaves directives (delta identifiers and insert
and delete directives) in with the code. There are no absolute
identifiers as such, but they are deducible. CDC's Update, by contrast,
identified a record directly by its originating cset (the term goes
back to them) and the offset within that cset (i.e., foo.100 was the
100th record inserted by foo).

SCCS directives have to be nested within the file, i.e., a delete
segment cannot span inserts by different deltas but instead has to
be broken up into different delete segments.

The main point is that file differencing itself is line-number
oriented, which is a major limitation on using diff/patch. However, a
version control utility which uses a file differencing utility can
translate the line numbers into absolute identifiers or their
equivalent.
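
One way such a translation could work is sketched below in Python. It
assumes the old version is stored as (identifier, text) pairs, uses the
standard difflib module as the file differencing utility, and names
new records in the cset.offset style described above. The function name
carry_identifiers and the delta names are hypothetical; this is not how
any particular tool does it.

```python
import difflib


def carry_identifiers(old_records, new_lines, delta_name):
    """Give every line of new_lines a stable identifier.

    Lines the differ reports as unchanged keep their old identifier.
    Lines it reports as inserted or replaced get a fresh identifier of
    the form "<delta_name>.<n>".
    """
    old_lines = [text for _, text in old_records]
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines,
                                      autojunk=False)
    result = []
    inserted = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Positions may have shifted, but identifiers carry over.
            result.extend(old_records[i1:i2])
        elif tag in ("replace", "insert"):
            for line in new_lines[j1:j2]:
                inserted += 1
                result.append((f"{delta_name}.{inserted}", line))
        # "delete": the old records simply do not appear in the result.
    return result


old = [("base.1", "a"), ("base.2", "b"), ("base.3", "c")]
print(carry_identifiers(old, ["a", "x", "c"], "foo"))
# [('base.1', 'a'), ('foo.1', 'x'), ('base.3', 'c')]
```

The line numbers reported by the differ are used only inside the
function. What gets stored is the identifier, so later deltas can refer
to base.3 no matter how many lines are added or removed above it.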

You can do delta selection in SCCS, but the procedure can be
incredibly cumbersome and error-prone.

RCS is a good revisioning engine, but it has limitations when used as
the basis of a change-based system. When code is created on a branch
and then merged to the trunk, the new source is replicated in the
trunk delta instead of being reused, as it is in ADC and SCCS.

ClearCase's parent product was the early 1980's tool DSEE (Domain
Software Engineering Environment) from Apollo Computer. Unlike many
other tools of its day, DSEE used an interspersed delta file to
hold all versions in a single file. Rather than compute and apply
difference directives one after another to determine a particular
version, it made a single pass through the file and delivered the
correct lines to the requesting process. By the mid 1980's DSEE had
build management capabilities that included automatic dispatching
of component builds to remote machines on the network so that a
complete software subsystem could be created in parallel from a
single user command without modification to the build directives
(known as a model).
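
The single-pass idea behind an interspersed (interleaved) delta file can
be sketched as follows. This Python model is a deliberate
simplification: each stored line records which delta inserted it and
which deltas deleted it, and a version is just a set of delta numbers.
The tuple layout and the name extract are hypothetical; the real DSEE
and SCCS file formats differ.

```python
def extract(weave, deltas):
    """Return the lines of the version made up of the given deltas.

    weave is a list of (inserted_by, deleted_by, text) tuples holding
    every line of every version in file order. One pass over the list
    is enough: a line is emitted if its inserting delta is selected and
    none of its deleting deltas are.
    """
    selected = set(deltas)
    return [text
            for inserted_by, deleted_by, text in weave
            if inserted_by in selected and not (selected & set(deleted_by))]


# Delta 1 created the file, delta 2 replaced one line, and delta 3
# added a line.
weave = [
    (1, (),   "begin"),
    (1, (2,), "print hello"),
    (2, (),   "print hello, world"),
    (3, (),   "print goodbye"),
    (1, (),   "end"),
]

print(extract(weave, {1}))        # ['begin', 'print hello', 'end']
print(extract(weave, {1, 2, 3}))
print(extract(weave, {1, 3}))     # delta 3 without delta 2
```

The cost of retrieving any version is one scan of the file, regardless
of how many deltas separate it from the first or the latest version.
The last call shows that the same structure permits delta selection,
here taking delta 3 without delta 2.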

One of the things that a CM system has to handle is the
specification of a file set, i.e., a collection of files, each with
its own version. An early example of a system for doing this was
DEC's CMS, which grouped versions of files into classes. (CMS was
basically an upgrade of SCCS for VMS with some added bells and
whistles; MMS was a "make" clone.)

One of the complications in the UNIX branch was the use of
directory trees. (It may come as a shock to some readers, but there
are other ways to organize file systems.) Some issues are: (a)
versioning the location of files within the directory tree and (b)
handling the existence within the file system of multiple versions of
a file,
e.g., sandbox areas and system build areas. The ClearCase solution
from Atria/Rational was to intercept references to files within the
OS file management system. This is an elegant solution but is not
without problems.

The microcomputer revolution has added a twist. Many language
packages are offered as development environments with an elaborate
GUI front-end. Most of them include a crude CM system. (At the
conceptual level, CM products have tended to be rather crude in the
non-UNIX PC world.) One of the notable occurrences in the history of
software development technology is the idea of the development
environment. Sophisticated development environments are regularly
created and just as regularly they become dead ends. Unfortunately,
it has been a regular feature of these development environments
that CM is an add-on afterthought.

Additional information may be found in the background/history
section of Ron Berlack's "Software Configuration Management" book.
He reviews the whys and wherefores from a program management
viewpoint, which helps explain the justifications for using CM
principles and practices.

As a side point, one of the things that messes up version control
systems is hard-wiring assumptions about naming conventions. Naming
conventions are critically important in CM systems. To do things
right, however, the naming policy must be configurable and must not
be hard-wired into the tools. ADC decoupled conventions from the
base engine. Conventions were used in the model layer, then passed
on to ADC. A good example of what not to do is the version numbering
in SCCS. Arguably the drive letters (A: etc.) in DOS are another good
example of what not to do.

Marc Rochkind for SCCS, Walter Tichy for RCS, Richard Harter for
ADC/TrueChange and David LeBlang for DSEE and ClearCase are but a
few of the numerous people who have contributed to the advancement
of CM and the CM technology over the years.

References:

[Roc75]
Marc J. Rochkind. The Source Code Control System. IEEE
Transactions on Software Engineering, SE-1(4): 364-370,
December 1975

[Tic82]
Walter F. Tichy. Design, Implementation, and Evaluation of a
Revision Control System. Proceedings of the 6th International
Conference on Software Engineering, IEEE, September 1982
