Computer Design
A computer aided design and VLSI approach
Paul J. Drongowski

Chapter 6 - Design methodology.

The design of any computing system is a substantial undertaking. This chapter discusses modular design and conventions which make the design task more manageable.

Section 1 - Complexity and productivity.

In roughly one decade, the complexity of digital integrated circuits has grown from 5,000 to over 275,000 transistors per chip. Figure 1 depicts this growth using microcomputer devices from Intel to illustrate the point. The Intel 8080, the first 8-bit microcomputer, was a 5,000 transistor design. In that era, it was possible to draw each transistor by hand. Chips were laid out on room-sized pieces of drafting paper and checked by hand for correctness. By 1986, the 80386 was designed and simulated using workstation and mainframe computers. Although workstations have made the drafting task manageable, the problems of functional design, validation and testing have literally exploded into nearly unmanageable complexity.

The software industry has also experienced such geometric growth in complexity, perhaps sooner than hardware engineers due to the ambitious goals of the space program in the 1960's. Predictions indicate that future defense systems (e.g., the Strategic Defense Initiative) may require hundreds of millions of lines of code.

Human productivity, even when it is assisted by computer-based aids, is limited. An experienced programmer can produce one line of designed, debugged and documented code per hour. A 100,000 transistor design takes about 40 person-years of effort to develop. Given the magnitude of contemporary systems, their development is well beyond the capabilities of a single individual. One engineer cannot completely or infallibly understand every facet of a large system design.

    300K -
         |                                    *80386 (275K)
         |
    200K -
         |
         |
         |                          *80286 (130K)
    100K -
         |
         |           *8086 (29K)
         |  *8080 (5K)
          -----|----|----|----|----|----|----|
              74   76   78   80   82   84   86

    Figure 1 - Rising complexity (1000's of transistors).
    (Source: Solutions, Intel Corp., Sep/Oct 1987.)

The most obvious and critical implication of this analysis is that system development must be a team effort. Development projects are typically 18 to 24 months in duration. Thus, engineering tasks must be assigned to team members and co-ordinated by management. Team development is, of course, affected by the normal social interaction between the group's human members. Because communication is never perfect, techniques must be adopted to eliminate the ambiguities and errors which will naturally occur in the design. A formal architectural specification is one way to improve communication, and it supports computer-aided design and analysis as well. Mathematics is an unambiguous language that all designers have in common. The development of the specification and the design, however, must still be parceled out to the team members. Enter modular design and the notion of "designing in the large."

Section 2 - Modular design.

Most of us are readily familiar with designing or programming "in the small." In this situation, we are given a small task, such as a system to manage a database of names and phone numbers, and are then expected to write a working program which accomplishes that task. Alternatively, we could be asked to design a system to control a traffic light, with a finite state machine controller as the expected product. Small tasks are relatively easy to accomplish because they may be understood in their entirety by the programmer or engineer.
The behavior of the product systems can be extensively studied and tested. Ad hoc design and implementation procedures (poorly structured control and data flow, sketchy documentation) may be followed without hampering the development effort, but only because the complexity of the problem can be handled intellectually by one individual.

When confronted by a very large design problem (strategic nuclear defense, very large scale integrated circuits), our first inclination is to divide the problem into subproblems and to attack each of those subproblems in turn. This divide and conquer strategy takes an overwhelming problem and reduces it to a collection of smaller, more manageable subproblems. If the subproblems are too complex, they may be further partitioned into still smaller problems. Some of these problems will interact with their siblings and sometimes with their parents.

This is the essence of modular design. We partition the overall design into subunits called "modules." Each of the modules performs one (or at most, a few) system functions. If the modules correspond in a natural way to customer needs, requirements and modules can be tracked in a database. After the system has been constructed and customer acceptance is sought, the requirements tracking database can show conclusively that technical and contractual commitments were satisfied.

When system decomposition is begun, the designers are making a rather informal assignment of system behavior to the individual modules. A product in the early stages of design will often appear as a block diagram where each of the modules is a block. A few sentences or pages may be written about the function of each module. The lines interconnecting the modules show any communication that may take place between participating blocks. The overall system structure is likely to resemble the interrelationships between design subproblems.

The decomposition of complex behavior into modules is called "top-down" design. It begins with the system behavior and requirements as a whole and eventually reduces the design to primitive building blocks. Top-down design results in a hierarchical structure: each layer of the hierarchy is refined into, and composed of, modules in the next lower level of the hierarchy. The structure may be a hierarchy instead of a strict tree, since common types of building blocks or modules may be shared by two or more higher level design objects.

The assignment of behavior to modules and the allowable communications between modules will have considerable impact upon system performance. If widely separated modules must intercommunicate on a frequent basis, communication, and hence system operation, will be slow. The performance implications of high level design decisions cannot always be anticipated for very large systems. Decomposition and refinement may proceed to a detailed level only to find that performance will be unacceptably poor.

For this reason, "bottom-up" design is also practiced. In bottom-up design, building blocks are composed (combined) to form more complicated modules. Performance can be accurately estimated during bottom-up design because the primitive building blocks at the leaves (bottom) of the hierarchy can be carefully analyzed. The analytical results can be combined at successively higher levels of the hierarchy until an estimate for aggregate system performance is obtained.

Neither pure top-down nor pure bottom-up design can ever be effective in practice.
Unless top-down designers are particularly clairvoyant, performance goals will be missed. (This is where the "waffle theory" of system development is useful. Plan to throw the first one away to get a feel for the implementation problems.) Pure bottom-up designers will not have the intellectual leverage of divide and conquer. Practical design, therefore, is a "yo-yo." Periods of top-down decomposition are followed by bottom-up analysis to explore the low level implications of high level design decisions. Top-down design is a divisive process of decomposition and constraint specification; bottom-up design is a constructive process of composition and constraint verification.

The system structure should stabilize after several iterations or rounds of yo-yo design. (If not, management may freeze the design by fiat and the engineers must live with the consequences!) Detailed module design may then begin. A module can be specified using natural language, pseudo-code or mathematics, in order of increasing rigor. The modules are scheduled for development in accordance with the overall project schedule. Since the modules have been carefully specified, construction and test can be carried out independently by a single engineer. Thus, modular design is essential for a successful team effort.

Although a module can be extensively tested in isolation, it must operate correctly with its neighbors in the deliverable product. The development of individual modules will be punctuated by periods of integration when groups of functionally related modules are combined and exercised together. This combine and conquer strategy works in the team's favor as progressively larger subsystems are brought to full operational status.

The independence between modules is not only important during development, but is essential to maintenance, too. Bugs will be discovered throughout the life of the product. If a high degree of independence is achieved, a bug can be quickly isolated to a particular module and, most importantly, the fix can be confined to that module. When a system function is spread across modules in an inappropriate fashion, the location of an error is harder to identify and the repair may require changes to all of the modules involved.

This discussion raises the question of what criteria should be used to partition a system into modules. David Parnas suggests the principle of "information hiding" or "transparency" as the primary criterion for decomposition. Each module should be treated as a black box whose inner mechanism is known only to the implementor. Other modules can communicate with the module if its interface has been rigorously specified. Interactions with the module will be correct provided that the module behaves correctly with respect to its specification and that any calling module adheres to the interface specification. Because the inner mechanism is hidden, a programmer or engineer cannot exploit some undocumented feature, side-effect or data structure within a module. The implementors are free to change the internal algorithm or data structures as long as the mechanism behaves in accordance with the module specification.
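As a concrete illustration of information hiding, consider the following minimal C sketch (the "queue" module, its file names and its procedures are hypothetical examples, not drawn from the text). The header exposes only an opaque handle and the operation procedures; the representation lives entirely in the implementation file, so the implementor may later replace the array with a linked list without touching any caller.

    /* queue.h -- the visible interface; the representation is hidden. */
    typedef struct Queue Queue;              /* opaque type: callers hold only a pointer */

    Queue *queue_create(void);               /* build an empty queue                     */
    void   queue_put(Queue *q, int value);   /* append a value                           */
    int    queue_get(Queue *q);              /* remove and return the oldest value       */
    void   queue_destroy(Queue *q);          /* release the queue                        */

    /* queue.c -- the hidden mechanism; free to change behind the interface. */
    #include <stdlib.h>
    #include "queue.h"

    struct Queue {                           /* internal representation: a circular array */
        int items[64];                       /* (capacity and error checks are omitted    */
        int head, tail;                      /*  to keep the sketch short)                */
    };

    Queue *queue_create(void)            { return calloc(1, sizeof(Queue)); }
    void   queue_put(Queue *q, int v)    { q->items[q->tail++ % 64] = v; }
    int    queue_get(Queue *q)           { return q->items[q->head++ % 64]; }
    void   queue_destroy(Queue *q)       { free(q); }

A caller that relied on the array inside the queue would be broken by a change of representation; the opaque type makes that reliance impossible.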
Section 3 - Conventions.

One way to reduce confusion between team members is the use of coding or design conventions. Although the team may be constructing a modular system, for example, they must adopt certain conventions which govern the interaction between the modules. Otherwise, each programmer or pair of programmers will use a different intermodule communication scheme, with chaotic results. Integration and maintenance will be easier if every module uses the same ritual or "protocol" for communication with the other modules. This section describes a generic convention that can be applied to hardware and software systems alike.

    Module: Arithmetic

      Add:  Go  Done    int a,b    int sum
      Sub:  Go  Done    int a,b    int dif
      Mul:  Go  Done    int a,b    int pro
      Div:  Go  Done    int a,b    int quo
      Neg:  Go  Done    int a      int neg

    Figure 2 - A module for arithmetic.

A module is a black box that performs one or more operations. As stated earlier, the algorithm and data structures within the black box are completely unknown. The behavior of a module, therefore, is characterized solely by its external or visible response to operational requests. Figure 2 depicts the external interface of a module which performs the five standard arithmetic operations. The interface shows the names of the five arithmetic operations and the names and types of the operands to those operations. One could also add the mappings:

      Add:  sum <- a + b
      Sub:  dif <- a - b
      Mul:  pro <- a * b
      Div:  quo <- a / b
      Neg:  neg <- - a

which show the mathematical relationship between the operands and result for each operation. (The formal specification of module behavior is a complex and interesting topic in itself which cannot be addressed in detail here.)

Each operation has a button to invoke the operation (Go), a set of named input slots for operands, a set of output slots for results, and a light (Done) to indicate when the operation has completed. An operation is started by filling the input slots for the operation and hitting the Go button. When the completion light is illuminated, the results can be removed from the output slots. With this simple, but rather vague, scheme in mind, let's examine some specific software and hardware conventions for modules.

Section 4 - Software conventions.

A software module is a set of procedures, data types and variables that implement a system function. Each module is separately coded and compiled. One procedure is implemented for each external module operation. An operation is invoked using the standard subroutine call mechanism of the programming language and the operands are provided through parameter passing. Completion is signaled when the subroutine returns to the caller. Results are sent back through call-by-reference parameters or directly as the return value of a function call.

Programmers working in a higher level language (C, Pascal, etc.) get a lot of mechanism for free -- the subroutine call mechanism is built into the language and its compiler. Assembly language programmers must decide on a common call and return sequence. Questions to be resolved include the location of operand values (registers, fixed memory locations or the call stack), the subroutine call instruction to use, the location of the return address, which general registers to save and restore, the location of return values, and the appropriate subroutine return instruction to use.

Access to the procedures and variables within a module is restricted by the scoping and access control features of the programming language. The subroutines that implement the module operations must be accessible to the modules which call those subroutines. Further, the internal data types, variables and utility procedures must be hidden from the other modules in the system.
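The convention can be made concrete with a minimal C sketch of the Arithmetic module of Figure 2 (the file names arith.h and arith.c, the procedure names and the hidden operation counter are illustrative assumptions, not part of the text). Each external operation becomes one exported procedure; operands are passed as parameters, completion coincides with the subroutine return, and the result is the return value. The internal utility and variable are declared static so that they are invisible to other modules.

    /* arith.h -- external interface: one procedure per module operation. */
    int arith_add(int a, int b);    /* sum <- a + b                             */
    int arith_sub(int a, int b);    /* dif <- a - b                             */
    int arith_mul(int a, int b);    /* pro <- a * b                             */
    int arith_div(int a, int b);    /* quo <- a / b; caller must ensure b != 0  */
    int arith_neg(int a);           /* neg <- - a                               */

    /* arith.c -- separately compiled implementation. */
    #include "arith.h"

    static int op_count;            /* hidden module state                      */
    static void note_operation(void)        /* hidden utility procedure         */
    {
        op_count = op_count + 1;
    }

    int arith_add(int a, int b) { note_operation(); return a + b; }
    int arith_sub(int a, int b) { note_operation(); return a - b; }
    int arith_mul(int a, int b) { note_operation(); return a * b; }
    int arith_div(int a, int b) { note_operation(); return a / b; }
    int arith_neg(int a)        { note_operation(); return -a;    }

A caller simply writes sum = arith_add(2, 3); the Go button corresponds to the call itself and the Done light to the return.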
Two widely differing styles of access control are supported in practice. Programming languages such as Ada and Modula-2 have structurally enforced access control. All procedures and variables are assumed to be hidden unless they are explicitly "exported" from a module. A procedure or variable must be explicitly "imported" before it can be accessed. The programmer is consciously aware that an object is visible, and the visibility of an object can be checked by the compiler or software development aids.

In C or assembly language, however, access control is relatively weak. A C function or global variable is visible to any other function in the program unless it is explicitly hidden using the keyword "static." Further, C and assembly programs can manipulate and use addresses with impunity, making any addressable location in the program fair game. Strongly typed languages like Modula-2 or Pascal do not permit the uncontrolled use of addresses and perform range checking. Thus, it is difficult to circumvent the strong typing of the language and to make the errors that such loopholes invite. (Some system programmers will argue that strong typing unnecessarily restricts their freedom.)

    Producer:           Consumer:
       .                   .
       .                   .
       Compute             Go(W)
       CondWait(W)         CondWait(R)
       Put Value           Get Value
       Go(R)               Compute
       .                   .
       .                   .

    Figure 3 - Producer and consumer example.

Communication between processes (programs) is essential for certain kinds of applications like realtime process control. Thus, interprocess communication is the second most prevalent kind of module interaction. Figure 3 shows two processes. One process produces a data item which is consumed by the other process. Since these two processes can execute at their own speed, they must synchronize before passing the data item. For example, we do not want the consumer to read a data item before it is produced.

The producer and consumer processes share three variables. "Value" is the data item to be exchanged by the producer and consumer. "R" and "W" are Boolean-valued semaphores which can be read and modified by the procedures "Go" and "CondWait." When CondWait is applied to a semaphore with the value "false," the process executing the CondWait will suspend execution until the semaphore becomes "true." CondWait will reset the semaphore to "false" before resuming execution. Since the procedure "Go" makes a semaphore "true," the pair of procedures CondWait and Go can be used to synchronize the execution of processes as shown in Figure 3.

The producer will compute the data item and wait until the shared variable Value is available. (The semaphore W controls the writing of Value.) After Value is written, the producer will execute Go on the semaphore R and release the consumer. The consumer first sets W to "true," telling the producer that it may write Value. This is roughly equivalent to an operation request. The consumer then waits for Value to be written and will read Value after resuming execution. (The consumer may not have to wait. Why?)
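The discipline of Figure 3 can be sketched in C. The sketch below assumes POSIX threads, which are not part of the text; the procedures Go and CondWait are rebuilt here from a mutex and a condition variable so that a waiting process truly suspends, and the Boolean semaphores R and W become small shared structures.

    /* A minimal sketch of Figure 3 using POSIX threads (compile with -lpthread). */
    #include <pthread.h>
    #include <stdio.h>

    typedef struct {                       /* a Boolean-valued "semaphore"        */
        int             flag;
        pthread_mutex_t lock;
        pthread_cond_t  cond;
    } Sem;

    static Sem R = { 0, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER };
    static Sem W = { 0, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER };
    static int Value;                      /* the shared data item                */

    static void Go(Sem *s)                 /* make the semaphore "true"           */
    {
        pthread_mutex_lock(&s->lock);
        s->flag = 1;
        pthread_cond_signal(&s->cond);
        pthread_mutex_unlock(&s->lock);
    }

    static void CondWait(Sem *s)           /* wait until "true", then reset       */
    {
        pthread_mutex_lock(&s->lock);
        while (!s->flag)
            pthread_cond_wait(&s->cond, &s->lock);
        s->flag = 0;
        pthread_mutex_unlock(&s->lock);
    }

    static void *producer(void *arg)
    {
        int item = 42;                     /* "Compute" the data item             */
        CondWait(&W);                      /* wait until Value may be written     */
        Value = item;                      /* "Put Value"                         */
        Go(&R);                            /* release the consumer                */
        return arg;
    }

    static void *consumer(void *arg)
    {
        Go(&W);                            /* tell the producer it may write      */
        CondWait(&R);                      /* wait for Value to be written        */
        printf("consumed %d\n", Value);    /* "Get Value", then compute with it   */
        return arg;
    }

    int main(void)
    {
        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }

Whichever thread runs first, the consumer prints the value only after the producer has written it, which is exactly the synchronization the figure calls for.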
Section 5 - Hardware conventions.

It is possible to draw a literal, hardware version of the module communication mechanism given in Section 3. Figure 4 contains a hardware module with "Go" and "Done" signaling. The module has several input and output slots, implying that operands and results are communicated in parallel. As an alternative, values can be communicated in a serial fashion, thereby reducing the number of interconnection points to the module. Serial communication will be slower, however.

              Go   Done   Operation
               |     ^       |
               V     |       V
            -------------------
      ----->|                 |----->
      ----->|                 |----->
      ----->|     Module      |----->
      ----->|                 |----->
      ----->|                 |----->
            -------------------

    Figure 4 - Go/Done module communication.

An operation is invoked in the following way (Figure 5). First, the caller sets up the operand values at the input ports and an operation code at the Operation input. It then asserts the Go signal and the module begins to execute. Sometime later, the module places the results on the output ports and asserts the Done signal. In response, the caller drops Go, followed by a drop in Done.

    Inputs     ____VVVVVVVVVVVVVVVV____________

    Operation  ____VVVVVVVVVVVVVVVV____________

    Outputs    ___________VVVVVVVVVVVVVVV______

                     ____________
    Go         ______|          |____________

                             ___________
    Done       ______________|         |______

    Figure 5 - Go and Done signaling.

This kind of signaling requires an explicit request to be followed by an acknowledgment of execution. The signals Go and Done also indicate the validity of the data on the module inputs and outputs. The caller has the responsibility to keep the operand data valid until the module has indicated that it has consumed the data by asserting Done. The module must keep the output data valid until the results have been safely read by the caller (as indicated by the caller dropping Go).

Section 6 - Time.

As with the execution of a software subroutine, the amount of time required to compute the result is flexible. The module and its caller will operate correctly together as long as they adhere to the signaling protocol. In the presence of long wire delays (the switching and propagation time of intermodule signals), some "deskewing" time must be added between the time that the inputs are sent and Go is asserted, and between the time that the results are driven and Done is raised. The deskewing time is a safety margin that permits the inputs and outputs to settle (assume good electrical values) before they are sensed.

The signaling scheme described in Section 5 is called a "four cycle handshake." By chaining the Done signal of one module to the Go input of another module, the two modules can be made to execute in sequence. (Some additional constraints on Go, Done and data validity are required. What are they?) The speed at which the operations are performed is determined by the speed of each module; thus, this kind of signaling can be used to construct "self-timed" systems. The four cycle handshake is an example of an "asynchronous" communication scheme since it is relatively insensitive to delay (with the exception of the deskewing time).
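As a rough software analogy (not a circuit, and not from the text), the module's side of the four cycle handshake can be sketched as a small state machine in C; the names module_step, IDLE, COMPUTE and WAIT_GO_LOW are illustrative. The caller's obligations appear as the sequence of calls in main: keep the inputs valid while Go is asserted, wait for Done, then drop Go.

    /* A software model of the module side of the four cycle handshake. */
    #include <stdio.h>

    enum state { IDLE, COMPUTE, WAIT_GO_LOW };

    struct module {
        enum state st;
        int done;                        /* the Done output                     */
        int result;                      /* the output port                     */
    };

    /* Called once per simulated time step with the current input values. */
    void module_step(struct module *m, int go, int a, int b)
    {
        switch (m->st) {
        case IDLE:                       /* wait for the caller to assert Go    */
            if (go) {
                m->result = a + b;       /* compute while the inputs are valid  */
                m->st = COMPUTE;
            }
            break;
        case COMPUTE:                    /* results now valid: assert Done      */
            m->done = 1;
            m->st = WAIT_GO_LOW;
            break;
        case WAIT_GO_LOW:                /* caller has read the results...      */
            if (!go) {                   /* ...signalled by dropping Go         */
                m->done = 0;             /* drop Done: transaction complete     */
                m->st = IDLE;
            }
            break;
        }
    }

    int main(void)
    {
        struct module m = { IDLE, 0, 0 };
        module_step(&m, 1, 2, 3);        /* inputs valid, Go asserted           */
        module_step(&m, 1, 2, 3);        /* module computes and asserts Done    */
        printf("done=%d result=%d\n", m.done, m.result);
        module_step(&m, 0, 0, 0);        /* caller reads result, drops Go       */
        printf("done=%d\n", m.done);     /* module drops Done                   */
        return 0;
    }

Chaining two such modules would amount to feeding one module's done output into the next module's go input, subject to the data validity constraints mentioned above.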
The primary disadvantage of the four-cycle handshake is the rather lengthy transaction that is required to move data between modules. The modules must communicate back and forth and thereby remain synchronized. To gain speed, hardware designers make one important assumption: they assume that they know the speed at which the module can produce a result, from the time that the operands are applied to the inputs to the time that the output lines settle. In order for this scheme to work, the sender and receiver must be co-ordinated by a central authority which assures that both the sender and receiver have the same notion of absolute time. This authority is the "clock," and communication is governed by "centralized, synchronous control."

                     Clock
                       |
              -------------------
              |                 |
           Sender ---------> Receiver

                      _________
    Clock   __________|        |_________

    Data    ________VVVVVVVVVVVV_________

    Figure 6 - Clocked communication.

The clock generates periodic pulses and keeps time for the system like an orchestra conductor. Assuming that data must be valid at the rising edge of the clock, two key temporal relationships must be satisfied. First, the data must be valid for a period of time before the rising edge, called the "set-up time." Next, the data signals must remain in a valid state for a minimum period after the rising edge, called the "hold time." (We will see the physical justification for set-up and hold times when we discuss logic and storage devices in a later chapter.) The "clock period" or "cycle time" is the sum of the set-up time, the hold time, and the actual computation time needed to produce the result.

If any of these assumptions is violated in practice, system operation will become unreliable.

  * The sending (producer) module may not have enough time to compute a valid result.

  * The receiving (consumer) module may not have enough time to reliably sense and capture the result.

These situations will arise if the clock period is unrealistically shortened, a slower circuit is used to compute the result, or the transmission time from sender to receiver is lengthened. Variability in the manufacturing process is critical here because the delays of physical components are not uniform and may vary from production unit to production unit. Successful synchronous design, therefore, requires careful timing analysis and defensive technique.

Copyright (c) 1987-2013 Paul J. Drongowski