Computer Design
A computer aided design and VLSI approach

Paul J. Drongowski

Chapter 16 - Technology and tools.

The system designer may choose from many implementation technologies
and will probably use a mixture in the actual design of the system.
Every technology has characteristics which affect design time and
project schedule, fabrication turnaround and cost, testing, execution
speed, space utilization, current consumption, power dissipation,
noise immunity and reliability.

This chapter is a brief overview of several alternative implementation
technologies. The advantages and disadvantages of each technology
are described along with the kinds of computer aided design (CAD) tools
which are available to assist the engineering team. For the sake of
brevity, we will not address differences in transistor structure and
behavior (field effect versus bipolar) or material (silicon versus
gallium arsenide.)

Section 1 - Small, medium and large scale circuits.

During the 1950's, vacuum tubes were the primary active elements in
digital logic. Tubes were replaced in the early 1960's by discrete
bipolar transistors with a substantial improvement in reliability,
current consumption, heat dissipation and switching speed. Consequently,
computers could be reduced from expensive, room-sized, power hungry
behemoths to machines of modest size and power consumption.

The invention of the integrated circuit in the late 1950's, and its
commercial availability in the 1960's, provided the second revolutionary
step in computer design. The transistorized
logic circuits which once occupied one to two square inches of board
area were reduced to 0.1 to 0.5 square inches of real estate. This
technological era gave birth to the ubiquitous 7400 series family of
logic devices. The 7400 series family spans from small scale integrated
(SSI) circuits, through medium scale (MSI) to large scale (LSI) circuits.
The family underwent changes in electronic structure as well, beginning
with standard transistor-transistor logic (TTL) and eventually growing
to include high speed (74H) devices, very high speed (but power hungry)
Schottky logic (74S) and low power Schottky (74LS) devices.

Miniaturization provided several advantages for the system designer.
First and foremost, the physical size of a computer could be substantially
reduced. Several new products became viable including the mass produced
minicomputer and the extensive use of computing in airborne and space
systems. The availability of a standard logic family and monolithic
digital circuits shifted the focus of the system designer away from
electronic behavior toward the functional behavior of the system making
the design task much easier. Wiring complexity was reduced since only
wires between packages were now required; wires between individual
transistors and associated discrete components were on-chip. Repair by
gate (or IC)
replacement became feasible. In order to perform gate replacement with
discrete components, a separate board (or module) would be required for
each gate. This arrangement would increase the amount of backplane
(intermodule) wiring -- not a very economical option.

The higher functional density of programmable, custom and semi-custom
integrated circuits has relegated SSI, MSI and LSI circuits to a
support role in computer design. Very large scale integration (VLSI)
permits the implementation of an entire processing unit on a single
chip, for example, making the equivalent S/M/LSI implementation
excessively expensive in board area, current, power dissipation and
PCB-level wiring. SSI, MSI and LSI circuits now act primarily as the
functional "glue" that binds the high density parts together.

Section 2 - Gate/board level tools and production.

Many aspects of gate and board level design cut across technologies.
Therefore, we will discuss the design and implementation of PCB-based
systems in detail.

Gate level design is supported by computer based schematic capture,
test generation and simulation tools. A "schematic editor" is a
program that permits a designer to create and modify the structural
design of a digital system. The system structure is drawn graphically
on a display screen as a schematic or block diagram and is manipulated
with a mouse, pad or keyboard. Schematic editors support hierarchical
design by permitting the engineer to define a circuit and then to
instance that circuit within the definition of another block.
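
The define-then-instance idea can be sketched in a few lines of Python.
This is only an illustration, not the data model of any particular
schematic editor: the Block class, the flatten function and the adder
example are all hypothetical names introduced here. Each circuit is
defined once and then instanced inside a larger block; flattening the
hierarchy recovers the primitive gates that a simulator or layout tool
would eventually see.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    """A circuit definition: primitive gates plus instances of other blocks."""
    name: str
    gates: list = field(default_factory=list)      # primitive gate kinds, e.g. "XOR"
    instances: list = field(default_factory=list)  # (instance name, Block) pairs

def flatten(block, path):
    """Expand the hierarchy below `path` into (hierarchical name, gate kind) pairs."""
    flat = [(f"{path}/{kind}{i}", kind) for i, kind in enumerate(block.gates)]
    for inst_name, child in block.instances:
        flat.extend(flatten(child, f"{path}/{inst_name}"))
    return flat

# Define each circuit once, then instance it inside a larger block.
half_adder = Block("half_adder", gates=["XOR", "AND"])
full_adder = Block("full_adder", gates=["OR"],
                   instances=[("ha0", half_adder), ("ha1", half_adder)])
adder4 = Block("adder4", instances=[(f"fa{i}", full_adder) for i in range(4)])

flat = flatten(adder4, "top")
print(len(flat), "primitive gates")   # 4 full adders x 5 gates each
```

The half adder is drawn once but appears eight times in the flattened
design, which is the leverage that hierarchy gives the engineer.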

      ------------------
     | Schematic editor |<------------
      ------------------              |
                                      V
                                  ----------
      ------------------         |  Design  |
     | Test  generation |<-------|          |<------------
      ------------------         | Database |             |
              |                   ----------              |
              V                       |                   V
      ------------------              |             ---------------
     | Logic  simulator |<------------             |  PCB  layout  |
      ------------------                            ---------------

                 Figure 1 - Basic design tools.

The schematic is maintained in a central design database which is used
to drive test generation, simulation and layout activities. In order
to simulate the system, the engineering team must create the test
patterns to exercise the circuit. The patterns may be created by hand
or they may be produced by an "automatic test generation" (ATG) program.
Clearly, an ATG program has several advantages over manual test
generation. The ATG is faster, is less likely to omit test experiments,
and can automatically provide fault grading information. The ATG
program uses the structural information in the database and information
about primitive circuit behavior to build the test vectors.
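
As a rough sketch of what test generation and fault grading involve,
the Python fragment below enumerates input patterns for a tiny circuit,
y = (a AND b) OR c, and keeps each pattern that exposes a previously
undetected fault. The circuit, the single stuck-at fault model and the
greedy selection rule are assumptions chosen for illustration; a
production ATG program uses far more economical algorithms than
exhaustive enumeration.

```python
from itertools import product

NETS = ["a", "b", "c", "n1", "y"]
# Single stuck-at fault model: each net stuck at 0 or stuck at 1.
FAULTS = [(net, value) for net in NETS for value in (0, 1)]

def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c; `fault` is (net, stuck_value) or None."""
    def force(net, value):
        return fault[1] if fault and fault[0] == net else value
    a, b, c = force("a", a), force("b", b), force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)

def generate_tests():
    """Keep every pattern that detects at least one not-yet-detected fault."""
    tests, detected = [], set()
    for pattern in product((0, 1), repeat=3):
        good = simulate(*pattern)
        caught = {f for f in FAULTS if simulate(*pattern, fault=f) != good}
        if caught - detected:
            tests.append((pattern, good))   # stimulus and expected response
            detected |= caught
    return tests, len(detected) / len(FAULTS)

tests, coverage = generate_tests()
for pattern, expected in tests:
    print(pattern, "->", expected)
print("fault coverage:", coverage)
```

The fraction returned alongside the test set is the fault grade: the
share of modeled faults that at least one pattern detects.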

Once test vectors are available, the system may be simulated at the
logic level. Production simulators use the test patterns from the test
generation step and commands from the engineer to exercise the system.
Results are displayed graphically as timing diagrams, digit sequences
or sometimes as assembly mnemonics. Simulation yields detailed timing
information and gives the designer more confidence in the functional
correctness of the design. However, transient errors (hazards),
bad timing assumptions and races may still elude the designer due to
the limitations of the algorithm used to simulate the system. It is
generally not practical to execute operating system or application
software on the logic simulator since hundreds of thousands of
signaling events are required to interpret even one ISA instruction.
Thus, software dry testing is best performed on ISA or organization
level models.
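
The dependence of hazard detection on the simulation algorithm can be
illustrated with a small Python sketch. The circuit is a two-input
multiplexer, y = (a AND s) OR (b AND NOT s), and every gate is assumed
to have one unit of delay; both the circuit and the delay model are
choices made here for illustration. A simulator that compares only
settled values sees y = 1 before and after the select line changes,
while the unit-delay trace shows the output dropping to 0 for one time
step.

```python
def step(state, a, b, s):
    """Unit-delay model: each gate output at t+1 depends on values at time t."""
    return {
        "ns": 1 - s,                     # inverter on the select line
        "t1": a & s,                     # first AND gate
        "t2": b & state["ns"],           # second AND gate sees the old inverter output
        "y": state["t1"] | state["t2"],  # OR gate sees the old AND outputs
    }

def settle(a, b, s):
    """Settled (zero-delay) view: iterate until the network is stable."""
    state = {"ns": 0, "t1": 0, "t2": 0, "y": 0}
    for _ in range(4):                   # the longest path is three gates deep
        state = step(state, a, b, s)
    return state

def trace_y(a, b, s_before, s_after, steps=5):
    """Record output y at each time step after the select input changes."""
    state = settle(a, b, s_before)
    wave = [state["y"]]
    for _ in range(steps):
        state = step(state, a, b, s_after)
        wave.append(state["y"])
    return wave

print("settled y before:", settle(1, 1, 1)["y"])
print("settled y after: ", settle(1, 1, 0)["y"])
print("unit-delay trace:", trace_y(1, 1, 1, 0))
```

Real gates do not share a single delay value, so even the unit-delay
trace is an approximation; a hazard that depends on unequal delays can
escape it.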

After the design has been tested, evaluated and accepted, it must
be fabricated. If a printed circuit board is the final product, the
individual circuit packages must be placed on the board and wires
must be routed between connection points according to the schematic.
This function is performed by an automatic PCB "place and route" tool
which accepts the schematic as its input and produces a "picture" of
package and wire layout. The "picture" (which may be a list of
instructions for a numerically controlled (N/C) pattern generator)
is used to form the photographic mask for the PCB production process.
Each layer in a multi-layer board will require a separate mask.

To produce a circuit board, the surface of a copper clad board is
first coated with "resist." The resist is an organic material which
breaks down chemically when exposed to a strong source of visible
or ultraviolet light. The image of the PCB layer pattern is optically
transferred to the resist by selectively covering portions of the resist
and exposing others through the photographic mask. The broken down
resist is removed exposing the copper beneath it. The board is dipped
in an etchant that removes the exposed copper. Finally, the remaining
resist is removed and the layer is bonded to other layers forming
a complete board. Components are automatically inserted in the
completed and inspected board by an N/C machine and are wave soldered
to the wire traces and pads.

The board is sent on to quality assurance testing. It is placed in
the fixture of an automatic test machine where test patterns are
applied and results are acquired. The simulation model developed
earlier provides a "known good device" or board, and thus, the same
test patterns may be used for QA testing. The physical limitations
of the automatic test equipment (ATE) may reduce coverage and
fault isolation due to restricted access to the internal signals
present on the board.

Section 3 - Programmable logic devices.

Programmable logic devices (PLD) are an excellent alternative to
random very low density logic. A PLD can subsume up to 2000 equivalent
SSI gates (as of 1987) and is suitable for both combinational and
sequential logic. The PLD implementation will consume less power
and board space, and will generate less heat than the equivalent
SSI/MSI system.

A generic PLD structure is shown in Figure 2.
External inputs and feedback values are fed into an input block
containing latches and true/complement elements. The programmable
AND/OR array combines the input and feedback values and computes 
several sum of product terms which are sent to the output block.
The output block contains storage elements and off-chip signal
drivers. The structure of the PLD makes it ideal for glue logic
and the implementation of small state machines.

                   -----------------------------------
                  |                                   |
                  V                                   |
               -------       --------------       --------
              | Input |     | Programmable |     | Output |
   Input ---->|       |---->|              |---->|        |----> Output
    Pins      | Block |     | AND/OR array |     | Block  |      Pins
               -------       --------------       --------

       Figure 2 - Generic PLD structure.
                  (Source: Intel User Defined Logic Handbook 1986.)
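
The structure in Figure 2 can be modeled behaviorally in a few lines of
Python. The sketch below is an assumption-laden illustration rather
than a model of any real device: the AND plane is a list of product
terms, the OR plane lists which product terms feed each output, and a
leading "~" marks a complemented literal. The programmed function is a
two-bit counter whose outputs are fed back to the input block, so the
array behaves as a small state machine.

```python
def literals(signals):
    """Input block: supply every signal in true and complement form."""
    lits = {}
    for name, value in signals.items():
        lits[name] = value
        lits["~" + name] = 1 - value
    return lits

def evaluate(and_plane, or_plane, signals):
    """Compute each output as a sum of the selected product terms."""
    lits = literals(signals)
    products = [all(lits[lit] for lit in term) for term in and_plane]
    return {out: int(any(products[i] for i in rows))
            for out, rows in or_plane.items()}

# "Programming" for a two-bit counter: next q0 = ~q0, next q1 = q1 XOR q0.
AND_PLANE = [("~q0",), ("q1", "~q0"), ("~q1", "q0")]
OR_PLANE = {"q0": [0], "q1": [1, 2]}

state = {"q1": 0, "q0": 0}
sequence = []
for _ in range(4):                       # each pass models one clock edge
    state = evaluate(AND_PLANE, OR_PLANE, state)
    sequence.append(2 * state["q1"] + state["q0"])
print(sequence)
```

Changing the two tables changes the function without touching the
evaluation code, which mirrors the way one physical part serves many
different designs.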

Programming may be accomplished using bipolar fused link devices or
electrically programmable ROM (EPROM) technology. Bipolar fused devices
consume more current and operate hotter than MOS technology circuits.
The fuse itself is large, it limits the functional density of the PLD
and restricts pre-delivery testing. (Once blown, the fuse cannot be
reconnected.) Erasable PLD's (EPLD) use EPROM technology which permits
re-programmability. If an error is found, the logic control elements
(memory cells) can be exposed to ultraviolet light, thereby erasing
the array. Erasability also permits full QA testability. EPLD's are
implemented in more energy conservative MOS technology such as the
CMOS devices available from Intel.

Since one PLD can satisfy the needs of many different customers, it
can be produced and tested in very large volume. Per device cost,
therefore, is low due to the economies of scale in the production
process. PLD's can be programmed by a customer in the field using
standard PROM burning instruments. Turnaround time from an engineering
change to a newly programmed device is a matter of hours.

Vendors such as Intel supply personal computer based development
software for PLD's. These tools accept a variety of descriptions
including gate schematics, boolean equations and truth tables.
Sequential logic can be described using state equations or diagrams.
The vendor software optimizes the logic design and converts the PLD
description to device level programming information (such as the
standard JEDEC format) which is subsequently sent to a PROM programmer
where the pattern is written into the device. The device may then be
tested by applying test vectors or reading back the programmed data.
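
One of the translations such software performs, from a truth table to
sum of products form, can be sketched as follows. This is a
hypothetical illustration and not the behavior of any vendor's tool:
the function name and the "~" notation for a complemented literal are
invented here, and no optimization is attempted. The example is a
three-input majority function.

```python
from itertools import product

def truth_table_to_sop(names, table):
    """Return one product term (a tuple of literals) per row whose output is 1."""
    terms = []
    for bits in product((0, 1), repeat=len(names)):
        if table[bits]:
            terms.append(tuple(n if b else "~" + n for n, b in zip(names, bits)))
    return terms

# Truth table for a majority vote: output is 1 when two or more inputs are 1.
majority = {bits: int(sum(bits) >= 2) for bits in product((0, 1), repeat=3)}

terms = truth_table_to_sop(("a", "b", "c"), majority)
for term in terms:
    print(" & ".join(term))
```

The result has four product terms. A logic optimizer would reduce it
to the three terms ab + ac + bc before mapping it onto the device,
saving a row of the AND array.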

Programmability and low per unit cost are obtained at the expense
of high circuit density. Custom VLSI circuits consisting of 100,000+
gate equivalents are possible and gate arrays with 12,000+ equivalents
are in commercial use.

Section 4 - VLSI building blocks.

The most economical way to exploit VLSI technology is to design
an existing VLSI part into a product. VLSI building blocks are
available for a broad range of applications from engineering
workstations and personal computers to consumer appliances. A few
examples are given below.

  * The most glamorous parts are the high performance microcomputer
    CPU's. These parts often include on-chip instruction and data
    caches, memory management and bus arbitration circuits -- elements
    that were found only on "super" minicomputers in the mid to late
    1970's.
  * The designer may surround the microcomputer CPU with high density
    memory and peripheral devices. Peripheral circuits are available
    for direct memory access (DMA) transfers, serial and parallel data
    communication, local area network interfacing, disk control
    and graphics. (This list is NOT exhaustive!) Peripheral circuits
    are designed to be bus compatible with the main CPU and enhance
    its marketability.
  * Single chip microcontrollers with a CPU and limited amounts of
    RAM, ROM and I/O capability are quite suitable for the consumer
    market where part count and power must be as small as possible.

Glue logic must be provided to co-ordinate the activities and interactions
of the VLSI parts. PLD's and gate arrays are good choices for one-of-a-kind
glue logic and are often referred to as "application specific
integrated circuits" (ASIC) for this reason.

Although microcomputer components are too slow to emulate an ISA
at a competitive performance level, they are still essential
ingredients in a computer system design. Microcomputers make
suitable I/O controllers and channel processors, off-loading many
time and space consuming tasks from the central processor. A
microcomputer based magnetic tape controller, for example, can
manage a large data cache, control tape tension and motion, compute
checksums and retry unsuccessful data transfers without CPU
intervention. In a large mainframe, a microcomputer maintenance
processor can load microcode into a writable control store and
run diagnostics. Fault detection and isolation can be performed
remotely if the maintenance processor is attached to the phone
system through a modem.

VLSI building blocks can be incorporated into a design by
instancing them from a part library through the schematic editor.
Accurate gate level models of VLSI components are difficult,
if not impossible, to obtain. ISA or organization level simulation
using a programming or hardware description language (ISP or VHDL,
for example) may be used instead. Ideally, one would like to
perform "mixed mode" simulation -- execute the ISA at the register
transfer level and simulate the glue at the logic level. The
effect of instruction evaluation is simulated with the
efficiencies of register transfer execution while gate logic is
simulated in detail at the cost of a higher wall clock to simulated
time ratio.

Since the operation of a VLSI component is proprietary, vendors do
not like to publish much about the internal behavior of their VLSI
products. This makes the construction of accurate gate, ISA or
organization level models very difficult. One approach to this
problem is to directly interface the simulator to the actual VLSI
device where the part responds to the stimuli produced by the
simulated gate network. The Valid Logic "RealChip" system uses this
technique.

Section 5 - Gate arrays.

"Gate arrays" and "standard cells" are two forms of semi-custom
integrated circuit technology. The term "semi-custom" implies that
a portion of the system has been predetermined or predesigned.
In gate array technology, much of the physical layout has been
determined for the designer. Standard cells, which are discussed
in the next section, are predesigned cells to be placed and
interconnected on the surface of the chip. The vendor hides the
intimate details of the layout and fabrication process from the
designer, thereby freeing the engineer of those tasks. The designer
can build the system in terms of logic level building blocks
without knowing much about transistor circuit behavior. "Full
custom" design requires knowledge of low level device behavior,
the fabrication process and layout. The full potential of VLSI
technology (especially high functional density) can be exploited
in custom design.

A gate array consists of a predefined core of gate cells. The core
is a highly structured, regular array of uncommitted gates and is
surrounded by interface circuits and bonding pads (Figure 3.) The
cells are interconnected by metal wires which are present on the last
layers to be added to the integrated circuit. (Usually two metal layers
are available for wiring.) The vendor produces uncommitted gate array
wafers in high volume leaving out the final metalization steps. The
design team gives the vendor a schematic (in the form of an
interconnection "netlist") that is in turn transformed to one or more
fabrication masks. The final metal wires are added to the pre-manufactured
uncommitted gate array wafers using those masks.


   -------------------------------
  |   -   -   -   -   -   -   -   |
  |  |P| |P| |P| |P| |P| |P| |P|  |
  |   -   -   -   -   -   -   -   |
  |       -----------------       |
  |   -  |                 |  -   |
  |  |P| |                 | |P|  |
  |   -  |                 |  -   |
  |   -  |      Array      |  -   |
  |  |P| |                 | |P|  |
  |   -  |      Cells      |  -   |
  |   -  |                 |  -   |
  |  |P| |                 | |P|  |
  |   -  |                 |  -   |
  |   -  |                 |  -   |
  |  |P| |                 | |P|  |
  |   -  |                 |  -   |
  |       -----------------       |
  |   -   -   -   -   -   -   -   |
  |  |P| |P| |P| |P| |P| |P| |P|  |
  |   -   -   -   -   -   -   -   |
   -------------------------------

     Figure 3 - Generic gate array structure.

Production costs are low because the uncommitted wafers can be
produced in large volume. Many different customers can use the same
array design since it is the final wiring which actually determines
the logical function to be performed by the IC. Turnaround time
(the time from a final design to working prototype) is short and
is on the order of four weeks -- another benefit of preprocessed
wafers. Design time is also shortened as the vendor knows and carefully
controls the process and transistor behavior. Cell libraries of
predefined logic elements further speed design giving the team
tested functional blocks such as multiplexers, full adders, lookahead
carry, flip/flops, clock generation and a wide variety of AND/OR
gate topologies. (This approach is often called "macrocells.") The
engineer may not change the internal design of the cells, however.

To further reduce design time, the vendor will route the wires
through the array. This step is performed by a software "routing"
tool driven by the netlist provided by the engineering team. Vendors
conscientiously support the netlist formats employed by the most popular
workstation based CAD packages. Thus, the path from schematic to the
silicon prototype is completely automated.
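
To give a feel for what a routing tool does with a netlist, the Python
sketch below routes two-pin nets across a small grid of wiring tracks
using a breadth-first (Lee style) maze search. The choice of algorithm
is an assumption made for illustration; the text does not say which
algorithm a vendor's router uses, and a real tool handles multiple
metal layers, multi-pin nets and rip-up of earlier wires. The grid and
net names are hypothetical.

```python
from collections import deque

def route(grid, start, goal):
    """Breadth-first search for a shortest path; grid[r][c] == 1 is blocked."""
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk back from goal to start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and nxt not in came_from):
                came_from[nxt] = cell
                frontier.append(nxt)
    return None                          # no free path remains

def route_netlist(grid, nets):
    """Route nets one at a time, blocking the tracks that each wire occupies."""
    routed = {}
    for name, (src, dst) in nets.items():
        path = route(grid, src, dst)
        routed[name] = path
        for r, c in path or []:
            grid[r][c] = 1               # later nets must avoid this wire
    return routed

grid = [[0] * 5 for _ in range(3)]
grid[2][2] = 1                           # a pre-existing obstruction
nets = {"n1": ((0, 0), (0, 4)), "n2": ((2, 0), (2, 4))}
routed = route_netlist(grid, nets)
for name, path in routed.items():
    print(name, path)
```

Net n2 must detour around the obstruction, so its wire is longer than
the direct distance between its pins. The order in which nets are
routed matters, since each routed wire becomes an obstacle for the
nets that follow.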

It is usually the customer's responsibility to provide test patterns
and expected results for post-production quality assurance testing to
be performed by the vendor. The test vectors and results from the
gate level logic simulation can be used here. Of course, the test tape
must be written using a common format.

Gate arrays are most suitable for low volume applications or for
"short fuse" projects where fast turnaround is important. Although
gate arrays have been used for CPU design, they are an excellent
choice for the implementation of glue logic.

Section 6 - Standard cells.

Standard cells are another kind of semi-custom technology. Unlike the
gate array approach, chip area is completely uncommitted and the full
mask set must be used to produce a finished circuit. The design team
selects and interconnects standard cells from a library of predesigned
circuits. Each cell is a full custom design and no restrictions are
placed upon the layout (i.e., no predefined structure.) Thus, standard
cells provide higher density than gate arrays (50,000 gates versus
12,000 in 1987.) Cell placement and routing are automated, yielding
similar benefits. The cells tend to have a fixed "pitch" (height or
width) which lends uniformity to the circuit geometry. As with gate
arrays, CMOS is the predominant process.

Due to their full custom implementation, standard cells can more
readily provide high density, large grain, functional blocks than
gate array technology. The Supercell library from NCR Microelectronics,
for example, includes ROM, RAM, PLA, counter, microprocessor,
A/D and D/A conversion blocks, an SCSI controller and analog building
blocks along with the customary SSI, MSI and I/O circuits (Table 1.)
The standard cell approach is economically viable for moderate to
high volume applications. Although denser than gate arrays for a given
application, provisions for layout and routing generally result in
a less dense and larger IC than full custom. Use of the full mask set
means a longer layout and fabrication process with respect to gate
array design and manufacture. However, the greater density and smaller
size of standard cell based IC's result in higher production yields.
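
The link between die size and yield can be made concrete with a simple
first-order model. The Python sketch below assumes a Poisson defect
model, in which yield falls exponentially with the product of die area
and defect density. The model and the numbers are assumptions
introduced here for illustration; they do not come from any vendor's
process data.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dice expected to be defect free under a Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical figures: the same function on a larger and a smaller die.
DEFECT_DENSITY = 1.0                     # defects per square centimeter (assumed)
for label, area in (("larger die", 1.0), ("smaller die", 0.5)):
    print(label, round(poisson_yield(area, DEFECT_DENSITY), 3))
```

Under these assumptions, halving the die area raises the expected
yield, and more of the smaller dice also fit on each wafer.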

      SSI functions.
      Flip/flops and latches.
      MSI functions.
        Shift register.
        Up/down counter.
        Full and half adders.
      I/O pads and buffers.
        Input cells (TTL compatible.)
        Output cells (open drain, pullup, etc.)
        Tristate.
      Analog cell library.
        Op amps and comparators.
        Analog switch.
        Logic level shifter.
        Power on reset.
        Oscillators.
      Supercells.
        Modular ROM.
        Modular RAM.
        Modular PLA.
        Modular EEPROM.
        Counter/timer.
        A/D and D/A conversion.
        65C02 microprocessor.
        68C05 microprocessor.
        SCSI controller.
        Sound generators.

     Table 1 - Typical standard cell library (NCR.)

Section 7 - Custom VLSI circuits.

In full custom design, chip area is totally uncommitted and the
designer must determine the size and placement of every transistor
and wire. It also requires knowledge of device electronics and the
processing technology. Full custom circuits have the highest density
and give maximum design flexibility. These advantages are bought
at the price of a longer development schedule with a much greater
commitment of resources (engineering time, workstations, etc.)
The higher development cost incurred must be offset by high
production volume in order to amortize non-recurring costs across
the largest possible number of production units (Table 2.)

                                  Production
  Technology       Preprocessing     Cost      Lead time
  ------------------------------------------------------
  SSI/MSI/LSI       100 percent      None        None
  PLD               100 percent      None        None
  VLSI blocks       100 percent      None        None
  Gate arrays        75 percent    $ 20,000     1 month
  Standard cells      0 percent    $100,000     5 months
  Full custom         0 percent    $250,000    18 months

       Table 2 - Alternative technologies.
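
The amortization argument can be worked through with the figures in
Table 2. The Python sketch below reads the cost column as the
non-recurring (up-front) cost, which is an interpretation based on the
surrounding text. The per-unit manufacturing costs used in the
break-even calculation are hypothetical numbers chosen only to show
the arithmetic.

```python
# Up-front cost in dollars, taken from the cost column of Table 2.
NON_RECURRING = {
    "Gate arrays": 20_000,
    "Standard cells": 100_000,
    "Full custom": 250_000,
}

def nre_per_unit(technology, volume):
    """Share of the non-recurring cost carried by each production unit."""
    return NON_RECURRING[technology] / volume

def break_even_volume(nre_low, nre_high, unit_cost_low_nre, unit_cost_high_nre):
    """Volume at which a higher up-front cost is repaid by a lower unit cost."""
    return (nre_high - nre_low) / (unit_cost_low_nre - unit_cost_high_nre)

for volume in (1_000, 100_000):
    print(volume, "units:", nre_per_unit("Full custom", volume), "dollars each")

# Hypothetical unit costs: $12.00 per gate array part, $2.00 per full custom part.
volume = break_even_volume(NON_RECURRING["Gate arrays"],
                           NON_RECURRING["Full custom"], 12.0, 2.0)
print("break-even volume:", volume)
```

Below the break-even volume the gate array is the cheaper choice;
above it, the full custom part recovers its development cost.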

Architecture, organization and logic (switch) level simulation
must be augmented by electrical simulation. This additional
simulation step is needed to determine transistor sizes, delay,
current consumption and power dissipation characteristics for
the subsystems. Certain aggressive design techniques (e.g.,
precharge, exploitation of analog circuit behavior) must be
verified through extensive and often expensive simulation.

In the early days of custom design, the circuit geometry was drawn
by hand on room-sized paper layouts. This has been replaced by
computer workstation based layout tools which support hierarchical,
regular design. Regularity provides substantial leverage on the
layout problem as a few custom circuit designs can be replicated
and used many times throughout the system. Thus, the cost of layout
and analysis is spread across many instances of a few custom cells.
Development cost can be further reduced through:

  * "Module generators" such as PLA layout generation tools,
  * "Parameterized and stretchable cells" that adapt to specific
    design situations, and
  * "Silicon compilers" that automatically translate from the building
    block structure to the layout (possibly using a mixture of the first
    two techniques.)

These tools move the designer away from transistor level design to
the use of large grain functional blocks.
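
The flavor of a module generator can be conveyed with a short Python
sketch. The generator below is hypothetical: it takes a single
parameter, the word width, and produces a netlist for a ripple carry
adder by replicating one full adder cell and chaining the carries. The
cell and pin names are invented for illustration, and a real generator
would emit layout geometry rather than a list of connections.

```python
def ripple_adder(n):
    """Generate an n-bit adder netlist by replicating one full adder cell."""
    cells = []
    for i in range(n):
        cells.append({
            "cell": "full_adder",        # the one custom cell, reused n times
            "name": f"fa{i}",
            "pins": {
                "a": f"a{i}",
                "b": f"b{i}",
                "cin": "cin" if i == 0 else f"c{i}",
                "sum": f"s{i}",
                "cout": "cout" if i == n - 1 else f"c{i + 1}",
            },
        })
    return cells

adder = ripple_adder(8)
for cell in adder[:2]:
    print(cell["name"], cell["pins"])
```

Only the full adder cell needs detailed layout and electrical
analysis; the generator supplies the regular structure around it for
any word width.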

Copyright (c) 1987-2013 Paul J. Drongowski