PDP-11 family implementation
P.J. Drongowski

References.

  Impact of implementation design trade-offs on Performance: The PDP-11,
  A case study, Edward A. Snow, Daniel P. Siewiorek, Carnegie-Mellon
  University, CMU-CS-78-104, February 1978.

General notes.

  PDP-11/70, -11/50, and 11/45 all use the same processor.
  Three different LSI implementations.
    + LSI-11 (3/4 chip set.)
    + T-11 (single chip.)
      - 40 pin package.
      - nMOS technology, 7.5 MHz clock.
      - No floating point or multiply/divide.
    + J-11.
      - Two chips on single package, thin film interconnect.
      - 60 pin ceramic package.
      - CMOS technology, 20 MHz clock, power < 1 watt.
      - 16-bit I/O.
      - 32-bit internal datapath, pipelined, instruction prefetch.
      - 8 Kb on-chip cache memory.
      - Addresses up to 4 Mbytes with on-chip memory management.
      - Two general register sets (kernel, supervisor, user.)
      - Six 64-bit floating point accumulators.

Performance and technology

          Rel Perform  Tech   Integ  ALU           Registers
  LSI-11  1.000        nMOS   LSI    8-bit nMOS    26 x 8-bits
  11/04   1.455        TTL    MSI    74181/182     16 x 16-bits
  11/10   1.436        TTL    MSI    74181/182     16 x 16-bits
  11/20   1.667        TTL    SSI    7482 adders   16 x 16-bits
  11/34   1.942        S-TTL  MSI    74S181/S182   16 x 16-bits
  11/40   2.819        TTL    MSI    74181/182     16 x 16-bits
  11/45   6.920        S-TTL  MSI    74S181/S182   2 x 16 x 16-bits
  11/60   3.727        S-TTL  MSI    74S181/S182   2 x 16 x 16-bits

Control characteristics.

                                                    Pkg    Pc/UNIBUS
          Control       Microcycle (ns)  CS size    count  synch
  LSI-11  Vertical      400              22 x 1024  48     Interlocked
  11/04   Horizontal    260              40 x 256   138    Interlocked
  11/10   Horizontal    300              40 x 256   203    Overlapped
  11/20   Random logic  280              N/A        523    Interlocked
  11/34   Horizontal    180/240          48 x 512   231    Interlocked
  11/40   Horizontal    140/200/300      56 x 256   417    Overlapped
  11/45   Horizontal    150              64 x 256   696    Overlapped
  11/60   Horizontal    170              48 x 2560  648    Interlocked

Structure.

  Archetypal medium-range PDP-11 datapath.
  * Registers.
    + Instruction register (IR.)
    + Bus address register (BA.)
    + Register file.
    + B operand register.
    + Processor status register (low 8 bits.)
  * Multiplexers.
    + A operand select.
      - Inputs: Constants, register file, processor status.
      - Output: ALU A operand input.
    + B operand select.
      - Inputs: B operand register and constants.
      - Output: ALU B operand input.
    + ALU result & bus data select.
    + Condition code select.
  * Feedback from ALU result MUX to IR, BA, file, B reg, CC MUX.

  Archetypal microprogrammed control unit.
  * Control store sends uword to uword register.
  * Feedback from CS to BUT logic and next uaddr OR gates.
    (BUT = branch on microtest.)
  * Next addr OR gates feed uaddr register to select next uword.
  * BUT logic accepts state information from datapaths.

  LSI-11 variations.
  * Variations driven by technology. (Small number of I/O pins and
    silicon.)
  * Grossly simplified datapath.
    - 8-bit registers, ALU and transfers.
    - Multiplexed external bus address and data signals.

  PDP-11/45 variations.
  * Variations driven by need for high performance.
  * More complicated datapath for higher degree of parallelism.
    - Two copies of register data to eliminate shuffles.
    - Fastbus data inputs and outputs. (Fastbus is a synchronous
      processor-memory bus for semiconductor RAM.)
    - Program counter is not maintained in register files.

Performance improvements.

  * Memory.
    + Decrease number of memory read operations per instruction.
      Of limited usefulness because number may be fixed by architecture.
    + Decrease effective memory read pause time.
      - Faster memory components.
      - Cache memory.
  * Processor.
    + Decrease number of processor cycles per function.
      - Structure datapaths for maximum parallelism.
      - Design micro-instruction and branch logic to take best
        advantage of instruction features.
      - Cut effective microcycle count by overlapping processor and
        UNIBUS operation.
    + Decrease microcycle time.
      - Make datapaths faster (with faster technology components.)
      - Make each cycle only as long as necessary (multiphase clock.)

Copyright (c) 1986 Paul J. Drongowski
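
Worked sketch (not part of the original notes). The four levers listed under "Performance improvements" can be read as terms of a rough timing model: instruction time is approximately (effective microcycles x microcycle time) + (memory reads x effective read pause). The Python below is a minimal illustration of that reading only. The model itself, the names Processor, effective_read_pause_ns and instruction_time_ns, and every cycle count, read count, hit ratio and pause time in the demonstration are assumptions chosen for illustration; only the microcycle times (260 ns for the 11/04, 300 ns for the 11/10) and the Interlocked/Overlapped labels come from the control table.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    microcycle_ns: float  # microcycle time, from the control table
    overlapped: bool      # True if processor and UNIBUS operation overlap


def effective_read_pause_ns(main_ns, cache_ns=None, hit_ratio=0.0):
    """Average memory read pause; a cache lowers it in proportion to hit ratio."""
    if cache_ns is None:
        return main_ns
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * main_ns


def instruction_time_ns(proc, microcycles, mem_reads, read_pause_ns,
                        overlappable=0):
    """Rough instruction time under the assumed additive model.

    On an overlapped design, up to `overlappable` microcycles are hidden
    behind bus activity, which cuts the effective microcycle count.
    An interlocked design hides none.
    """
    hidden = min(overlappable, microcycles) if proc.overlapped else 0
    return (microcycles - hidden) * proc.microcycle_ns + mem_reads * read_pause_ns


if __name__ == "__main__":
    # Hypothetical workload: 10 microcycles and 2 memory reads per instruction.
    p04 = Processor("11/04", 260.0, overlapped=False)
    p10 = Processor("11/10", 300.0, overlapped=True)
    pause = effective_read_pause_ns(500.0)  # assumed main-memory pause
    cached = effective_read_pause_ns(500.0, cache_ns=100.0, hit_ratio=0.9)
    for proc in (p04, p10):
        base = instruction_time_ns(proc, 10, 2, pause, overlappable=2)
        fast = instruction_time_ns(proc, 10, 2, cached, overlappable=2)
        print(f"{proc.name}: {base:.0f} ns without cache, {fast:.0f} ns with cache")
```

Each lever maps to one argument: fewer memory reads lowers mem_reads, faster memory or a cache lowers read_pause_ns, better datapath and microcode structure lowers microcycles, overlap raises the hidden count, and faster components or a multiphase clock lowers microcycle_ns.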