CDC 6600
==Instruction-set architecture of CP==
The basis for the 6600 CPU is what would later be called a [[Reduced instruction set computer|RISC]] system,{{disputed-inline|Instruction-set architecture|for=variable length instructions|date=December 2023}} one in which the processor is tuned to do instructions that are comparatively simple and have limited and well-defined access to memory. The philosophy of many other machines was toward using complicated instructions: for example, a single instruction which would fetch an operand from memory and add it to a value in a register. In the 6600, loading the value from memory would require one instruction, and adding it would require a second one. Although this was slower in theory because of the additional memory accesses, well-scheduled code could keep multiple instructions processing in parallel, which offset this expense. This simplification also forced programmers to be very aware of their memory accesses, and therefore to code deliberately to reduce them as much as possible.{{cn|date=August 2022}} The CDC 6600 CP, being a three-address machine, allows for the specification of all three operands.<ref>{{citation |title=Computer Architecture and Organization |last=Hayes |first=John.P |isbn=0-07-027363-4 |date=1978 |page=163|publisher=McGraw-Hill }}</ref>

===Models===
The [[CDC 6000 series]] included four basic models, the [[CDC 6000 series#Versions|CDC 6400]], the [[CDC 6000 series#Versions|CDC 6500]], the CDC 6600, and the CDC 6700.{{When|date=May 2022}} The models of the 6000 series differed only in their CPUs, which were of two kinds, the 6400 CPU and the 6600 CPU. The 6400 CPU had a unified arithmetic unit, rather than discrete [[Execution unit|''functional units'']]. As such, it could not overlap instructions' execution times.
For example, in a 6400 CPU, if an add instruction immediately followed a multiply instruction, the add instruction could not be started until the multiply instruction finished, so the net execution time of the two instructions would be the sum of their individual execution times. The 6600 CPU had multiple functional units which could operate simultaneously, ''i.e.'', "in [[Parallel computing|parallel]]", allowing the CPU to overlap instructions' execution times. For example, a 6600 CPU could begin executing an add instruction in the next CPU cycle following the beginning of a multiply instruction (assuming, of course, that the result of the multiply instruction was not an operand of the add instruction), so the net execution time of the two instructions would simply be the (longer) execution time of the multiply instruction. The 6600 CPU also had an ''instruction stack'', a kind of ''[[CPU cache#ICACHE|instruction cache]]'', which helped increase CPU throughput by reducing the amount of CPU idle time caused by waiting for memory to respond to instruction fetch requests. The two kinds of CPUs were instruction compatible, so that a program that ran on either of the kinds of CPUs would run the same way on the other kind but would run faster on the 6600 CPU. Indeed, all models of the 6000 series were fully inter-compatible. The CDC 6400 had one CPU (a 6400 CPU), the CDC 6500 had two CPUs (both 6400 CPUs), the CDC 6600 had one CPU (a 6600 CPU), and the CDC 6700 had two CPUs (one 6600 CPU and one 6400 CPU). ==={{anchor|60-bit floating point}}Central Processor (CP)=== {| class="infobox" style="font-size:88%;width:32em;" |- |+ CDC 6x00 registers |- | {| style="font-size:88%;" |- | style="width:10px; text-align:center;"| <sup>5</sup><sub>9</sub> | style="width:160px; text-align:center;"| . . . | style="width:10px; text-align:center;"| <sup>1</sup><sub>7</sub> | style="width:70px; text-align:center;"| . . . 
| style="width:10px; text-align:center;"| <sup>0</sup><sub>0</sub> | style="width:auto;" | ''(bit position)'' |- |colspan="6" | '''Operand registers''' ''(60 bits)'' |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X0 | style="width:auto; background:white; color:black;"| Register 0 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X1 (read) | style="width:auto; background:white; color:black;"| Register 1 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X2 (read) | style="width:auto; background:white; color:black;"| Register 2 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X3 (read) | style="width:auto; background:white; color:black;"| Register 3 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X4 (read) | style="width:auto; background:white; color:black;"| Register 4 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X5 (read) | style="width:auto; background:white; color:black;"| Register 5 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X6 (write) | style="width:auto; background:white; color:black;"| Register 6 |- style="background:silver;color:black" | style="text-align:center" colspan="5"| X7 (write) | style="width:auto; background:white; color:black;"| Register 7 |- |colspan="6" | '''Address registers''' ''(18 bits)'' |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A0 | style="width:auto; background:white; color:black;"| Address 0 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A1 (read address) | style="width:auto; background:white; color:black;"| Address 1 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | 
style="text-align:center" colspan="3"| A2 (read address) | style="width:auto; background:white; color:black;"| Address 2 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A3 (read address) | style="width:auto; background:white; color:black;"| Address 3 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A4 (read address) | style="width:auto; background:white; color:black;"| Address 4 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A5 (read address) | style="width:auto; background:white; color:black;"| Address 5 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A6 (write address) | style="width:auto; background:white; color:black;"| Address 6 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| A7 (write address) | style="width:auto; background:white; color:black;"| Address 7 |- |colspan="6" | '''Increment registers''' ''(18 bits)'' |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B0 ''(all bits zero)'' | style="width:auto; background:white; color:black;"| Increment 0 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B1 | style="width:auto; background:white; color:black;"| Increment 1 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B2 | style="width:auto; background:white; color:black;"| Increment 2 |- style="background:silver;color:black" | 
style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B3 | style="width:auto; background:white; color:black;"| Increment 3 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B4 | style="width:auto; background:white; color:black;"| Increment 4 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B5 | style="width:auto; background:white; color:black;"| Increment 5 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B6 | style="width:auto; background:white; color:black;"| Increment 6 |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| B7 | style="width:auto; background:white; color:black;"| Increment 7 |- |colspan="6" | '''Program address''' ''(18 bits)'' |- style="background:silver;color:black" | style="text-align:center;background:#FFF" colspan="2"| | style="text-align:center" colspan="3"| P |} <!-- Missing Status / Condition Code flags --> |} The Central Processor (CP) and main memory of the 6400, 6500, and 6600 machines had a 60-bit word length. The Central Processor had eight general purpose [[60-bit computing|60-bit]] [[processor register|registers]] X0 through X7, eight [[18-bit computing|18-bit]] address registers A0 through A7, and eight 18-bit "increment" registers B0 through B7. B0 was held at zero permanently by the hardware. Many programmers found it useful to set B1 to 1, and similarly treat it as inviolate. The CP had no instructions for input and output, which are accomplished through Peripheral Processors (below). No opcodes were specifically dedicated to loading or storing memory; this occurred as a side effect of assignment to certain A registers. 
Setting A1 through A5 loaded the word at that address into X1 through X5 respectively; setting A6 or A7 stored a word from X6 or X7. No side effects were associated with A0. A separate hardware load/store unit, called the ''stunt box'', handled the actual data movement independently of the operation of the instruction stream, allowing other operations to complete while memory was being accessed, which required eight cycles, in the best case.

The 6600 CP included ten parallel functional units, allowing multiple instructions to be worked on at the same time. Today,{{Clarify timeframe|date=May 2022}} this is known as a [[superscalar processor]] design, but it was unique for its time. Unlike most modern CPU designs, functional units were not pipelined; the functional unit would become busy when an instruction was "issued" to it and would remain busy for the entire time required to execute that instruction. (By contrast, the CDC 7600 introduced pipelining into its functional units.) In the best case, an instruction could be issued to a functional unit every 100 ns clock cycle. The system read and decoded instructions from memory as fast as possible, generally faster than they could be completed, and fed them off to the units for processing. The units were:
* floating point multiply (two copies)
* floating point divide
* floating point add
* "long" integer add
* incrementers (two copies; performed memory load/store)
* shift
* Boolean logic
* branch

Floating-point operations were given pride of place in this [[Computer architecture|architecture]]: the CDC 6600 (and kin) stand virtually alone in being able to execute a 60-bit [[Floating-point arithmetic|floating point]] multiplication in time comparable to that for a program branch. A recent analysis by Mitch Alsup of James Thornton's book, "Design of a Computer", revealed that the 6600's Floating Point unit is a 2 stage pipelined design.
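The load and store side effects of the A registers described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class name <code>CentralProcessorSketch</code>, its method names, and the tiny memory size are hypothetical, and the model ignores timing, the stunt box, and RA/FL relocation.

```python
MASK18 = (1 << 18) - 1  # address and increment registers are 18 bits wide
MASK60 = (1 << 60) - 1  # operand registers and memory words are 60 bits wide


class CentralProcessorSketch:
    """Hypothetical model of the A/X/B register side effects (not CDC code)."""

    def __init__(self, memory_words=4096):
        self.mem = [0] * memory_words  # central memory, one int per 60-bit word
        self.X = [0] * 8               # operand registers X0..X7
        self.A = [0] * 8               # address registers A0..A7
        self.B = [0] * 8               # increment registers B0..B7

    def set_b(self, i, value):
        # B0 is held at zero by the hardware, so writes to it are discarded.
        if i != 0:
            self.B[i] = value & MASK18

    def set_a(self, i, address):
        address &= MASK18
        self.A[i] = address
        if 1 <= i <= 5:
            # Setting A1..A5 loads the addressed word into the matching X register.
            self.X[i] = self.mem[address]
        elif i in (6, 7):
            # Setting A6 or A7 stores the matching X register at that address.
            self.mem[address] = self.X[i] & MASK60
        # Setting A0 has no memory side effect.


cp = CentralProcessorSketch()
cp.mem[0o100] = 42
cp.set_a(1, 0o100)   # load: X1 now holds the word at address 100 (octal)
cp.X[6] = cp.X[1] + 1
cp.set_a(6, 0o200)   # store: the word at address 200 (octal) now holds X6
```

In this model a "load" or "store" is just an assignment to an A register, which mirrors the text's point that no opcodes were dedicated to memory access.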
Fixed point addition and subtraction of 60-bit numbers were handled in the Long Add Unit, using [[ones' complement]] for negative numbers. Fixed point multiply was done as a special case in the floating-point multiply unit: if the exponent was zero, the FP unit would do a single-precision 48-bit floating-point multiply and clear the high exponent part, resulting in a 48-bit integer result. Integer divide was performed by a macro, converting to and from floating point.<ref>{{Cite web |url=http://ed-thelen.org/comp-hist/CDC-6600-R-M.html#P3-21 |title=Archived copy |access-date=2005-06-13 |archive-url=https://web.archive.org/web/20140102194752/http://ed-thelen.org/comp-hist/CDC-6600-R-M.html#P3-21 |archive-date=2014-01-02 |url-status=dead }}</ref>

Previously executed instructions were saved in an eight-word [[CPU cache|cache]], called the "stack". In-stack jumps were quicker than out-of-stack jumps because no memory fetch was required. The stack was flushed by an unconditional jump instruction, so unconditional jumps at the ends of loops were conventionally written as conditional jumps that would always succeed.

The system used a 10 [[Hertz|MHz]] clock, with a [[clock signal#4-phase clock|four-phase signal]]. A floating-point multiplication took ten cycles, a division took 29, and the overall performance, taking into account memory delays and other issues, was about 3 [[FLOPS|MFLOPS]]. Using the best available compilers, late in the machine's history, [[Fortran|FORTRAN]] programs could expect to maintain about 0.5 MFLOPS.

===Memory organization===
User programs are restricted to use only a contiguous area of main memory. The portion of memory to which an executing program has access is controlled by the ''RA'' (Relative Address) and ''FL'' (Field Length) registers which are not accessible to the user program. When a user program tries to read or write a word in central memory at address ''a'', the processor will first verify that ''a'' is between 0 and FL-1.
If it is, the processor accesses the word in central memory at address RA+''a''. This process is known as [[Base and bounds|base-bound]] relocation; each user program sees core memory as a contiguous block of words with length FL, starting with address 0; in fact the program may be anywhere in the physical memory. Using this technique, each user program can be moved ("relocated") in main memory by the operating system, as long as the RA register reflects its position in memory. A user program which attempts to access memory outside the allowed range (that is, with an address which is not less than FL) will trigger an interrupt, and will be terminated by the operating system. When this happens, the operating system may create a [[core dump]] which records the contents of the program's memory and registers in a file, giving the developer of the program a means to find out what happened. Note the distinction with [[virtual memory]] systems; in this case, the entirety of a process's addressable space must be in core memory, must be contiguous, and its size cannot be larger than the real memory capacity.

All but the first seven [[CDC 6000 series]] machines could be configured with an optional Extended Core Storage (ECS) system. ECS was built from a different variety of core memory than was used in the central memory. This memory was slower, but cheap enough that it could be much larger. This was primarily because ECS memory was wired with only two wires per core (contrast with five for central memory). Because it performed very wide transfers, its sequential transfer rate was the same as that of the small core memory. A 6000 CPU could directly perform block memory transfers between a user's program (or operating system) and the ECS unit. Wide data paths were used, so this was a very fast operation. Memory bounds were maintained in a similar manner as central memory, with an RA/FL mechanism maintained by the operating system.
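The RA/FL check used for central memory can be written out as a short sketch. The function name <code>relocate</code> and the exception <code>AddressRangeError</code> are hypothetical names chosen for illustration; on the real machine an out-of-range reference raises an interrupt handled by the operating system rather than a software exception.

```python
class AddressRangeError(Exception):
    """Stands in for the interrupt raised on an out-of-range reference."""


def relocate(a, ra, fl):
    """Map a program-relative address a to an absolute address.

    ra is the Relative Address (base) and fl is the Field Length (bound).
    Valid program addresses are 0 .. fl - 1.
    """
    if not 0 <= a < fl:
        raise AddressRangeError(f"address {a:o} (octal) is outside field length {fl:o}")
    return ra + a


# A program loaded at RA = 40000 (octal) with FL = 10000 (octal):
first = relocate(0, ra=0o40000, fl=0o10000)        # absolute address 40000 (octal)
last = relocate(0o7777, ra=0o40000, fl=0o10000)    # absolute address 47777 (octal)
```

Moving the program only requires copying its words and changing <code>ra</code>; the addresses the program itself uses are unchanged.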
ECS could be used for a variety of purposes, including containing user data arrays that were too large for central memory, holding often-used files, swapping, and even as a communication path in a multi-mainframe complex.

===Peripheral Processors (PPs)===
To handle the "housekeeping" tasks, which in other designs were assigned to the CPU, Cray included ten other processors, based partly on his earlier{{When|date=May 2022}} computer, the CDC 160-A. These machines, called Peripheral Processors, or PPs, were full computers in their own right, but were tuned to performing [[Input/output|I/O]] tasks and running the operating system. (Substantial parts of the operating system ran on the PPs, thus leaving most of the power of the Central Processor available for user programs.) Only the PPs had access to the I/O [[Channel I/O|channels]]. One of the PPs (PP0) was in overall control of the machine, including control of the program running on the main CPU, while the others would be dedicated to various I/O tasks; PP9 was dedicated to the system console. When the CP program needed to perform an operating system function, it would put a request in a known location (''Reference Address'' + 1) monitored<ref>This description covers early versions of CDC software; later versions used the Central Exchange jump (XJ) instruction to reduce the overhead for functions that could be performed entirely in the CP.</ref> by PP0. If necessary, PP0 would assign another PP to load any necessary code and to handle the request. The PP would then clear RA+1 to inform the CP program that the task was complete.

The unique role of PP0 in controlling the machine was a potential single point of failure, in that a malfunction here could shut down the whole machine, even if the nine other PPs and the CPU were still functioning properly.
Cray fixed this in the design of the successor 7600, in which any of the PPs could be the controller, and the CPU could reassign any one to this role.{{When|date=May 2022}}

Each PP included its own memory of 4096 [[12-bit computing|12-bit]] words. This memory served both for I/O buffering and for program storage, but the execution units were shared by the ten PPs, in a configuration called the [[Barrel processor|Barrel and slot]]. This meant that the execution units (the "slot") would execute one instruction cycle from the first PP, then one instruction cycle from the second PP, etc. in a round robin fashion. This was done both to reduce costs, and because access to CP memory required 10 PP clock cycles: when a PP accesses CP memory, the data is available the next time the PP receives its slot time.

====Central Processor access====
In addition to a conventional instruction set, the PPs have several instructions specifically intended to communicate with the central processor.<ref name=6000Ref />{{rp|pp.4-24–4-27}}
* <code>CRD d</code> - transfers one 60-bit word from central memory at the address specified by the PP's ''A'' register to five consecutive PP words beginning at address ''d''.
* <code>CRM d,m</code> - similar to CRD, but transfers a block of words whose length was previously stored at location ''d'' into PP memory starting at PP address ''m''.
* <code>CWD d</code> - assembles five consecutive PP words beginning at location ''d'', and transfers them to the central memory location specified by register ''A''.
* <code>CWM d,m</code> - transfers a block starting at PP memory address ''m'' to central memory. The central memory address was stored in register ''A'', and the length was stored at location ''d'' prior to execution.
* <code>RPN</code> - transfers the contents of the central processor's program address register to the PP's ''A'' register.
* <code>EXN</code> - ''Exchange Jump'' transmits an address from the ''A'' register and tells the processor to perform an ''Exchange Jump'' using the address specified. The CP Exchange Jump interrupts the processor, loads its registers from the specified location and stores the previous contents at the same location. This performs a task switch.<ref name="6000Ref">{{cite book |last1=Control Data Corporation |title=Control Data® 6000 Series Computer Systems Reference Manual |date=1965 |url=http://bitsavers.org/pdf/cdc/Tom_Hunter_Scans/6000_Series_Computer_Systems_RefMan_Jul65.pdf |access-date=March 28, 2023}}</ref>{{rp|pp.3-9–3-10}}

===Wordlengths, characters===
The central processor has [[60-bit computing|60-bit]] words, while the peripheral processors have [[12-bit computing|12-bit]] words. CDC used the term "byte" to refer to 12-bit entities used by peripheral processors; characters are 6-bit, and central processor instructions are either 15 bits, or 30 bits with a signed 18-bit address field, the latter allowing for a directly addressable memory space of 128K words of central memory (converted to modern terms, with 8-bit bytes, this is just under 1 MB). The signed nature of the address registers limits an individual program to 128K words. (Later CDC 6000-compatible machines could have 256K or more words of central memory, budget permitting, but individual user programs were still limited to 128K words of CM.) Central processor instructions start on a word boundary when they are the target of a jump instruction or subroutine return jump instruction, so no-op instructions are sometimes required to fill out the last 15, 30 or 45 bits of a word. Experienced assembler programmers could fine-tune their programs by filling these ''no-op'' spaces with miscellaneous instructions that would be needed later in the program.
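The relationship between 60-bit central memory words and 12-bit PP words, which underlies the CRD and CWD transfers, can be sketched as follows, together with the "just under 1 MB" arithmetic. The function names are hypothetical, and the ordering (most significant 12 bits first) is an assumption made for this illustration rather than something stated in the text above.

```python
PP_WORD_BITS = 12
PP_WORD_MASK = (1 << PP_WORD_BITS) - 1  # 7777 octal


def disassemble_word(word):
    """Split one 60-bit word into five 12-bit PP words (high-order first, assumed)."""
    return [(word >> shift) & PP_WORD_MASK for shift in (48, 36, 24, 12, 0)]


def assemble_word(pp_words):
    """Combine five 12-bit PP words into one 60-bit word (inverse of the above)."""
    if len(pp_words) != 5:
        raise ValueError("exactly five 12-bit words make up one 60-bit word")
    word = 0
    for w in pp_words:
        word = (word << PP_WORD_BITS) | (w & PP_WORD_MASK)
    return word


parts = disassemble_word(0o00010002000300040005)
restored = assemble_word(parts)

# Address space arithmetic from the text: 128K words of 60 bits each.
WORDS = 128 * 1024
BYTES_EQUIVALENT = WORDS * 60 // 8  # 983,040 eight-bit bytes, just under 1 MB
```

Writing the constants in octal keeps each 12-bit group visible as four digits, which is also how CDC documentation conventionally presents word contents.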
The [[Six-bit character code|6-bit characters]], in an encoding called [[CDC display code]],<ref>The term "Display code" was associated with CDC much as "EBCDIC" was ''originally'' associated with IBM. Other terms used in the industry were BCD and SIXBIT (the latter being preferred by DEC).</ref><ref>{{cite web |url=http://rabbit.eng.miami.edu/info/decchars.html |title=DEC/PDP Character Codes}}</ref><ref>{{cite web |url=http://nemesis.lonestar.org/reference/telecom/codes/sixbit.html |title=SIXBIT Character Code Reference |access-date=2017-10-15 |archive-url=https://web.archive.org/web/20161124020239/http://nemesis.lonestar.org/reference/telecom/codes/sixbit.html |archive-date=2016-11-24 |url-status=dead }}</ref> could be used to store up to 10 characters in a word. They permitted a character set of 64 characters, which is enough for all upper case letters, digits, and some punctuation. It was certainly enough to write FORTRAN, or print financial or scientific reports. There were actually two variations of the CDC display code character sets in use: 64-character and 63-character. The 64-character set had the disadvantage that the ":" (colon) character would be ignored (interpreted as zero fill) if it were the last character in a word. A complementary variant, called [[CDC display code#6/12 display code|6/12 display code]], was also used in the [[CDC Kronos|Kronos]] and [[NOS (operating system)|NOS]] timesharing systems to allow full use of the [[ASCII]] character set in a manner somewhat compatible with older software.<ref>{{cite web |url=http://www.bitsavers.org/pdf/cdc/cyber/cyber_70/kronos/60407600B_KRONOS2.1ug_May74.pdf |title=CDC Kronos}}</ref>

With no byte addressing instructions at all, code had to be written to pack and shift characters into words. The very large words, and comparatively small amount of memory, meant that programmers would frequently economize on memory by packing data into words at the bit level.
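The packing of ten 6-bit characters into a 60-bit word described above can be sketched with shifts and masks. The helper names are hypothetical, and the character mapping used here (letters as codes 1 to 26, digits as codes 27 to 36) is a simplified assumption for illustration, not a complete display-code table.

```python
CHARS_PER_WORD = 10
CHAR_BITS = 6


def char_code(ch):
    """Illustrative 6-bit code: A-Z -> 1..26, 0-9 -> 27..36 (assumed mapping)."""
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 1
    if "0" <= ch <= "9":
        return ord(ch) - ord("0") + 27
    raise ValueError(f"no code defined in this sketch for {ch!r}")


def pack_word(text):
    """Pack up to ten characters into one 60-bit word, left-justified, zero-filled."""
    if len(text) > CHARS_PER_WORD:
        raise ValueError("at most ten 6-bit characters fit in a 60-bit word")
    word = 0
    for ch in text:
        word = (word << CHAR_BITS) | char_code(ch)
    # Shift the characters to the high-order end; unused positions stay zero.
    return word << (CHAR_BITS * (CHARS_PER_WORD - len(text)))


def unpack_codes(word):
    """Return the ten 6-bit codes in a word, leftmost character first."""
    return [(word >> (CHAR_BITS * (CHARS_PER_WORD - 1 - i))) & 0o77
            for i in range(CHARS_PER_WORD)]


codes = unpack_codes(pack_word("AB12"))
```

Because unused positions are filled with zero, a trailing character whose code is zero cannot be told apart from fill, which is the kind of ambiguity the text describes for the colon in the 64-character set.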
Due to the large word size, and with 10 characters per word, it was often faster to process a word's worth of characters at a time, rather than unpacking/processing/repacking them. For example, the CDC [[COBOL]] compiler was actually quite good at processing decimal fields using this technique. These sorts of techniques are now{{When|date=May 2022}} commonly used in the "multi-media" instructions of current processors.

===Physical design===
[[Image:CDCcordwood1.jpg|thumb|right|300px|A CDC 6600 [[Printed circuit board#Cordwood construction|cordwood logic module]] containing 64 silicon transistors. The coaxial connectors are test points. The module is cooled conductively via the front panel. The 6600 model contained nearly 6,000 such modules.<ref>Understanding Computers: Speed and Power 1990, p. 17.</ref>]]
The machine was built in a plus-sign-shaped cabinet with a pump and heat exchanger in the outermost {{convert|18|in|cm|abbr=on}} of each of the four arms. Cooling was done with [[Freon]] circulating within the machine and exchanging heat to an external chilled water supply. Each arm could hold four chassis, each about {{convert|8|in|cm|abbr=on}} thick, hinged near the center, and opening a bit like a book. The intersection of the "plus" was filled with cables that interconnected the chassis. The chassis were numbered from 1 (containing all 10 PPUs and their memories, as well as the 12 rather minimal I/O channels) to 16. The main memory for the CPU was spread over many of the chassis. In a system with only 64K words of main memory, one of the arms of the "plus" was omitted.

The logic of the machine was packaged into modules about {{convert|2.5|in|mm|abbr=on}} square and about {{convert|1|in|cm|abbr=on}} thick. Each module had a connector (30 pins, two vertical rows of 15) on one edge, and six test points on the opposite edge. The module was placed between two aluminum cold plates to remove heat.
The module consisted of two parallel printed circuit boards, with components mounted either on one of the boards or between the two boards. This provided a very dense package that was generally impossible to repair but had good heat transfer characteristics. It was known as [[Printed circuit board#Cordwood construction|cordwood construction]].