(NASACFZ-1S2/Ot
Total Page:16
File Type:pdf, Size:1020Kb
(NASACFZ-1s2/ot (1'AS-C-121~fPRELIMAINARY STUDY FOR A NUMERICAL AERODYNAMIC SIMULATION FACILITy. N78-19052 PHASE 1: EXTENSION Control Data Corp., St. Paul, Minn.) 434 p HC A19/MF A01 CSCL 01A Unclas G3/02 08630 PRELIMINARY STUDY FOR A NUMERICAL AERODYNAMIC SIMULATION FACILITY SUMMARY REPORT - PHASE 1 EXTENSION By: N. R. Lincoln FEBRUARY, 1978 Distribution of this report is provided in the interest of information exchang'e. Responsibility for the contents resides in the authors or organization that prepared it. Prepared under Contract No. NAS2-9457 by: CONTROL DATA CORPORATION Research and Advanced Design Laboratory 4290 Fernwood Street St. Paul, Minnesota 55112 for AMES RESEARCH CENTER NATIONAL AERONAUTICS AND SPACE ADMINISTRATION R EPVED ItA SI FACULWM SUMMARY REPORT - PHASE I EXTENSION Phase I of the NASF study which was completed in October 1977 produced several conclusions about the feasibility of construction of a flow model simulation facility. A computer structure was proposed for the Navier-Stokes Solver (NSS), now called the Flow Model Processor (FMP), along with technological and system approaches. Before such a system can enter an intensive design investigation phase several tasks must be accomplished to establish uniformity and control over the remaining design steps, as well as clarifying and amplifying certain portions of the conclusions drawn in Phase 1. In order of priority these were seen as: 1. Establishing a structure and format for documenting the design and implementation of the FMP facility. 2. Developing 'a complete, practically engineered design that would perform as claimed in the Phase 1 report. 3. Creating a design verification tool for NASA analysts, using a computerized simulation system. 4. Identifying key elements of the flow model three-dimensional codes to be used as metrics for verifying the progress of FMP design against a set of predefined system objectives. S. Developing a programming language specification for a proposed FMP language to be tested by Ames and RADL personnel. 6. Coding of the key elements in the experimental language. 7. Hand compilation of the encoded program segment. 8. Submission of the hand-compiled elements to the FMP simulator for timing and data flow analysis. 9. Documentation and "packaging" of the resulting simulator and input code to permit NASA personnel to continue experiments and analysis. 10. Development of sufficiently detailed functional descriptions as to permit NASA personnel to develop their own simulations where necessary. 11. Refinement of Reliability Analysis to include realistic estimates of component counts. These tasks were attacked with all of the Phase I personnel plus three additional designers and mathematicians. The major expenditure of time was in the revision and re-revision of the computer design and the production of a CPU Instruction Specification and Functional Specification which are included as appendices to the Final Report, and which were fundamental to all but the first task listed. An outline of the form and desired content of the basic specifications for software and hardware were produced as guidance and structure skeletons for future contract phases. The 3-D code analysis and language specification were not completed in this phase, as the language direction was changed several times in this admittedly preliminary study phase of the project. S-1 PROJECT STATUS AND CONCLUSIONS Most of the tasks undertaken in this extension phase were continuations of work initiated in Phase 1 of the NASF study. All of the tasks are expected to continue in some form as the full NASF architecture is defined in detail, and finally designed and implemented in subsequent phases of this project. The "best effort" engaged by Control Data Corporation has resulted in the following task status: 1. Hardware description a. Performance metrics - The three-dimensional implicit code was analyzed to determine if extrapolations from the two-dimensional code done in Phase I were still valid. It was found that the Ames guidance regarding the extrapolations were sound, and conclusions about the computational load invoked by the 3-1) code still hold from the Phase 1 report. No time was spent on the explicit code beyond determining the FMP instruction requirements for data dependent processing that differs in part from the implicit code behavior. The result was a decision to retain the APL vector operations of vector search and compare and the corresponding operations .of compress, mask and merge. Bit string operations on long strings were eliminated as an unnecessary complication to the hardware and supplanted by a scalar (non vector) form of handling bit string operations. The performance degradation due to this change was estimated at less than 1 percent for the entire code execution. Pertinent segments of the implicit code were identified for further study. The sequence of subroutines or subprocesses used in forming the tridiagonal matrices from sweeps in the three mesh directions, and the tridiagonal solver constitute the obvious key computa tional burden of the implicit code, while the memory accessing patterns for both Main Memory and Backing Store can be derived from the examination of the AMATRX, FILTRX, FILTRY, FILTRZ sequences. The conclusion is that for a "first-cut" validation of the FMP hardware architecture, a minimum of these subprocesses and the BTRI/ LUDEC (Block tridiagonal solver and LU decomposer for the solver) should be pro grammed, hand-compiled and simulated on the block-level simulator. In addition, the mathematical behavior of these sequences should be analyzed to determine if, in all practical cases, the arithmetic can be done in 32-bit mode to improve the throughput characteristics of this code in the 64/32-bit pipelines of the proposed FMP. A hand compilation of a segment of the beginning of the J sweep for the left hand-side solution was accomplished. The limited code generated was constrained by the time remaining after decisions were finalized on the FORTRAN extensions to be proposed. This region of the code is critical to the performance of the FMP since it exercises the memory system in the most inefficient manner possible in the implicit code. Data cannot be streamed directly into vector arithmetic operations, but first must be GATHERED from discontiguous columns of data in the original mesh. With the minimal resources remaining, it was decided that this sequence was the most interesting to hand-compile and block-model with the GPSS simulator. The sequence of instructions was submitted to the Version 1 FMP model, with the result that a floating-point rate of 933 megaflops was achieved under these worst-case conditions. The implication (which will be explored more thoroughly in the next study phase) is that, in fact, this small segment is truly the worst worst-case and thus the average computation rate will be well in excess of the one-gigaflop thresh hold sought by NASA. S-2 At the time of this writing, the hand compiling and block simulation have not been completed, but should be available at the time of submission of the final report. b. Functional design - Design of the originally proposed Control Data FMP was revised and a preliminary set of functional and instruction specifications was produced for the FMP. The major design changes involved two main thrusts: * Reduction of complexity of the Map Unit and Vector Unit by constrainingthe generality originally proposed for those units. * Improvements in reliability by increasing the amount of Single Error Correction Double Error Detection encoding throughout the Vector Units and associated buffers, the addition of more checking logic, and an additional pipeline for "instantaneous" swapping into the vector arithmetic ensemble as failures are detected. The technological risks of using a new circuit family in the FMP which were strongly emphasized in the Phase I report were re-examined and the decision was made to proceed on two concurrent paths in the development of the FMP. These were to pursue the new technological developments aggressively while at the same time, assessing the configurations and reiabilities of the FMP as if it were to be built out of existing ECL LSI as used in the STAR-100 program. It was decided to take an approach to the newer generation LSI that would not invalidate all of the architectural or block-level design, so that those tasks could proceed somewhat independently of the technology development. This decision thus permits Control Data to postpone the recommendation for logic family until much later in the design cycle, and thus delay consideration until more firm commitments can be made regarding the risks and schedules for a new logic family. It has been deter mined as a bottom line that an FMP can be built, and operated with acceptable reliability, from the existing ISI family being employed on the STAR-100A. The block-level simulation system required by Ames personnel to permit an ongoing verification of the design and the overall performance objectives is under development. A last minute decision was made to base the first, highest-level design model on the readily available General Purpose Simulation System (GPSS) so that it could be easily installed at any site performing FMP analysis. The first version of the simulator has been completed and is being delivered under separate cover to NASA Ames with the submission of this Final Report. The simulator provides a substantial amount of statistical data about the behavior of the various FM!P units (Swap, Map, Memory, Scalar, Vector) when executing code sequences. This data permits the designers to evaluate alternate strategies for organization and implementation of the major components of the FM. NASA analysts can use the data to verify that the internal characteristics of the FMP as documented in the Functional and Instruction Specifications are truly represented by the current version of the GPSS model.