First IAASS Conference "Space Safety, a New Beginning" 25 - 27 October 2005, Nice, France (ESA SP-599, December 2005)
IMPLEMENTATION OF PROGRAMMATIC QUALITY AND THE IMPACT ON SAFETY

Dale T. Huls, Kevin M. Meehan

National Aeronautics and Space Administration, Johnson Space Center, Code OE, 2101 NASA Parkway, Houston, Texas 77058

ABSTRACT

The implementation of an inadequate programmatic quality assurance discipline has the potential to adversely affect safety and mission success. This is demonstrated in the lessons provided by the Apollo 1, Apollo 13, Challenger, and Columbia accidents; NASA Safety and Mission Assurance (S&MA) benchmarking exchanges; and conclusions reached by the Shuttle Return-to-Flight Task Group established following the Columbia Shuttle accident. Examples from the ISS Program demonstrate continuing issues with programmatic quality. Failure to adequately address programmatic quality assurance issues has a real potential to lead to continued inefficiency, increases in program costs, and additional catastrophic accidents.

1. INTRODUCTION

1.1. Purpose

The purpose of this paper is to provide a generic perspective of how the implementation of an inadequate programmatic quality assurance discipline has the potential to adversely affect safety and mission success. It is also to demonstrate that while the NASA “culture” continues to focus on improving its safety processes and organizations following the Columbia tragedy, an equal level of effort and management focus on improving the NASA quality assurance processes is essential for ensuring the safety and mission success of future human space flight missions.

2. BACKGROUND

2.1. Quality of Products and Services

There are essentially four basic criteria for determining the success of a product or service:

a) Utility -- a product or service must perform as expected;
b) Reliability -- the product or service must be dependable when called upon;
c) Safety -- the product or service must be safe for use; and
d) Affordability -- the product or service must be cost-effective.

The “quality” of a product or service can therefore be best determined by how well it satisfies the above criteria. While quality is “achieved” by defining efficient and effective processes to design and fabricate parts or provide services, it is “assured” by verifying that those processes are adhered to and remain effective and efficient.

In the case of human space flight and, in particular, the ISS, quality assurance can be broken into two distinct areas. The first area is “quality control”, or the real-time verification that certain activities have been satisfactorily completed in compliance with requirements. The second area, which is the focus of this paper, is the “programmatic quality assurance function” (hereafter referred to as the QA function).

The QA function establishes the requirements and processes governing (a) the design and fabrication of systems (hardware and software); (b) the assembly and operation of those systems, both ground and on-orbit; (c) the identification, documentation, and resolution of deviations from requirements; and (d) the oversight function that assesses the overall effectiveness of those processes, adherence to requirements, and process improvement.

For most organizations that involve highly complex high-risk endeavors, organizational responsibility for the QA function is typically structured such that:

• Senior management establishes overall quality expectations, roles, and responsibilities;
• Quality is achieved and maintained by those assigned responsibility for performing the work; and
• Quality is verified, or “assured”, by those not directly responsible for performing the work.

2.2. ISS QA Organizational Structure and External Interrelationships

An example of NASA QA organizational structure is demonstrated by the ISS Program. NASA’s QA function for the ISS Program resides within the ISS Safety & Mission Assurance/Program Risk (ISS S&MA/PR) Office. Within the ISS S&MA/PR Office, an ISS QA Manager and limited support staff have been assigned responsibility for overall development and implementation of ISS quality requirements.

The majority of the staff responsible for performing the ISS S&MA tasks are “matrixed” (i.e., assigned to support) to the ISS Program from NASA institutional organizations. In the case of the ISS S&MA/PR Office, the majority of its safety, quality, and reliability support personnel are matrixed from the Johnson Space Center (JSC) S&MA ISS Directorate, as illustrated in Figure 1. Personnel from other NASA Centers performing ISS S&MA functions are matrixed in a similar fashion.

[Figure 1. ISS Programmatic QA Structure. Organization chart; boxes: NASA Administrator; Associate Administrator for Human Spaceflight; ISS Program Manager; NASA Center Director; ISS S&MA/PR Manager; Support Agreements; Center S&MA Manager; ISS QA Manager; QA Personnel; Center S&MA Personnel.]

The roles and responsibilities to be fulfilled by the matrix personnel on behalf of the ISS QA Manager, as well as the other staff matrixed to the ISS S&MA/PR Office, are documented in agreements established between the ISS S&MA/PR Office and the JSC S&MA Office.

In addition to being responsible for ensuring proper implementation of quality assurance requirements defined by the ISS S&MA/PR Office, the JSC S&MA QA personnel matrixed to the ISS QA Manager have also been delegated responsibility for developing most of the various processes intended to satisfy those requirements.

3. PROGRAMMATIC QA LESSONS LEARNED

Since JSC is home to the primary NASA human space flight programs, the JSC S&MA organizations that support the ISS Program also support the various other human space flight programs, such as the Space Shuttle Program and the Crew Exploration Vehicle (CEV) Project Office currently being established at JSC. Therefore, programmatic quality assurance and other “cultural” deficiencies in one program are often common to the other programs at JSC.

The most visible and still most relevant examples of the impact that deficient programmatic quality assurance processes can have on safety involve the four major accidents that NASA has experienced in its human space flight program -- the Apollo 1 fire, Apollo 13, Challenger, and Columbia -- and the failure of corrective actions implemented in response to the first three accidents to prevent future failures. From these examples, one can draw a direct correlation between poor quality implementation and a catastrophic incident. In addition, from the corrective actions implemented after each accident, one can see that the NASA “culture” continues to place an emphasis on “safety” over “quality” without fully comprehending that while “safety” is a technical discipline (e.g., development of hazard assessments; failure modes and effects analysis), similar to the engineering discipline, quality is an “assurance” discipline that requires a different set of skills and experience to adequately oversee the overall effectiveness of other relevant disciplines, such as engineering and safety.

While the Columbia tragedy is still recent and corrective actions and recurrence controls are still being implemented by NASA, there is evidence that NASA has once again failed to grasp the importance of programmatic quality assurance and its impact on safety and mission success, as documented in Appendix A.2 of the Final Report of the Return-to-Flight Task Group [1]. This report documents an independent assessment of NASA’s progress and effectiveness in resolving the findings identified by the Columbia Accident Investigation Board (CAIB) in its report following the Columbia accident, with Appendix A.2 documenting dissenting opinions about the effectiveness of those corrective actions and recurrence controls.

Another set of examples that provide a basis for improving NASA’s “culture” with respect to quality assurance is contained within the benchmarking assessments performed by NASA Headquarters’ S&MA organizations with other non-NASA government and corporate organizations.

The primary QA lessons still to be learned from each of the major accidents, the benchmarking studies, and the independent Return-to-Flight Task Group findings are summarized in the following subsections.

3.1. Quality Lessons Learned from NASA Accidents

Although “safety deficiencies” always seem to be the primary culprit blamed by the media in the aftermath of a major accident, a strong case can be made that it was the failure of quality processes and the lack of adherence to engineering, safety, and other processes that eventually led to major NASA human space flight accidents. For example, the Phillips Report [2] documenting the investigation of the fatal Apollo 1 fire cited quality problems as a major contributor to the accident. The report charged that principles and procedures of configuration management were not followed, poor workmanship was evidenced by continual high rates of rejection and rework, and that poor quality was evidenced by a large number

The Rogers Commission, which investigated the Challenger accident, pointed out in its report [4] that:

Quality Assurance is closely related to both safety and reliability. All NASA elements prepare plans and institute procedures to insure that high standards of quality are maintained. To accomplish that goal, elements charged with responsibility for quality assurance establish procedural controls, assess inspection programs, and participate in problem identification and reporting.