Mission Success First: Lessons Learned Overview

Thermal and Fluids Analysis Workshop TFAWS August 6, 2014 Cleveland, Ohio

Joe Nieberding Larry Ross

1 One Strike And You’re Out


3 “Mission Success First: Lessons Learned” Class Synopsis

What went wrong? How did it happen? Could it happen again? How can we avoid repeating the mistakes of the past? No one knows like the people who were there, and have the scars to prove it from personal involvement in space mission failures. The majority of aerospace mishaps can be traced to easily recognized, preventable root causes resulting from a lack of quality somewhere in the system. Most missions are lost to human error, not rocket science. Examining and understanding these causes for more than forty actual aerospace mission failures is critical to helping today’s designers of any highly complex systems, aerospace or otherwise, identify system specific lessons that must be learned. These lessons are not unique to programs or time. They apply across multiple aerospace and non-aerospace endeavors. The same mistakes are being made today that were made fifty years ago. Implementing specific strategies and project “Rules of Practice” early in a program is the best means of prevention. Recognizing why the lessons of the past were not learned is also a critically important step in solving the problem. The two day “Mission Success First: Lessons Learned” class is “words from the wise” aimed at further strengthening system quality standards by understanding why they broke down in the past, and what to do about it. This class is among NASA’s most highly acclaimed classes. The importance of the topic has been recognized by NASA and the United States Aerospace community through invitations to present this class more than sixty times in the United States over the past six years, as well as multiple times in Europe and Asia.

Joe Nieberding, President, AEA
Larry Ross, Chief Executive Officer, AEA

4 '        Presenters
Joe Nieberding:

After earning a B.S in in 1966 and an M.S. in Engineering Science in 1972, Mr. Nieberding has acquired over 45 years of management and technical experience in the aerospace industry. In his early career, he was a launch team member on over 65 NASA /Centaur and /Centaur launches at . He is a widely recognized expert in launch vehicles and advanced transportation architecture planning for space missions. Later, he led and participated in many independent program review teams for NASA Headquarters. Before retiring from NASA Glenn Research Center in 2000, under his direction the Advanced Space Analysis Office led all exploration advanced concept studies for Glenn, including transportation, propulsion, power, and communications systems for many advanced NASA mission applications. Since retirement, he has held numerous consulting positions for NASA and other government agencies. In addition, Mr. Nieberding is co-founder and President of Aerospace Engineering Associates, and co-author and presenter of a highly acclaimed class titled “Mission Success First: Lessons Learned”. He is the father of four children and a husband of 47 years. 5 '        Presenters (concluded) Larry Ross:

Mr. Ross has been a technical and management contributor in the aerospace industry for over forty eight years after having received a BS in electrical engineering from Manhattan College, Riverdale, New York City. His thirty-two year career at the NASA Lewis Research Center, now NASA Glenn, culminated in his assignment as Center Director from 1990-1994. Prior to that assignment he held the positions of Deputy Center Director, Director of Space, and Director of Launch Vehicles. Earlier in his career, he held various positions associated with engineering and program management of the Atlas/Centaur and Titan/Centaur Programs. He was chairman of the 178 Failure Review Board in 1986. Mr. Ross retired from NASA in 1995, and since that time has served as a senior consultant to NASA and other Government agencies, as well as to the commercial aerospace Industry. Mr. Ross is co-founder and CEO of Aerospace Engineering Associates. He is the father of four children and a husband of forty eight years.

6 '        Preface
• It’s vital for any enterprise to make mission success an overriding imperative
  – Failure can mean loss of the enterprise!
  – Second chance outcomes (if any) depend on successfully learning the lessons of the first attempt failure
• Since , NASA and the worldwide space community have had a success rate of about 90%
  – Increased to about 95% over the last 25 years
  – But even a 5% failure rate is unacceptable and can be improved
• An examination of space mission mishaps finds human error to be a dominant factor:
  – Its root causes are not unique to aerospace or to time
  – The same root causes are a threat in any endeavor
• We analyze a representative sample of 43 cases to develop specific actions that would have defeated the human error involved
  – These “Rules of Practice” address systemic root causes and have applicability far beyond the specific cases from which they are derived
  – And far beyond the aerospace business

These “Rules” emerge from “lessons learned the hard way” and will greatly help achieve Mission Success First!

7 '        Origin and Purpose of This Presentation
• Began as effort to prepare (now cancelled) Non-Advocate Review Team
  – One hour presentation – much data compiled but unused
  – Later expanded to one and then two day presentation for wider audience
• Purpose: to assist space system developers
  – Increase awareness of past mishaps and root causes
  – Help a new generation avoid the same pitfalls
• Includes broad lessons learned
  – Multiple programs
  – Overarching fundamental lessons (generic)
  – Many specific examples of mishaps or mission failures
• Observation: “root” causes not unique to times/programs
  – While some cases are from long ago, the relevance of the lessons is undiminished
  – Will be threats in any future development
• Includes references for all resource information
  – Websites, failure reports, interviews, and subject matter experts
• The “lessons” (yellow background charts) were either:
  – Developed independently by AEA based on analysis of the resource information, or
  – Extracted from the resource information

It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.
Mark Twain

8 '        Slide Box Color Key

Title

Text Slide for presentation

Lesson(s) Learned

(Supplementary Detail ) Slide not for presentation or (supplementary detail only)

Additional Lessons Detailed Text Learned

9 '        Outline DAY 1

• Introduction
• Historic Failures Milan Cathedral; Tay Rail Bridge; Hyatt Regency Hotel; Tacoma Narrows Bridge; R-16 ICBM Explosion
• Space Mission Record of Success (Abbreviated)
• Management Practices (Abbreviated)
  – Solid Rocket Booster Project
    • What worked and what didn’t
  – Stephenson Report
• The Culture of Testing (Abbreviated)
• Lessons from Past Missions

– Screening Out Design Errors Galileo; STS-51/TOS/ACTS; WIRE

– Screening Out Procedural Errors AC-21; TC-1; TC-6; GPS IIR-3; NOAA N Prime

– Impact of Weak Testing Practices Hubble; MPL; Genesis

– Systems Engineering Lapses F-1; Skylab; X-43A; CONTOUR

– Software Mishaps MCO, MGS

– Flawed Processes Apollo 13 Explosion; AC-43

– Information Flow Breakdown B-2A; AA 191

– Component Failure AC-24

10 '        Outline (cont’d)

DAY 2 • Lessons from Past Missions (cont’d)

– Experienced Teams make Mistakes AC-67; Apollo 1; AC-62; TK 1951

– Normalizing Deviance Challenger; Apollo 13 POGO; Columbia

– Missed Advanced Warnings Launch Availability; AC-33; Disneyland Monorail; Titan IVB-32/Milstar & AC-45

– Perils of Heritage Systems AC-5; 501

– Sabotage MDCA Microgravity Experiment

– Management Factors Have Lost Missions N-1; MO; Helios; AA 96; TK 981

• Summary of Causes for the Past Mission Failures (Abbreviated)
• The “Chain of Errors” Concept
  – The “Gimli Glider”
  – Loss of the X-31
• Two Lessons from a Different Perspective
  – From Space Station Freedom to the International Space Station
  – Flawed Failure Investigation of Atlas/Centaur 70

11 '        Outline (concluded)

DAY 2 (concluded)
• Common Cause Failures
• The Human Element
• Applying the Lessons: “Rules of Practice”
• Conclusions
• Appendix A: Presentation History to Date
• Appendix B: Glossary of Terms
• Appendix C: Case History Information Sources

12 '        Historic Failures

Case                                Event
The Milan Cathedral                 Wall collapse
The Tay Rail Bridge                 Bridge collapse – 75 fatalities
Kansas City Hyatt Regency Skyway    Skyway collapse – 114 fatalities
Tacoma Narrows Bridge               Bridge collapse
Russian R-16 ICBM                   Pad explosion – >120 fatalities

13 '        Historical Perspective: Prominent Failures from Across the Spectrum of Engineering Endeavors (cont’d)

Tacoma Narrows Bridge, Puget Sound, Washington
Opened 7/1/1940; destroyed 11/7/1940

[Figure: typical plate girder, from original drawings; video]

14 '        Space Mission Record of Success

Sources:
1. http://www.sciencepresse.qc.ca/clafleur/Spacecrafts-index.html
2. http://www.starsem.com/soyuz/log.htm
3. http://en.wikipedia.org/wiki/atlas_V
4. FAA, Commercial Space Transportation, Year in Review, 2009, 2010, and 2011
5. http://www.ulalaunch.com/site/default.shtml
6. http://www.arianespace.com/launch-services- /Soyuz-Users-Manual-March-2012.pdf

15 '        Total Number of Spacecraft Launched, 1957 - 2011

Sponsor*             Number       %
Russian                3595   50.5%
American               1857   26.1%
European                338    4.7%
Chinese                 169    2.4%
Japanese                135    1.8%
Indian                   63    0.9%
Canadian                 35    0.5%
Israeli                  14    0.2%
Other Government        156    2.2%
Commercial              622    8.7%
Amateur/Student         136    1.9%
Total                  7120    100%

*Sponsor means Spacecraft owner – not always the same as the entity launching it.

16 '        Worldwide Space Mission Success Rate by Decade, 1957 - 2012

[Figure: bar chart of the number of space missions per decade (1957-1966, 1967-1976, 1977-1986, 1987-1996, 1997-2006, 2007-2012), split into Success and Fail. Y-axis: Number of Space Missions; x-axis: Decade. Bar labels, in extraction order: 1462, 1401, 1270, 1042, 675, 601, 232, 165, 112, 117, 89, 32]

17 '        2012 Interim Update*

• Total launches – 78

– 74 Successes (including two partial successes by and Falcon 9)

– 4 failures (two Iran, one North Korea, one Proton)

• 5% failure rate, consistent with recent history

*A more complete update will be added when source databases are updated

18 '        Perspective

• Space system reliability has improved dramatically over a history of more than five decades
• Of 7,198 total space missions launched through 2012, 747 failed: 90% average rate of success
  – First decade: 72% success
  – Last 5 years: 96% success
• But in 2011, success rate dropped to 92%
  – Six rockets, and 11 payloads, were lost: 4 upper stage propulsion failures, one vernier engine failure, and one shroud jettison failure
  – Failures consistent with historic causes
• In 2012, success rate back up to 95%

• The material to follow places a magnifying glass on the relatively small fraction (but still too high!) of space mission mishaps

More is learned from failure than from success!
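
The success rates quoted on this slide follow directly from the mission counts. The short Python sketch below reproduces that arithmetic using only counts stated in these slides (7,198 missions with 747 failures through 2012; 78 launches with 4 failures in the 2012 interim update). The function name `success_rate` is illustrative and not part of the presentation.

```python
def success_rate(total_missions, failures):
    """Return the fraction of missions that succeeded."""
    if total_missions <= 0:
        raise ValueError("total_missions must be positive")
    if not 0 <= failures <= total_missions:
        raise ValueError("failures must be between 0 and total_missions")
    return (total_missions - failures) / total_missions


# Counts quoted on the slides.
overall = success_rate(7198, 747)     # all missions launched through 2012
interim_2012 = success_rate(78, 4)    # 2012 interim update

print(f"Overall success rate: {overall:.1%}")
print(f"2012 success rate:    {interim_2012:.1%}")
print(f"2012 failure rate:    {1 - interim_2012:.1%}")
```

Rounded to whole percentages, these give about 90% overall and about a 5% failure rate for 2012, in line with the figures on the slides.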

19 '        Management Practices

20 '        Background

• NASA in preparing for the challenges of the Constellation Program
  – What management lessons have we learned from large, tough programs?
  – Asked the managers of those programs:
    • What worked?
    • What didn’t work?
    • Other observations?

21 '        Historical Perspective on Management Practices that Work
• Received total commitment from Center management
  – Essential for success
  – Periodic “tough” reviews showed management’s involvement
  – Accountability out in the open
• Technical organization line management took responsibility for their people and their technical products
  – Assignment of accountability was visible and unambiguous
  – Avoided too many “teams” that can blur line management accountability
• People were the key to success – couldn’t succeed without dedicated, strong and technically proficient leadership (NASA and contractor)
  – Picked Project Managers who had distinguished themselves technically
  – Line and higher management engagement drove technical excellence of products
  – Highly disciplined, open technical reviews went to great levels of detail
  – Got the contractor involved early

22 '        Historical Perspective on Management Practices that Work (Cont’d)
• Quality communication environment and practices are lifeblood of successful projects
  – Co-location is a powerful tool for enabling communication and rapid response (and team building)
• Resident offices at the contractor’s plants are recommended to:
  – Improve communication and maintain speedy “high-fidelity” cognizance of key developments
  – Understand contractor issues and motivations
  – Exploit a training opportunity
• Margins are the enablers of risk management (budget, schedule, and performance)
• Without an early operations model driving the design, a “shoot-and-see” situation is inevitable

23 '        Mike Griffin: 2008 Aerospace Safety Advisory Panel Annual Report (emphasis added)

[ASAP] “If you could write the “top five” goals for the new administrator, what would be on the list?”

[Mike Griffin] “Insistence on top-level technical and program management talent, as demonstrated by a track record of performance in the space business, as a precondition for holding any significant management position at NASA. Far too often in the past, numerous significant leadership positions at NASA have been filled by people whose primary qualification for the job was their relationship with those in control of the selection process. Far too often in the past, such top-level jobs have been, literally, the very first job these individuals had ever held in the space business. We spent almost 15 years conducting an experiment at NASA, an experiment whose purpose seemed to be to demonstrate that it was possible for people without relevant domain expertise to manage a highly technical agency. It did not work. We should not repeat it.”

24 '        An Aside – Our Recommendation • When recruiting team expertise at the Project’s outset, make sure to consider these disciplines:

− Operations − Tribology − System Safety − Reliability − Instrumentation − Quality − Materials & Processes − Test

• Experts in these areas often arrive too late (frequently when trouble has already arisen) – Thus their ability to impact the “product” is greatly diminished – Sometimes with negative consequences

25 '        Example: A Product Improvement- More Nutritious Dog Food

• Underlying Issue: In response to consumer complaints, the dog food industry had been searching for a way to make their dog food more nutritious
• Problem: A major company invented a better formulation, but it unexpectedly resulted in widespread complaints of dogs becoming mysteriously ill
• Impact: Costly recall; loss of company prestige

Source: Subject matter expert consultant: Mr. Thomas Halliday

26 '        Dog Owners Noticed a Slow, But Steady, Deterioration

Video Initial state: old formula New formula: Something seems wrong!

27 '        Degradation Accelerates

Increasing anger Loss of motivation (and change of breed)

28 '        Until: End State Deterioration!

Video

29 '        Improved Dog Food

• Investigation: After intense investigation by company chemists and independent reviewers, the cause could not be found until the manufacturer of the bags was asked to get involved
• Resolution: The chemical bag liner was compatible with the old formula, but incompatible with the new formula, making the dogs sick
• If the bag manufacturer had been involved at the start of the reformulation, the costly recall would have been avoided

Get the right people involved from the start!

30 '        Historical Perspective on Management Practices that Work (Concluded)

• A strong systems integration function is critical throughout development and operations

• Test-test-test is the first choice method of verification

– Preferably with full scale hardware (especially for qualification)

– Always duplicating expected flight conditions

– Always with careful test planning – don’t do dumb tests!

– Challenge and justify if verification is by analysis only

31 '        “Space Shuttle SRB/RSRM Project – A Management Historical Perspective” – Observations Offered by Participants (concluded)


32 '        The Stephenson Report

• MCO Board produced second report following Mishap Investigation to:
  – Derive lessons from MCO and other missions
  – Create formula for future mission success
• Offers a new vision for NASA programs and projects
• Mission success is to become the highest priority
• Among its many excellent recommendations, the report observed that successful projects make testing a very high priority!

Report Available At: ftp://ftp.hq..gov/pub/pao/reports/2000/mco_mib_report.pdf

33 '        Report on Project Management in NASA (concluded)
Commentary on Better Faster Cheaper and Testing (emphasis added)

“As implementation of this strategy [Better Faster Cheaper] evolved, however, the focus on cost and schedule reduction increased risk beyond acceptable levels on some NASA projects. Even now, NASA may be operating on the edge of high, unacceptable risk on some projects.”

“The Board finds that implementation of the “Faster, Better, Cheaper” philosophy must be refined at this stage in a new context: Mission Success First.”

“This vision, Mission Success First, entails a new NASA culture and new methods for managing projects. To proceed with this culture shift, mission success must become the highest priority at all levels of the program/project and the institutional organization. All individuals should feel ownership and accountability, not only for their own work, but for the success of the entire mission.”

Regarding Successful Teams: “Catching errors early and correcting them is a high priority for these teams. During project planning, they advocate development of prototype versions and early testing to uncover design errors, especially for higher-risk components. They perform comprehensive unit testing and are intimately involved with systems integration testing. Their philosophy is, “Test, test and test some more.” Their motto is: “Know what you build. Test what you build. Test what you fly. Test like you fly.”

Mission Success First!

34 '        Conclusions

• Key management factors
  – Support from the top
  – Leaders with demonstrated relevant experience
  – A culture that values clear lines of responsibility
  – A culture with a set of bedrock principles (e.g. thorough testing)
• These factors comprise the framework within which a successful enterprise proceeds!
• They can be learned from past successes

35 '        The Culture of Testing

36 '        The Culture of Testing: Centaur Program Development Testing Atlas Centaur 10:1 Model in NASA Lewis Atlas Centaur Dynamic Testing Centaur Balanced Thrust H Vent Test Rig 2 10X10 Supersonic Wind Tunnel NASA Lewis – Plum Brook Station NASA Lewis – Cleveland Vent Fin Flow Studies

Various Centaur Fairing Testing Atlas Centaur Separation Test NASA Lewis – Cleveland NASA Lewis – Cleveland

Source: “Revolutionary Atmosphere – The Story of the Altitude Wind Tunnel and the Space Power Chambers”. Robert S. Arrighi, April 2010. NASA SP-2010-4319.

37 '        Video

Original Altitude Wind Tunnel (AWT)

B-29 Wright 3350 Engine in AWT

AWT Converted to Space Power Chambers 1 & 2

Source: “Revolutionary Atmosphere – The Story of the Altitude Wind Tunnel and the Space Power Chambers”. Robert S. Arrighi, April 2010. NASA SP-2010-4319.

38 '        The Culture of Testing (cont’d): Apollo Spacecraft System Development Testing

• Excerpts from “Apollo Spacecraft” – A paper by George M. Low*
  – “Major factors contributing to spacecraft reliability are simplicity and redundancy in design; major emphasis on tests; a disciplined system of change control; and closeout of all discrepancies.”
  – “The single most important factor leading to the high degree of reliability of the Apollo spacecraft was the tremendous depth and breadth of the test activity.”
  – “…let us look at only those tests involving complete spacecraft or boilerplates.”

FULL SCALE SPACECRAFT DEVELOPMENT & QUALIFICATION TESTING
Escape Motor Flight Tests                            7 Flights
Parachute Drop Tests                                 40 Drops
Command Module Land Impact Tests                     48 Tests
Command Module Water Impact Tests                    52 Tests
Command & Service Module Acoustic/Vibration Tests    15.5 Hours
Command & Service Module Modal Survey Testing        277.6 Hours
Command & Service Module Thermal Vacuum Tests        773 Hours
Service Module Propulsion System Tests               1474.5 Minutes

  – “Each of these tests taught us more about our spacecraft – their strengths and their weaknesses.”
  – “But most important of all, these tests gave us a tremendous amount of time and experience on the spacecraft and their systems.”

*May be found at www.klabs.org

39 '        One Small Step (Video)

40 '        The Culture of Testing (cont’d): Easy for you to say – “they” always cut it back for budget reasons!
• What to do?
  – Start off right
    • Define technically comprehensive test program up-front
    • Then consider backing off as you reasonably can in the clear of day
  – Maybe de-scope some testing by (for example)
    • Backing off on some requirements, e.g. performance
    • Providing more margins, or more redundancy
    • Judiciously using “test anchored” modern simulation techniques
    • Focusing on interfaces, or areas historically problematic
    • Using Probabilistic Risk Assessment (PRA) derived relative risks as a guide
    • Appropriately using heritage, qualified systems
    • Depending on judgment by experienced, senior engineers
    • Have the reduced plan vetted and defended by respected, seasoned veterans
    • Then, grind it into the program’s cost (with some margin) and fight to protect it from the buy-in salesmen

41 '        The Culture of Testing (cont’d):

− Once established, make it VERY, VERY DIFFICULT to change
• Manage any budget driven descoping as a technical issue
  − Engineering leadership (not management/financial staff) must decide if and/or what to descope
  − Recycle through descoping options, above
  − Basis will always be risk tolerance
  − Be prepared to draw a line in the sand!

There may come a point where good, courageous engineering judgment says: “Further weakening of the planned test program is just too risky. If we can’t justify the expense, we can’t afford to do this.”

42 '        The Culture of Testing (concluded)

LESSONS
• Successful programs have been anchored in testing
• Thorough, well-vetted test plan a must
• Testing must be the first choice method of verification
• But can’t always test as much as we want
• Challenge verifications not based on test
• Require IV&V for any mission critical verifications based, in any part, on analysis alone
• Nearly every test reveals something unexpected
• Test, Test, Test!
• Test like you fly and fly like you test

43 '        Lessons from Past Missions

Case Themes

1. Screening Out Design Errors
2. Screening Out Procedural Errors
3. The Impact of Weak Testing Practices
4. Systems Engineering Lapses
5. Software Mishaps
6. Flawed Processes
7. Information flow breakdown
8. Component Failure
9. Experienced Teams make Mistakes
10. Normalizing Deviance
11. Missed Advanced Warnings
12. The Perils of Heritage Systems
13. Sabotage
14. Management Issues Have Lost Missions

44 '        Screening Out Design Errors

Case               Event
Galileo            Antenna rib attachment lubrication design was compromised by handling/flight vibration leading to failure of High Gain Antenna to deploy
STS-51/TOS/ACTS    Superzip firing circuit design error caused Orbiter damage upon TOS separation system Super Zip firing
WIRE               Gate array protective circuit incorrectly implemented causing premature cryostat cover jettison and loss of primary instrument cryogen resulting in mission failure

45 '        A Quick Aside About Design Error “Screens”

GIVEN: Our design “machine” (humans) WILL produce errors at some >0 rate

[Figure: Design Error “Screens”; labels: Design Error; Design Review; Test; Unexpected Behavior]

“Engineers today, like Galileo three and a half centuries ago, are not superhuman. They make mistakes in their assumptions, in their calculations, in their conclusions. That they make mistakes is forgivable; that they catch them is imperative.” (1)

(1) “To Engineer is Human”; Henry Petroski, Vintage Books, 1992

46 '        STS-51 TOS/ACTS Separation Band Anomaly

46 '        STS-51 TOS/ACTS Separation Band Anomaly

• Underlying Issue: Design error made it to flight
• Problem: STS-51 Orbiter damaged by debris from ruptured separation joint upon ACTS/TOS deployment (9/12/93)
• Why: Improper design of Super Zip firing circuits.
• Impact: Damage to Orbiter
  – Nearly impacted flight/crew critical equipment:
    • One piece penetrated Orbiter aft bulkhead blanket and caused a 1/8 x 1/2 inch hole in bulkhead
  – Other debris caused:
    • At least 9 tears in cargo bay insulation blankets
    • 3 gouges in wire tray covers
    • Possibly a gouge in a TPS tile

47 '        STS-51 TOS/ACTS Separation Band Anomaly (cont’d)

48 '        STS-51 TOS/ACTS Separation Band Anomaly (cont’d)

[Figure: Super Zip firing circuit wiring, “Should Be”* versus “Was”*. Labels in each panel: Cord A; Cord B; Primary and Backup Detonator Blocks (TOS Cradle); Primary and Secondary Firing Switches, 28 VDC (Orbiter Standard Switch Panel)]

* Details courtesy of Dan Tani, NASA JSC Astronaut, who previously served as the ACTS/TOS Flight Operations Lead for the Orbital Sciences Corporation.

49 '        STS-51 TOS/ACTS Separation Band Anomaly (cont’d)
• The error:
  – Interface drawing terminology* caused Orbiter Standard Switch Panel to be wired such that:
    • Primary firing circuit switch was connected to “primary” detonator block and,
    • “Secondary” firing circuit switch was connected to “backup” detonator block
  – Interface requirement unspecified or overlooked: “Each Standard Switch Panel switch shall fire only one explosive cord.”
  – Improper Testing
    • Verified that system was built to print (it was – but the print faithfully reflected an improper design!)
    • Did not verify functionality with respect to the design intent
  – Phenomenon experienced was a well known vulnerability of the Super Zip system
  – Lack of single end-to-end schematic was a factor
• Good fortune: Orbiter systems unaffected and TOS/ACTS completed the mission!

*TOS used the terms “Primary” and “Backup” to refer to the two ends of the same explosive cord and not to distinguish between the first and second firing commands.

Source: http://www.nasa.gov/offices/oce/llis/0312.html; NASA Public Lessons Learned Information System, Lesson #0312 and input from Dan Tani of NASA JSC

50 '        STS-51 TOS/ACTS Separation Band Anomaly (concluded)

LESSONS:
• Be extremely vigilant when implementing any interface – historically, this is a “hot spot” for mistakes!
• Never implement an electrical interface based on nomenclature
• Verify compliance with design intent by examining the entire circuit
• Produce end-to-end (cross interface boundaries) schematics for all electrical systems
• Test procedures should be based on functional requirements whenever possible
  • Not just to verify “built to print”
• Observation: one price of redundancy is added complexity, and complexity invites mistakes
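
The interface requirement quoted in this case (“Each Standard Switch Panel switch shall fire only one explosive cord”) is the kind of functional rule that can be checked against an end-to-end wiring description instead of against nomenclature. The Python sketch below is an illustration only: the two wiring maps are hypothetical simplifications of the “Should Be” and “Was” configurations, assuming that “primary” and “backup” name the two ends of the same cord as the TOS terminology note states. They are not taken from the actual drawings, and the function names are invented for this example.

```python
def cords_fired_by(switch_wiring):
    """Return {switch: set of cords that the switch would fire}."""
    return {
        switch: {cord for cord, _end in detonator_ends}
        for switch, detonator_ends in switch_wiring.items()
    }


def requirement_violations(switch_wiring):
    """List switches that would not fire exactly one explosive cord."""
    return sorted(
        switch
        for switch, cords in cords_fired_by(switch_wiring).items()
        if len(cords) != 1
    )


# Hypothetical wiring maps: each switch lists the (cord, end) pairs it energizes.
design_intent = {
    "primary_switch": [("A", "primary"), ("A", "backup")],
    "secondary_switch": [("B", "primary"), ("B", "backup")],
}
wired_by_label = {
    # Assumed outcome of wiring by name: "primary" ends on one switch,
    # "backup" ends on the other, so each switch reaches both cords.
    "primary_switch": [("A", "primary"), ("B", "primary")],
    "secondary_switch": [("A", "backup"), ("B", "backup")],
}

print(requirement_violations(design_intent))
print(requirement_violations(wired_by_label))
```

Under these assumptions the design-intent map passes, while the label-based map flags both switches. A “built to print” check would not catch this, because the print itself carries the mistake.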

51 '        AC-34: Mariner Venus/Mercury - Mariner 10

• Launched successfully: November 3, 1973
• Underlying Issue: Error in implementing spacecraft interface to
• Problem: Spacecraft X-axis polarity reversed in launch vehicle ascent trajectory simulation
  – Caught by simple check 3 years before launch
• Potential impact if undetected: Damaged spacecraft instruments
• Actual impact: None
  – Mission successful
  – Ended March, 1975, after one Venus and three Mercury passes
  – First dual planet mission - 12,000 images
    • First clear pictures of Venusian clouds
    • First use of gravity assist
    • First mission to Mercury

[Figures: Venus in real color; Venus in ultraviolet; Mercury in real color; video]

Source: Subject matter expert: J. Nieberding, lead launch vehicle mission analyst

52 '        AC-34: Mariner Venus/Mercury - Mariner 10 (cont’d)

[Figure: simple spacecraft model. Pegs on model correspond to spacecraft instruments]

53 '        AC-34: Mariner Venus/Mercury - Mariner 10 (cont’d)
• Spacecraft instruments must not point toward sun
  – Centaur coast phase roll attitude so programmed
• Trajectory listing reviewed
  – Listing attitude was manually compared with expected attitude as determined through observation of model and globe relative geometry
  – Comparison revealed problem
    • Trajectory simulation code had the spacecraft X-axis polarity reversed!
  – If undetected, serious spacecraft damage likely

54 '        AC-34: Mariner Venus/Mercury - Mariner 10 (concluded)

LESSONS:
• Be extremely vigilant when implementing any interface (between systems, contractors, etc.) – historically, this is a “hot spot” for mistakes!
• Should be focus for testing
• Simple checks can be very effective
• If “It” doesn’t pass a simple test, “it” may be wrong
• Must understand the real world physics, not just the
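
The check that caught the AC-34 error compared the attitude in the trajectory listing with an attitude estimated independently from a physical model. A rough modern equivalent is sketched below in Python, under stated assumptions: the vectors, the 10 degree tolerance, and the function names are all hypothetical and chosen only to show how a reversed axis polarity stands out in a simple angle comparison. This is not the method used on the program.

```python
import math


def angle_between_deg(a, b):
    """Angle in degrees between two non-zero 3-vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


def axis_agrees(simulated_axis, expected_axis, tolerance_deg=10.0):
    """True if the simulated axis lies within tolerance of the expected one."""
    return angle_between_deg(simulated_axis, expected_axis) <= tolerance_deg


# Hypothetical values: expected X-axis direction from an independent estimate,
# and two simulated directions (one plausible, one with reversed polarity).
expected_x = (1.0, 0.0, 0.0)
simulated_ok = (0.99, 0.05, 0.0)
simulated_reversed = (-1.0, 0.0, 0.0)

print(axis_agrees(simulated_ok, expected_x))
print(axis_agrees(simulated_reversed, expected_x))
```

A polarity reversal shows up as an angle near 180 degrees, so even a coarse tolerance is enough to flag it. The value of the check comes from the expected direction being derived independently of the simulation code.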

55 '        Screening Out Procedural Errors

Case               Event
Atlas Centaur 21   Improper assembly of latch fitting led to failure of nose fairing to jettison
Titan Centaur 1    Foreign object jammed LOX boost pump leading to failure of Centaur engines to start
Titan Centaur 6    Separation of Titan Stage 2 oxidizer tank autogenous pressure inlet line baffle resulted in degraded engine performance
GPS IIR-3          Operations crew failed to follow procedure for protecting spacecraft leading to significant on-pad rain damage
NOAA N Prime       Breakdown of factory discipline caused severe spacecraft damage during procedure-challenged ground handling

56 '        The Challenge of Error-free Operations

Since manufacturing, assembly and integration are largely human based activities, error rates will be >0

[Figure labels: Observation; Inspection; Test; Performance]

57 '        Sources of Procedure Violation

• Complacency
• Excessive schedule pressure
• Culture that tolerates lack of discipline
• Poor working conditions
• Ignorance of error consequences
• Unclear or Outdated Documentation
• Absence of motivation to do things right
  – Lack of caring
  – Lack of pride
• Improper tools
• Weak supervision
• Lack of current training

Always do right. This will gratify some people and astonish the rest.
Mark Twain

58 '        Lack of Current Training – An Example

59 '        National Oceanic & Atmospheric Administration N Prime Satellite

• Underlying Issue: Procedure violation caused costly spacecraft damage • Problem: Spacecraft fell from turnover cart during processing (9/6/2003) • Impact: Spacecraft heavily damaged – launch delayed years • Why: Gross procedure violation – Turnover carts common (with adapter use) between DMSP and NOAA N – DMSP crew decided to use NOAA N configured cart • Began removal of NOAA adapter • Decided to use another cart after removing the 24 mounting bolts • Did not flag – NOAA N crew neglected required check of configuration 60 '        National Oceanic & Atmospheric Administration N Prime Satellite (cont’d)

• The review board found: – Violation of well established ground handling procedures • Failure to verify configuration integrity before using cart • Complacency from familiarity • Inspector bought off configuration check without inspection • Issue driven, not proactive, inspections – Blithe dismissal of technician’s query about open bolt holes – Failure of LM and the government to correct long standing factory discipline problems

Source: NOAA N-Prime Mishap Investigation, Final Report; http://klabs.org/richcontent/Reports/Failure_Reports/noaa/65776main_noaa_np_mishap.pdf

61 '        National Oceanic & Atmospheric Administration N Prime Satellite (concluded)

LESSONS:
• Failure of routine ground handling procedures can be very costly (not only in $ - NASA/company reputation is a valuable commodity!)
• Demand a culture of disciplined behavior around flight hardware
• Make teams alert to the need for careful interactions:
  • Lead persons should seek to understand why they are being asked certain questions
  • Junior people should try to be very explicit when they are questioning something
• Hold VP’s/managers/supervisors accountable for maintaining a safe and disciplined work environment
• Watch out for complacency with anything routine

62 '        Systems Engineering Lapses

Case Event

Atlas Centaur F-1: Flawed weather shield design resulted in structural failure under transonic buffet loads

Skylab: Orbital Workshop Micrometeoroid Shield structure failed due to burst pressure induced displacement of auxiliary tunnel during ascent

X-43A: Pegasus flight control system design inappropriate for flight profile flown leading to loss of attitude control

CONTOUR: Flawed design of solid rocket motor installation led to structural failure of spacecraft due to plume heating

63 '        Systems Engineering - General Thoughts

• Failure to practice effective systems engineering is the root cause in 22 of 43 cases analyzed
• Why?
  – Often due to a culture that doesn’t value it and/or
  – Assignment of too few experienced practitioners and/or
  – It’s a tough discipline to consistently do right! (especially when things are going well.)
• It is clear, however, that development programs will inevitably fall into serious trouble without a competent systems engineering function
• “A necessary condition for mission success in all spaceflight programs is a robust, experienced systems engineering team and well thought-out systems engineering processes.”*

* Report on Project Management in NASA by the Mars Climate Orbiter Mishap Investigation Board, March 13, 2000.

64 '        If Systems Engineering Didn’t Harmonize The Disciplines

[Figure: The Ideal Airplane As Seen By The Various Engineering Disciplines. Panels: Mechanical Controls Group, Maintenance Group, Avionics Group, Weights Group, Armament Group, Stress Group, Computer Aided Design Group, Hydraulics Group, Wing Group, Production Engineering Group, Equipment Group, Fuselage Group, Power Plant Group, Aerodynamics Group, Empennage Group]

65 '        CONTOUR

• Underlying Issue: Erroneous prediction of spacecraft thermal environment
• Problem: Spacecraft broke up following SRM firing (8/15/2002)
• Impact: Loss of mission

66 '        CONTOUR (cont’d)
• Why: Spacecraft overheating caused by improper installation of a “heritage” SRM
  – Inadequate systems engineering process
  – Inappropriate reliance on analysis by similarity
  – Inadequate review function
  – Dubious decision to omit telemetry coverage of motor firing event
  – Inadequate oversight, insight, and review of subcontractors
  – Inadequate communications between APL and ATK
  – ATK models not specific to CONTOUR
  – Limited understanding of the SRM plume heating environments in space
  – Limited understanding of CONTOUR SRM operating conditions

Source: Contour Mishap Investigation Board Report, May 31, 2003; http://klabs.org/richcontent/Reports/Failure_Reports/contour/contour.pdf

67 '        CONTOUR (concluded)

LESSONS:
• Heritage designs must be re-qualified for new applications
• Systems engineering is absolutely vital to mission success – in this case it should have:
  • Challenged the flawed heritage assumption
  • Objected to the use of invalid models
  • Insisted on a more complete understanding of SRM plume heating
• Involve subcontractors early in the design process
  • They need to understand and “buy in” to how their product is integrated

68 '        Meteor Crater

Visitor Center

Crater

69 '        Flawed Processes

Case Event

Apollo 13 Explosion: Service Module LOX tank exploded due to damaged internal wiring

Atlas Centaur 43: Improper processing of Atlas booster engine hot gas ducting caused leak and in-flight explosion

70 '        Atlas Centaur A/C-43
• Underlying Issue: Third tier vendor’s processing error caused loss of mission
• Problem: Vehicle destroyed by Range Safety (9/29/1977)
• Impact: Loss of Intelsat IVA mission
• Why: Explosion in Atlas engine compartment
  – Atlas engine hot gas leak
  – Hot gas plumbing joint improperly brazed (carburized) at third tier vendor
    • Resulted in corrosion-induced structural failure
    • Root cause only found after water recovery of hardware

Source: Atlas/Centaur AC-43 Failure Investigation Final Report, December 1977, Report No. CASD/LVP 77-093, Contract NAS3-19154, General Dynamics, Convair Division (NASA GRC Archives)

71 '        Atlas Centaur A/C-43 (cont’d) - Video

72 '        Atlas Centaur A/C-43 Booster Hot Gas System

Source: Atlas Centaur AC-43 Failure Investigation Final Report, December 1977, Report No. CASD/LVP 77-093, Contract NAS3-19154, General Dynamics, Convair Division (NASA GRC Archives)

73 '        Atlas Centaur A/C-43 Recovered Hardware Booster Hot Gas System Manifold and “” Section

Source: Atlas/Centaur AC-43 Failure Investigation Final Report, December 1977, Report No. CASD/LVP 77-093, Contract NAS3-19154, General Dynamics, Convair Division (NASA GRC Archives)

74 '        Atlas Centaur A/C-43 (cont’d)

LESSONS:
• Pay attention to Materials and Processes vulnerabilities at all tiers
  • Identify vulnerabilities during design
  • Get the experts involved
  • Manage the vulnerabilities by taking such steps as
    • Destructive examination of “witness” hardware
    • Random analysis of flight hardware
    • Plant audits
• Keen attention is required when suppliers/processors change

75 '        Information Flow Breakdown

Case Event

B-2A: Improper calibration of air data sensors led to loss of aircraft

AA 191: Use of inappropriate procedure caused engine separation and loss of aircraft

76 '        B-2A Crash

VIDEO

• Underlying Issue: Critical maintenance information not communicated
• Problem: Loss of control following takeoff rotation (2/23/2008)
• Impact: Loss of B-2A aircraft ($1.4B)
• Why: Improper maintenance of aircraft’s air data system

Source: http://www.acc.af.mil/shared/media/document/AFD-080605-054.pdf; Summary of Facts, B-2A S/N 89-0127, 20080223KSZL501A, Floyd L. Carpenter, Maj General, USAF, President, Accident Investigation Board

77 '        B-2A Crash (cont’d)

• Background: The B-2A Flight Control System
  – Computes altitude, airspeed, Angle of Attack (AOA), and Angle of Sideslip
  – 24 Port Transducer Units (PTU’s) provide input data
  – Upon power-up, each PTU output compared with the average for all 24
    • Calibration required if deviation is out of specification
    • Calibration biases any deviating PTU into specification
• Guam deployments entailed unusual number of deviations
  – Found to be caused by high humidity
    • Corrective action was to activate PTU heaters during the PTU check
    • Corrective action never documented or added to “Lessons Learned” log

78 '        B-2A Crash (cont’d)

• Pre-Flight of the “Spirit of Kansas”
  – “Air Data Calibration Required” message appears as Flight Control System (FCS) powered up
    • Outputs of 3 PTU’s deviate from specification (due to moisture)
  – Responding technician unaware of the undocumented PTU heaters-on procedure
    • Performs calibration procedure without PTU heaters on
    • 3 PTU’s are biased to null out the moisture induced deviations

79 '        B-2A Crash (cont’d)

• Flight of the “Spirit of Kansas”
  – Pilot activates PTU heaters (normal pre-flight procedure) and begins takeoff
  – 3 moist PTU’s dry out
    • Still biased outputs now deviate significantly from corrected value causing:
      – Airspeed indication 12 knots greater than actual
      – Angle of Attack indication of -8 deg
  – Pilot rotates at 12 knots below correct rotation speed
  – FCS reacts to false -8 deg AOA by commanding full nose up
    • Aircraft AOA goes to 30 degrees nose up
    • Low airspeed and high AOA cause stall and loss of control
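The calibration trap described on the preceding B-2A slides can be illustrated with a toy model. This is only a sketch, not the actual Flight Control System logic: the specification limit, the units, and the size of the moisture error are hypothetical values chosen for illustration. It assumes only what the slides state, namely that each PTU is compared with the 24-unit average and that calibration biases a deviating unit back into specification.

```python
from statistics import mean

SPEC_LIMIT = 0.5      # hypothetical allowable deviation from the 24-PTU average
TRUE_VALUE = 100.0    # hypothetical true reading, arbitrary units
MOISTURE_ERROR = 3.0  # hypothetical error added by moisture in a PTU


def calibration_biases(outputs, limit=SPEC_LIMIT):
    """Bias each out-of-spec PTU onto the fleet average; leave the rest alone."""
    avg = mean(outputs)
    return [avg - out if abs(out - avg) > limit else 0.0 for out in outputs]


# On the ground: 3 PTUs read high because of moisture, 21 read correctly.
wet = [TRUE_VALUE + MOISTURE_ERROR] * 3 + [TRUE_VALUE] * 21
biases = calibration_biases(wet)
calibrated_wet = [out + bias for out, bias in zip(wet, biases)]

# During takeoff: heaters dry the PTUs, but the stored biases remain applied.
dry = [TRUE_VALUE] * 24
after_dry = [out + bias for out, bias in zip(dry, biases)]
errors_after_dry = [value - TRUE_VALUE for value in after_dry]

print("biased PTUs:", sum(1 for b in biases if b != 0.0))
print("error of a biased PTU once dry:", errors_after_dry[0])
```

With these assumed numbers the calibration passes on the ground, yet the three biased units read 2.625 units low once they dry out, even though every sensor is healthy.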

80 '        B-2A Crash (concluded)

LESSONS:
• Flow of information within and among organizations is essential for safety
• Procedure revisions must be formalized and documented
• Operations personnel need to possess a thorough knowledge of their systems
  • Here key personnel believed that the FCS calibration was merely a barometric pressure correction
  • They were unaware of the potential to cause catastrophic errors!

81 '        Experienced Teams Make Mistakes

Case Event

Atlas Centaur 67: Loss of vehicle control following lightning strike during ascent

Apollo 1: Fire in Command Module due to ignition of combustible materials – ignition source unidentified

Atlas Centaur 62: Loss of Centaur attitude control due to leak in LOX tank

TK 1951: Crew did not adequately monitor primary flight instruments and mis-performed the recovery from approach-to-stall procedure – aircraft crashed

82 '        Atlas Centaur AC-67

• Underlying Issue: An extremely experienced launch team made a serious error in judgment
• Problem: Vehicle broke up after lightning strike during ascent (3/26/87)
• Impact: Loss of FleetSatCom spacecraft

In certain times of stress and dire circumstance profanity provides a relief denied even unto prayer. Mark Twain

83 '        Simplified Physics of Triggered Lightning

VIDEO

• Lightning can be triggered when an aerospace vehicle with a conductive surface and an ionized exhaust plume distorts the electric field equipotential lines, thus increasing the potential gradient at the top of the vehicle and below the exhaust plume.

Source: NASA Analysis of Apollo 12 Lightning Incident; Feb 1970

84 '        The Mid-Level Cloud Rule*

• Rule: “The flight path of the vehicle should not be through mid-level clouds 6,000 feet or greater in depth, when the freezing level is in the clouds.”
  – This rule, in the launch commit criteria as written on launch day:
    • Had no title or rationale
    • Was not identified as related to detection of a hazard for triggered lightning
  – Today’s version of the rule is in a section titled “Lightning”
• Mishap investigation board findings:
  – “In pre-launch discussions on Channel 20 (Launch Director and Project Coordination Loop) both launch and weather team personnel appeared to believe that the constraint was an icing, rather than an electrical concern, which was discounted after two aircraft in the area reported no visible icing.”
  – “There was no convincing evidence that one of the criteria used to avoid potential electrical hazards (the mid–level cloud rule) was met; no waiver was processed”
• The launch team believed that the constraint was met
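One way to picture a launch commit constraint that carries its own title, hazard, and rationale is sketched below. The class names, field names, and the GO/NO-GO strings are hypothetical and are not taken from any actual launch commit criteria document. Treating missing data as NO-GO is also an assumption, chosen to reflect the finding that there was no positive indication of compliance.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CLOUD_DEPTH_FT = 6000.0  # threshold quoted in the rule on the slide


@dataclass(frozen=True)
class LaunchConstraint:
    """Hypothetical record that keeps the rule together with its rationale."""
    title: str
    hazard: str
    rationale: str


@dataclass(frozen=True)
class CloudObservation:
    """Hypothetical weather input; None means 'not positively determined'."""
    depth_ft: Optional[float]
    freezing_level_in_cloud: Optional[bool]


MID_LEVEL_CLOUD_RULE = LaunchConstraint(
    title="Mid-level cloud rule",
    hazard="Triggered lightning",
    rationale="One of the criteria used to avoid potential electrical hazards; "
              "it is not an icing constraint.",
)


def evaluate(obs: CloudObservation) -> str:
    """Return 'GO' only on positive evidence that the rule is satisfied."""
    if obs.depth_ft is None or obs.freezing_level_in_cloud is None:
        return "NO-GO"  # assumption: unknown compliance is treated as a violation
    violated = obs.depth_ft >= MAX_CLOUD_DEPTH_FT and obs.freezing_level_in_cloud
    return "NO-GO" if violated else "GO"


print(MID_LEVEL_CLOUD_RULE.title, "guards against:", MID_LEVEL_CLOUD_RULE.hazard)
print(evaluate(CloudObservation(depth_ft=7000.0, freezing_level_in_cloud=True)))
print(evaluate(CloudObservation(depth_ft=None, freezing_level_in_cloud=True)))
```

Keeping the hazard and rationale next to the check makes it harder to mistake an electrical constraint for an icing one.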

*Source: Atlas Centaur (AC-67) Lightning Strike Mishap 1987; Leadership VITS Meeting, March 5, 2007; Bryan O’Connor, Chief, Safety and Mission Assurance

85 '        Atlas Centaur AC-67 (cont’d)
• What happened:
  – Vehicle ascent into charged atmosphere triggered discharge
    • Caused bit flip in flight computer
    • Loss of control leading to excessive aero loading
    • Flight computer recovered in shallow water
  – Cable shields terminated in box – increases vulnerability to transients
  – Inadequate weather monitoring system
    • USAF weather gave “Go for weather - no constraints violated”
      – Four times during the count (as late as T-60 seconds)
    • Weather balloon data misinterpreted by USAF weather personnel
      – No positive indication of compliance with mid-level cloud rule
    • “Unofficial” instrumentation indicated potential for lightning
    • USAF role and authority in launch commit somewhat ambiguous
  – Atlas/Centaur launched successfully in the rain several times before
    • Rain alone was not a constraint
  – Bad judgment by an experienced launch team
    • Should have erred on the side of caution
    • “Launch Fever” a factor

Source: http://nsc.nasa.gov/SFCS/Index/SortBydate/Descending/Page6; Atlas Centaur (AC-67) Lightning Strike Mishap 1987; Leadership VITS Meeting, March 5, 2007, Bryan O’Connor

86 '        Learning the Hard Way

Vehicle: Apollo 12
Launch Date: Nov 14, 1969
Conditions: Cold Front, Charged Atmosphere, Light Rain
Incident: Vehicle Triggered Lightning at T+36.5 & T+52s
Outcome: Minor Instrumentation Loss

Vehicle: Atlas Centaur 38
Launch Date: May 13, 1976
Conditions: Charged Atmosphere
Incident: Static Electrification Corona Discharge at T+48s
Outcome: Minor Data Loss

Vehicle: Atlas Centaur 67
Launch Date: March 26, 1987
Conditions: Cold Front, Charged Atmosphere, Heavy Rain
Incident: Vehicle Triggered Lightning at T+48s
Outcome: Loss of Vehicle

LESSONS:
• Reaction to any warning sign must always be:
  • Formally managed, thorough and made “official”
  • Effective over time (impervious to staff turnover)
  • Communicated across programs

87 '        Atlas Centaur AC-67 (concluded)

LESSONS (concluded):
• Launch commit constraints need to be explicitly defined and include the underlying rationale
  • Make sure all key personnel understand them
• Validate readiness of mission critical supporting organizations
  • Even when these are part of another agency, e.g. USAF
• Monitor mission critical operations with experienced and objective (dispassionate) personnel
• Define unambiguous roles and responsibilities
• A very experienced launch team can make a serious error in judgment
  • Neglected to use plain common sense

88 '        Experienced Teams Can Make Mistakes
LESSONS:
• Having a long successful track record and thus a highly competent team is a source of pride, but also a threat
• The threat can be a creeping growth in complacency and lack of wariness
  • Tough to detect
  • Tough to defeat
• Can be manifest in a sense that, since everything seems to be working OK, it will continue to do so
  • All weaknesses in policy, practices, and procedure must have been driven out by this point
• This sense leads to diminished alertness and a disinclination to probe, penetrate and question

Familiarity breeds contempt and children. Mark Twain

89 '        Normalizing Deviance

Case Event

Challenger: Orbiter breakup

Apollo 13 POGO: Insufficient Stage II longitudinal dynamic stability margin led to diverging POGO and premature shutdown of center engine

Columbia: Orbiter breakup

90 '        Challenger and “Normalization of Deviance”
Based on “The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA” by Diane Vaughan, The University of Chicago Press, © 1996 by the University of Chicago Press

Source: http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/rogers-commission/table-of-contents.html

91 '        Normalization of Deviance – The Concept
• An undertaking proceeds from an initial set of expectations (normal)
• At some point something happens that deviates from the set of expectations (abnormal)
• The participants expend serious effort to reconcile the unexpected deviation with an amended, but safe, set of expectations (normalization of the deviation)
• Given a revised, but acceptable, set of expectations (new normal) the undertaking again proceeds
• Success reinforces the normalization until a new deviation occurs
• The process repeats until the aggregation of the deviances results in failure

92 '        Additional Sources

NASA SP-4313 “Power to Explore: A History of Marshall Space Flight Center 1960 – 1990” by Andrew J. Dunar and Stephen P. Waring, The NASA History Series, 1999

And

The Rogers Commission Report http://history.nasa.gov/rogersrep/51lcover.htm

93 '        SRM General Arrangement

94 '        SRM Field Joint

95 '        SRB Joint Design Development

• Initial design expectations:
  − O-rings redundant during ignition transient
  − Tang – clevis rotation will compress O-rings
• 1977 hydroburst test & analysis findings
  − Joint rotation opposite to expectation
  − Relieves O-ring compression
  − Causes undesirable extrusion mode sealing of primary
  − Secondary could unseal during part of ignition transient

96 '        SRB Joint Design Development (concluded)

• MSFC engineers at odds with Thiokol counterparts
• MSFC management relied heavily on Thiokol expertise
  − Some fixes incorporated
    • Higher quality O-rings
    • Shimming to improve initial compression
  − Static motor firings confirmed Thiokol position

97 '        Summary Pre-Challenger Joint Performance

[Figure: flight-by-flight chart of pre-Challenger joint performance; rows marked P and S, columns marked L and R, cell entries coded per the legend below. Flight labels are not recoverable from the extraction.]

Legend: E-erosion; B-blowby; A-erosion and blowby; H-heating; *-hardware not recovered; denotes first occurrence

98 '        Field Joint Temperature vs. Anomalies

[Figure: Number of Field Joint Anomalies (0 to 3) vs. Calculated Field Joint Temperature (30 to 80 °F). Labeled flights: 51-C, 61-A, 41-B, 41-C, 41-D, 61-C, STS-2, and 51-L Challenger; the remaining points are flights with no anomalies.]

• Flights launched with joint temperatures > 65 °F: 17.6 % had anomalies
• Flights launched with joint temperatures < 65 °F: 100 % had anomalies
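The two percentages above can be checked with simple arithmetic. The flight counts in this sketch are assumptions, not figures stated on the slide: 3 of 17 is the smallest whole-number split that rounds to 17.6 %, and 4 of 4 is what remains if the seven flights labeled with anomalies on the chart are divided 3 above and 4 below 65 °F.

```python
def anomaly_rate(flights_with_anomalies: int, flights: int) -> float:
    """Percentage of flights in a temperature group that showed anomalies."""
    return 100.0 * flights_with_anomalies / flights


# Assumed counts, inferred from the slide's percentages (see lead-in).
warm_rate = anomaly_rate(3, 17)  # joint temperature above 65 F
cold_rate = anomaly_rate(4, 4)   # joint temperature below 65 F

print(f"above 65 F: {warm_rate:.1f} % of flights had anomalies")
print(f"below 65 F: {cold_rate:.1f} % of flights had anomalies")
```

Under these assumed counts, every cold launch in the record had already produced an anomaly before 51-L.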

99 '        Key Normalization Events

Prevailing Normal (same for every event):
- Satisfy all design requirements
- No erosion
- No blow-by
- No failure to seal primary
- No impact on secondary

Event: R&D, 1977
Deviation: Joint rotation compromised primary design intentions
Rationale: Static hot fire testing OK – Thiokol has the expertise
New Normal: Joint rotation compromises are an accepted risk*

Event: STS-2, 11/12/1981
Deviation: Erosion
Rationale: Analysis ensures erosion margin
New Normal: Bounded erosion

Event: STS 41D, 8/30/1984
Deviation: Blow-by
Rationale: Self-limiting phenomenon
New Normal: Some blow-by – no impact on secondary

Event: STS 51C, 1/24/1985
Deviation: Blow-by and secondary impingement
Rationale: Primary erosion in experience base and sealed before damaging secondary (self limiting)
New Normal: Blow-by and no erosion of secondary

Event: STS 51-B, 4/29/1985
Deviation: Primary burn through; secondary severely eroded
Rationale: Primary never sealed; 100 leak check failed to detect
New Normal: No change (idiosyncratic event)

*A formal NASA risk management designation

100 '        STS 51-L Factors (Hindsight is 20/20)
• Fragile and poorly understood joint design
• Contributors
  – Putty voids forming hot gas jets
    • Imperfect application
    • Damage from leak checks
  – Joint rotation
    • Relieves O-ring compression
    • Opens secondary
  – Case reuse and resulting deformation
  – O-ring temperature effects (31° F)
  – Absence of guaranteed redundancy
  – Leak check adversely displacing primary O-ring
  – Affirmation of acceptability through record of non-catastrophic flights
  – Subtle influence of proclaiming “operational” status
• The culture painstakingly, but unconservatively, rationalized and normalized each sign of deviation from the expected

101 '        Redesigned SRM Field Joint

102 '        A Few Lessons From the Challenger Accident

LESSONS:
• Beware the slippery slope of incremental design expectation compromise:
  • Time separated and modestly bounded exceptions can integrate to major deviations
• Heed the Chris Kraft philosophy – threats from “known unknowns” must be neutralized
• Exercise vigilance in program maturity “proclamations”
  • “Operational” focus probably played unconscious and unintended role
• Concerns of competent technical staff should be alarming
• Do not conduct Flight Readiness Reviews via telecon

103 '        Missed Advanced Warnings

Case Event

Atlas Centaur Launch Availability: Series of vehicle design changes led to an unrecognized reduction in launch availability and increased frequency of launch aborts due to upper air winds

Atlas Centaur 33: Failure of disconnect lanyard swivel led to loss of Atlas control during booster separation event

Disney World Monorail Fire: Ignoring repeated reports of burning rubber smells led to nearly catastrophic train fire

Titan IV B-32/Milstar & Atlas Centaur 45: Error in flight program constant caused loss of Centaur attitude control and delivery of Milstar to useless orbit

104 '        Atlas Centaur A/C-33

Booster Section

Sustainer Section

Atlas – Stage and a Half Configuration

• Underlying Issue: An improper part known for repeated production failures continued to be used and caused loss of a mission
• Problem: Vehicle loss of control on ascent (2/20/1975)
• Impact: Loss of Intelsat IV mission
• Why: Atlas booster staging disconnect failed to separate
  – Disassembly of swivel in disconnect lanyard (highly likely)

Source: Atlas/Centaur A/C-33 Failure Investigation and Flight Report, Lewis Research Center, December, 1975 (NASA GRC Archives)

105 '        Atlas Centaur A/C-33 Staging Disconnect – Lots of “Stuff” Has to Separate Cleanly

Atlas Booster – Sustainer Staging Disconnect B600P/J 12

106 '        Atlas Centaur A/C-33 Staging Disconnect (cont’d)

107 '        Atlas Centaur A/C-33 Staging Disconnect Lanyard Swivel

108 '        Atlas Centaur A/C-33 - Observations

• The quality control systems were indicating swivel failures for nearly eight years, from as early as 1967!
  − Several instances of the swivel’s separating into two pieces at the mating face
• It is incomprehensible that effective action was not taken to correct the serious problems with this system and its components
  − The lack of follow-up and urgency suggests that the personnel involved did not understand the disastrous flight consequences that could and did occur when the system malfunctions
  − This was truly an accident waiting to happen!

109 '        Atlas Centaur AC-33 – History of Swivel Problems (~8 Years)*

• Atlas 5002 swivel separated at mating face (1967)
• Amphenol: “Swivel not intended for aircraft or missiles – was for use in commercial fishing industry – strongly recommend redesign to a more reliable design” (1967)
• Redesign not accomplished; additional inspections ordered
• Interim fix imposed on 4 E/F (weapon system) vehicles (1967)
• Limited survey found defective swivel on Atlas Space Launch Vehicle (SLV) 5902 (1967)
• Vehicle SLV 5501 found to have defective swivel (1967)
• Urgent Engineering Change Proposal (ECP) for final fix disapproved – “cannot justify expenditures at this time” (1967)
• ECP approved for final fix for E/F vehicles not SLV’s (1968)
• AC-25 tiger team found defective swivel (12/1970)
• Specification Control Drawing revised - swivel replaced with shackle (2/1972)
  − Not mandatory
  − Swivels still used interchangeably
• Design change to shackle released 1973
  − Not mandatory
  − Swivel acceptable alternate until stock depletion
• Defective swivel found on Atlas SLV-3A (12/72)
• Launch Vehicle Reliability Board directed swivel inspection by Tiger Team (2/1972)
  − No record can be found implementing that direction

* From AC-33 Failure Investigation Report

110 '        The Engineering Challenge of Electrical Disconnects - Video

111 '        Atlas Centaur A/C-33 (concluded)

LESSONS:
• Routinely check up on the continued reliability of systems needed to flag and correct flight critical part quality problems
  • Was not done at General Dynamics (GD) and NASA
  • Resulted in this completely preventable loss
• Make sure those resolving discrepancies (including shop floor personnel) understand how the systems work and the flight implications of a part failure
  • And where single points of failure exist
• Adopt an over-arching principle: enhanced reliability is required in flight critical mechanisms (margins, redundancy - if it improves reliability, etc.)
• Ensure extra quality attention where there are unavoidable single points of failure (e.g. redundant inspections)

112 '        Managing Warning Signs

LESSON:
• A large % of mishaps offer advance warnings
  • Often unrecognized
  • Efforts to resolve are too often ineffective
• Our culture and systems need to recognize this
  • A culture that encourages and rewards the voicing of concerns
  • Systems that make it easy to get concerns heard and impose a time-out until resolution
  • A culture that is characterized by continual probing and alertness for anything that’s not right
• Successful, high-performing organizations are obsessed with the prospect of failure!

113 '        Perils of Heritage Systems

Case Event

Atlas Centaur 5: Partially closed Atlas fuel pre-valve at liftoff caused booster engine shutdown twelve feet off pad – vehicle and pad destroyed

Ariane 501: Vehicle horizontal velocity value caused processing error shutting down both inertial reference systems during first stage – vehicle destroyed

Lewis: Unworkable attitude control system safe hold system design caused loss of spacecraft control, battery depletion, and loss of mission

Man prefers to believe that which he prefers to be true. Francis Bacon

114 '        Atlas Centaur AC-5

Liftoff Apogee Pad Impact

• Underlying Issue: Reliance on a deficient heritage design
• Problem: Atlas engine shutdown twelve feet off the launcher – vehicle and pad destroyed (3/2/1965)
• Impact: R&D Flight; launch pad heavily damaged

Source: Subject Matter Experts: (Karl Kachigan, John Silverstein)

115 '        Atlas Centaur AC-5 (cont’d) - Video

116 '        AC-5 Booster Low Pressure Fuel Duct

117 '        Atlas Centaur AC-5 (cont’d) - Video

118 '        Atlas Centaur AC-5 (concluded)

• Why: Booster engine fuel pre-valve not fully open
  – Flow through partially open pre-valve will completely close it
  – Position switch design unreliable indicator of valve position

LESSONS: • When considering use of heritage systems, consider the “ilities” environment they were developed in (e.g. reliability requirements)

119 '        Summary of Causes for the Past Mission Failures

Terminology (based, in part, on NASA NPR 8621)

Proximate Cause: The specific, immediate and direct reason the undesired outcome occurred – without this there would have been no undesired outcome.

Root Cause: An event or condition that is an organizational factor that existed before the intermediate cause and directly resulted in its occurrence (thus indirectly it caused or contributed to the proximate cause and subsequent undesired outcome) and, if eliminated or modified, would have prevented the intermediate cause from occurring, and the undesired outcome. Typically, multiple root causes contribute to an undesired outcome.

120 '        Causation Analysis – Breakdown by Category

[Figure: pie charts “Distribution of Proximate Causes” and “Distribution of Root Causes”, each divided among Design, Sys Eng, Prod/Ops, and Pgm Mgt. The percentage labels (0%, 5%, 10%, 32%, 36%, 54%, 63%) cannot be reliably matched to categories in the extraction.]

[Figure: bar chart “Design Proximate Causes: Nature of Deficiencies”; axis labels as extracted: Dev Test, Analysis, Qual Test, Sim, Heritage, Engineering. Bar values are not recoverable.]

121 '        Observations

• Only one of the 43 cases analyzed (Atlas Centaur 24) experienced what was likely a random part failure as the cause of the mission loss!

– Indicates that programs are doing a good job of acceptance testing but with challenges (e.g. counterfeit parts)

• The other 42 were associated with some form of human error: management weaknesses, systems engineering shortcomings, testing deficiencies, missed advanced warnings, etc.

• What is the implication of this in terms of reliability predictions?

Facts are stubborn things, but statistics are pliable. Mark Twain

122 '        Reliability Assessments: Loss of Crew (LOC)/Loss of Mission (LOM) Probability Estimates

• Traditional Probabilistic Risk Assessments (PRA’s) were historically optimistic, and not appropriate for developmental systems. For example:
  – They did not account for immature failures during development
  – They were unable to model human errors
• Modern techniques for reliability calculation have improved, and are very valuable to:
  – Discriminate among competitive conceptual designs
  – Forecast the potential performance of a new design
  – Track performance through development to ensure it’s staying on track
  – Gain insight into risk contributors, and, therefore, guidance for appropriate testing
  – Compare the relative reliability of various systems

123 '        Reliability Assessments: Loss of Crew (LOC)/Loss of Mission (LOM) Probability Estimates (concluded)

• These techniques produced credible Shuttle reliability estimates in the mid 1990’s
• However, flight failures are relatively infrequent
  – Few space systems will fly often enough to historically validate LOC/LOM calculations
• So, what can we do to ensure the safest, most reliable systems?
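A rough calculation shows why flight history alone rarely validates a LOC/LOM estimate. The sketch below uses the standard zero-failure binomial relation, R^n <= 1 - C, to estimate how many consecutive failure-free flights would be needed to demonstrate a reliability R at confidence C. This relation and the example targets are not taken from the slides; they are illustrative assumptions.

```python
import math


def flights_needed(reliability: float, confidence: float) -> int:
    """Failure-free flights needed to demonstrate `reliability` at `confidence`.

    Solves reliability**n <= 1 - confidence for the smallest integer n.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Hypothetical reliability targets, each demonstrated at 90 % confidence.
for target in (0.9, 0.99):
    print(f"R = {target}: {flights_needed(target, 0.90)} failure-free flights")
```

Under these assumptions, demonstrating even 0.99 reliability takes a couple of hundred flights without a failure, more than most space systems ever fly.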

124 '        How to Get Low (True vs. Analytical) LOC/LOM Probabilities

1. Get the right design requirements, design it right, and prove it
   • Maximum practical redundancy, robust margins, careful qualification, etc.
   • Thorough Systems Engineering
2. Make sure you build it like you designed it, every time
   • Rigorous manufacturing and production controls
3. Test to the greatest extent possible
   • Ground test everything that can be meaningfully and practically tested
   • Flight test to the maximum extent feasible
4. Instrument the test vehicles (ground and flight) thoroughly
5. Carefully review and understand all flight data, even if the flight is successful
   • Every measurement, every trace, every blip
6. Tightly control changes

125 '        How to Get Low (True vs. Analytical) LOC/LOM Probabilities (concluded)

If these steps are taken, then the real LOC/LOM probabilities will be as low as humanly possible, but may never be known.

126 '        The Human Element (Observations Drawn From The Presenters’ Collective Experience)

There are basically two types of people. People who accomplish things and people who claim to have accomplished things. The first group is less crowded. Mark Twain

127 '        The Indispensable Human Element

• Good systems and processes alone are insufficient conditions for success
• A non-technical lesson learned - difficult programs that succeed are led by individuals who are:
  – Remarkably accomplished (proven track record) and able to:
    • Design, energize, and maintain discipline within the systems/processes
    • Pick a winning team and lead it the distance

128 '        The Indispensable Human Element (cont’d): Some Characteristics Such Individuals Share
• Personal traits:
  – Exceptional technical interest and insight
    • Won’t compromise technical standards, but practical
  – Total ownership
    • Relentless and demanding in pursuing the program’s goal
  – Cares about and “takes care of” the team - strong loyalty to team
  – Highly respected by the team
    • Team members want to gain the leaders’ confidence
    • Peer pressure for members “to deliver” is high
  – Treats all parties with respect
  – Self-confident and humble - unthreatened by the competence of subordinates
  – Understands the value of humor

A gentleman is someone who knows how to play the banjo and doesn’t. Mark Twain

129 '        The Indispensable Human Element (cont’d): More Characteristics Such Individuals Share
• In shaping the workplace:
  – Insists on unambiguous assignment of responsibility
  – Fosters culture of openness, integrity, and sharing of information
    • Contrary views are welcomed from all levels (if you’ve done your homework)
      – Qualified messengers are never shot
      – Bad news is just news
    • There are no hidden agendas
    • Deception is not allowed - by anybody
    • Says so when the emperor is naked
  – Praises in public and counsels in private
  – Poorly performing team members evoke prompt intervention
  – Values and practices teaching and mentoring

130 '        The Indispensable Human Element (concluded)
Such individuals are not just hypothetical
Some examples (not in any order):
– Abe Silverstein – Dave Gabriel – Joe Purcell
– Max Faget – Tom Young – Dan Sarokon
– Rocco Petrone – Norm Augustine – Bob Parks
– Jim Martin – Gene Kranz – Jim Beggs
– Aaron Cohen – Ed Cortright – Dick Schwartz
– Glynn Lunney – Donna Shirley – Joan Shirley
– Carolyn Griner – Tom O’Malley – Bob Grey
– Dick Kohrs – George Page – Arnie Aldrich
– George Mueller – Bill Schindler – Bill Taylor
– George Low – Charlie Hall – George Jeffs
– John Yardley – John Casani – Bruce Lundin
– Wernher von Braun – Dick Truly – John Gossett
– Chris Kraft – Steve Szabo – Grant Hansen
– Andy Stofan – Karl Kachigan – John Klineberg
– Vern Weyers – Ron Thomas – Bud Schurmeier
– Jim Odom – Pete Burr – Chuck Wilson
– Jesse Moore – Bryan O’Connor – Jon Busse

131 '        Applying the Lessons: “Rules of Practice”

132 Applying the Lessons: A Sample Set of “Rules of Practice”

• Issue: Many lessons learned have common themes. The issue is to systematically infuse this knowledge into programs so they’re not lessons forgotten
• One approach: For large and complex programs, impose a Program specific set of overarching “Rules of Practice” that govern how certain things are to be done (i.e., to codify some of the lessons)
  − Any deviation from these “Rules” would be cause for special attention (risk management) by Program Management
  − These ad hoc “Rules” would not take the place of existing design standards or similar tools, but rather provide an additional mechanism to flag when special action is warranted

133 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)

• Design Review: (Causal in 26 of 43 cases)
  − The acceptability of new designs will be established through a formal design review process staffed by independent peer practitioners of the designers seeking design approval. The reviewers will constitute a design “jury” to determine if:
    • The design will perform as required.
    • The test plan is adequate (development, qualification and acceptance).
    • Test setups and conditions are appropriately representative of the flight configuration (test like you fly).
    • The test results are successful.
    • The risk management analysis and mitigation plan are sound.
    • The in-flight performance is successful.

134 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)

• Testing Program Definition: (WIRE, AC-62, STS-51/TOS, TC-1, AC-24, , Hubble Space Telescope, Ariane 501, Titan IVB-32, Mars Climate Orbiter, Genesis)
  − As a core principle, the flight worthiness of system designs (hardware and software) must be validated through ground testing unless such testing is clearly infeasible – the prevailing rule is that if “it” can be meaningfully tested on the ground, it will be.
  − The following rules apply:
    • Component, subsystem and system testing will be carried to the highest level of assembly feasible under expected flight environments plus appropriate margins.
    • Designs will permit functional testing as close to launch as feasible.
    • Tests will demonstrate compliance with functional design requirements, vs. verifying “built-to-print”.
    • Any flight hardware simulators (e.g. pyrotechnic simulators) must have a formal design review to ensure appropriate similitude.
    • Waivers require enhanced margins, redundancy, and robustness of the test program for assemblies making up the design.

135 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)

• Mechanisms: (Skylab, AC-21, AC-33, Galileo, STS-51/TOS, Mars Polar Lander, Genesis)
  − Collections of components, assemblies and subsystems that must perform an in-flight separation, deployment or articulation will be designated a “system” and be placed under the cognizance of a lead engineer who will be responsible for all aspects of its design, development, production, test and in-flight performance.
    • These systems will incorporate a redundant separation, deployment or articulation capability, and
    • Will be qualified for flight through functional testing under the appropriate environments.
• Critical Materials and Processes: (AC-43 and Galileo)
  − Materials and Process vulnerabilities will be identified during design
  − A plan to address these will be developed to include such measures as destructive examination of “witness” hardware, periodic destructive analysis of parts, plant audits, etc.

136 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)

• Analytical Modeling: (Causal in 12 of 43 cases)
  − All analytical modeling on which designs are based will be test-validated and acquired from at least two independent sources.
  − An independently validated plume heating analysis is required of all systems employing a new propulsion arrangement.

• Heritage Items: (Contributing cause in 12 of 43 cases)
  − Any item adopted for use based on successful flight performance in another program will be deemed unqualified in the adopting application until a thorough analysis has been performed to confirm that the adopting application is identical (or less demanding) in all relevant features to the prior successful application.
  − Any deviations must be qualified by test.

137 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)

• Software: (Causal in 4 of 43 cases: Ariane 501, Titan IVB-32, MCO, MPL)
  − All software development, testing, and application processes will be controlled by a single, formal, configuration-managed Software Management Plan for which a single individual is responsible.
    • Testing provided for in this plan will specifically include:
      – Demonstration of proper flight software operation in nominal and off-nominal flight simulation functional testing; this will be done with flight hardware to the greatest extent possible.
      – Formal “qualification” and “acceptance” testing of flight critical software “end items” prior to controlled “release” for use.
    • The plan will also provide for periodic, independent verification that the original requirements remain valid.

138 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)

• Advance Warning: (Causal in 17 of 43 cases)
  − An effective system for facilitating communication between those concerned about a potential safety-of-flight problem and those in a position to reconcile it is to be designed and embedded in the Program culture (easier said than done - but surely it’s doable!). It must be:
    • Formal and visible.
    • Reliable (if not foolproof).
    • Simple to use with quick feedback.
    • Plugged into real authority to stop the action.
    • Culturally valued and respected.

139 Applying the Lessons: A Sample Set of “Rules of Practice” (cont’d)
• General Engineering Management Practices: Certain practices will constitute required standard operating procedures:
  − Rationale Documentation: It will be mandatory to systematically record the rationale associated with all engineering products such as design and operational requirements, procedures, test parameters, processes, design choices, specifications, etc., and to place the rationale as close to the item it relates to as possible.
  − Assumptions: All assumptions that form the foundation for engineering activities (analyses, test or not-to-test decisions, trade studies, design approaches, etc.) will be explicitly stated and documented. A process for validating, and periodically revalidating, the assumptions will be initiated.

140 Applying the Lessons: A Sample Set of “Rules of Practice” (concluded)
  − Sanity Checking: It will be a customary practice to perform “sanity or reasonableness checks” to gain confidence that complex or abstruse operations have been done correctly (e.g. checking the physics with first order calculations to show momentum is conserved; checking pointing algorithm results using manual modeling techniques; verifying utilization of consistent units of measure).
  − Review Panels: When assembling panels for engineering activities such as design reviews, trade study critiques etc., it will be customary to consciously carry out a process that ensures that all relevant disciplines are represented.
• Etc. (This is a sampling – not an all inclusive list. Certainly, Project specific “Rules” are also appropriate.)

141 The Message

• Some may say that the foregoing rules are rather boring
• Nothing earthshaking - all pretty routine

But that’s exactly the point!

• Rigorous implementation and infusion of quality into all aspects of routine, common sense practices will prevent most mission failures
• It’s really not rocket science!
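
The “Sanity Checking” rule of practice names two routine checks: a first order calculation showing that momentum is conserved, and verification that units of measure are consistent. A minimal Python sketch of both follows. The names `Impulse`, `total_momentum` and `momentum_is_conserved`, the tolerance, and all numbers are illustrative assumptions, not part of the original briefing or of any flight tool.

```python
import math
from dataclasses import dataclass

# Exact conversion factor: 1 lbf = 0.45359237 kg * 9.80665 m/s^2
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always carries its unit (hypothetical helper)."""
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_si(self) -> float:
        """Return the impulse in N*s, refusing units it does not recognize."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unrecognized impulse unit: {self.unit!r}")


def total_momentum(masses_kg, velocities_m_s):
    """First order linear momentum of a set of bodies along one axis."""
    if len(masses_kg) != len(velocities_m_s):
        raise ValueError("need one velocity per mass")
    return sum(m * v for m, v in zip(masses_kg, velocities_m_s))


def momentum_is_conserved(masses_kg, v_before, v_after, rel_tol=1e-6):
    """Reasonableness check: momentum before and after an event should match."""
    return math.isclose(
        total_momentum(masses_kg, v_before),
        total_momentum(masses_kg, v_after),
        rel_tol=rel_tol,
    )


# Illustrative separation event with made-up numbers: two bodies, one axis.
masses = [1000.0, 3000.0]
print(momentum_is_conserved(masses, [10.0, 10.0], [13.0, 9.0]))  # prints True
print(momentum_is_conserved(masses, [10.0, 10.0], [13.0, 9.5]))  # prints False
```

Carrying the unit with the value means a mismatch fails loudly at the conversion step instead of passing silently into a later calculation. The momentum check is deliberately crude: it is meant to flag a suspect result for a closer look, not to replace the detailed analysis.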

142 Conclusions

143 Conclusion: Frequent Practice

144 Conclusions – Stuff Happens
• Most mishaps can be broadly attributed to human error, not rocket science
  – Missing design or procedural errors
  – Weak testing practices
  – Systems engineering shortcomings
  – Flawed understanding of how software fails
  – Loss of process discipline
  – Team complacency
  – Normalizing deviations
  – Diminished alertness for warning signs
  – Improper use of “heritage” systems
  – Imperfect management
  – Information flow breakdowns
• Often, a complex, subtle sequence of events is needed
  – If just one event in the chain were prevented, the failure would not have happened

• Must ensure quality in all the above areas
• Essential for mission success
• Over decades, the same root causes of failures appear repeatedly
• There are few new ones!

145 Conclusions – About Learning From Past Incidents

• Sometimes we do learn the lessons, but the process is haphazard
• Those involved learn what to do and/or what not to do
  – But eventually they disappear taking with them:
    • The nuances of causation
    • Factors omitted from the official record
    • The lessons themselves (often) and their underlying rationale
  – Mishap Reports and Lessons Learned Data Bases (which have come a long way) are what’s left but:
    • Relevant information may be missing
    • They lack the live element (the passion) and,
    • Nothing beats talking to those who “were there”

146 Conclusions (cont’d)

• Basically, there is no universally successful approach to learning the lessons from the past
• What’s needed is a dependable process that:
  – Uncovers root causation from those involved and/or the documentation
  – Develops and promulgates “Rules of Practice” as countermeasures
• Organizations desiring to profit from applying lessons previously learned should develop their own tailored approaches
  – Should be included in the Project Plan

In the end, lessons are still best learned as a “contact sport”

147 Conclusions (concluded)
• AEA has a special interest in this subject and may be able to help by:
  – Suggesting an appropriate set of “Rules of Practice”
  – Presenting broad treatment of case histories (e.g. this type of briefing)
  – Arranging seminars with the “keepers” (aka greybeards)
  – Providing mentors on an on-going basis for specific needs
  – Conducting incident-based training courses
  – Holding independent project reviews
  – Assisting project managers
  – Coaching technical and project management
  – Organizing customized group & individual programs

148 Appendix A: Presentation History

2007
Location  Date
NASA Glenn Research Center  1/26/07
NASA Glenn Research Center  1/31/07
NASA Glenn Research Center (At Plum Brook Station)  3/30/07
NASA Glenn Research Center  5/9/07
NASA Glenn Research Center  5/23/07
NASA Glenn Research Center  7/27/07
NASA Langley Research Center (At NIAC)  8/6-7/07
NASA Glenn Research Center  8/20/07
NASA Glenn Research Center  11/29-30/07

2008
Location  Date
NASA Langley Research Center (Ares 1-X Team)  4/16-17/08
NASA Glenn Research Center  4/25-26/08
NASA Glenn Research Center (Ares 1-X Team)  4/29-30/08
NASA Marshall Space Flight Center (Ares 1-X Team)  5/06-07/08
NASA Johnson Space Center (At Grumman Bethpage)  5/21/08
Alliant Technologies (ATK – Utah)  7/08-09/08
NASA Kennedy Space Center (Ares 1-X)  9/9-10/08
NASA Johnson Space Center  12/2-3/08

2009
Location  Date
NASA Johnson Space Center  1/12-13/09
NASA Glenn Research Center (At OAI)  6/16-17/09
NASA Kennedy Space Center (APPEL)  8/3-4/09
NASA Marshall Space Flight Center (APPEL)  8/25-26/09
NASA Ames Research Center (APPEL)  9/21-22/09
NASA Johnson Space Center  10/6-7/09
NASA Langley Research Center (APPEL)  12/2-3/09

2010
Location  Date
NASA Johnson Space Center  1/14-15/10
NASA Marshall Space Flight Center – ET Program  1/28-29/10
NASA Headquarters IPAO (At LaRC)  4/6-7/10
NASA Marshall Space Flight Center (APPEL)  5/12-13/10
NASA Johnson Space Center (APPEL)  6/9-10/10
NASA Goddard Space Flight Center  6/15-16/10
NASA Kennedy Space Center (APPEL)  6/30-7/1/10
Ball Aerospace  7/22/10
NASA Headquarters OSMA (APPEL)  8/10-11/10
Paragon Space Development Corp. (APPEL)  10/19/10
NASA Johnson Space Center (APPEL)  10/26-27/10
(ULA) (APPEL)  11/4/10
LLC (APPEL)  11/10/10
NASA Engineering & Safety Center (APPEL)  12/7-8/10

149 Presentation History (continued)

2011
Location  Date
NASA Goddard Space Flight Center (APPEL)  1/11-12/11
Sierra Nevada Corporation (APPEL)  1/18/11
Boeing Company (APPEL)  1/20/11
SpaceX Corporation (APPEL)  2/16/11
Orbital Sciences Corporation (APPEL)  2/23/11
NASA Safety Center - IV&V Facility (APPEL)  4/4-5/11
NASA Kennedy Space Center (APPEL)  6/8-9/11
NASA Dryden Flight Research Center  6/16/11
European Space Agency (ESA) Noordwijk, Netherlands  6/28-29/11
NASA Glenn Research Center  7/18-19/11
NASA Engineering and Safety Center  12/14/11

2012
Location  Date
European Space Agency (ESA) Noordwijk, Netherlands  1/24-25/12
The Scientific and Technological Research Council of Turkey, Ankara, Turkey  3/5-6/12
Anadolu University, Eskişehir, Turkey  3/8-9/12
NASA Langley Research Center  3/29-30/12
NASA Kennedy Space Center – CCP Office  4/5/12
NASA Kennedy Space Center – Rocket University  4/6/12
European Space Agency - Arianespace, Paris  6/26-27/12
NASA Johnson Space Center  8/1-2/12
NASA Marshall Space Flight Center – Chief Engineer  9/25-26/12
NASA Marshall Space Flight Center - SLS Program  9/27/12
European Space Agency (ESA) Noordwijk, Netherlands  10/30-31/12
Jet Propulsion Laboratory (JPL)  11/28-29/12

150 Presentation History (concluded)

2013
Location  Date
NASA Headquarters S&MA  2/6-7/13
NASA Marshall Space Flight Center (Engineering)  2/26-27/13
NASA Wallops Flight Facility  3/26-27/13
European Space Agency (ESA) Noordwijk, Netherlands  5/2-3/13
Turkish Aerospace Industries (TAI) Ankara, Turkey  5/23-24/13
NASA Johnson Space Center ( Project)  6/4-5/13
NASA Dryden Flight Research Center  9/17-18/13
European Space Astronomical Center (ESAC) Madrid, Spain  9/25-26/13
Disneyland, Los Angeles  12/11-12/13

2014
Location  Date
NASA Kennedy Space Center (KSC), Launch Services Program - Flight Analysis Division  Feb 5-6
NASA Marshall Space Flight Center  Feb 10-11
NASA Marshall Space Flight Center  Feb 13-14
NASA Wallops Flight Facility  Feb 18-19
DSO National Laboratories, Singapore (Seminar)  Mar 11
DSO National Laboratories, Singapore (Class)  Mar 12-13
European Space Agency (ESA) Noordwijk, Netherlands  April 14-15
NASA Kennedy Space Center, Launch Services Program  April 24-25
NASA Glenn Research Center (GRC)  May 5-6
NASA Johnson Space Center (JSC)  June 3-4
NASA Stennis Space Center  Aug 27-28
NASA Armstrong Flight Research Center (AFRC)  Sept 10-11
NASA Glenn Research Center  Sept 25-26
DNV-GL Singapore  Nov 12-13
DNV-GL Singapore (Seminar)  Nov 14
DNV-GL Singapore  Nov 17-18

151 Appendix B: Glossary of Terms

Term  Definition
ACS  Attitude Control System
ACTS  Advanced Communications Technology Satellite
AEA  Aerospace Engineering Associates
ADDJUST  Automatic Determination and Dissemination of Just Updated Steering Terms
Al-Li  Aluminum Lithium (light weight ET)
AOA  Angle of Attack
APL  Applied Physics Laboratory
APU  Auxiliary Power Unit
ASRM  Advanced Solid Rocket Motor
ATK
AV  AeroVironment, Inc.
BFC  Better Faster Cheaper
CAIB  Columbia Accident Investigation Board
CONOPS  Concept of Operations
DMSP  Defense Meteorological Satellite Program
ECS  Environmental Control System
ESR  Emergency Sun Reacquisition (SOHO)
ET  External Tank
FCS  Flight Control System
FOD  Foreign Object Debris
FRR  Flight Readiness Review
GAO  Government Accountability Office
GD  General Dynamics
GPS  Global Positioning System
GN&C  Guidance Navigation and Control
HGA  High Gain Antenna
HST  Hubble Space Telescope
I&T  Integration and Test
IC  Integrated Circuit
IIP  Instantaneous Impact Point
IPAO  Independent Program Assessment Office
IRU  Inertial Reference Unit
ISA  Initial Sun Acquisition (SOHO)
ISS  International Space Station
ISSP  International Space Station Program
IV&V  Independent Verification and Validation
JPL  Jet Propulsion Laboratory
JSC  Johnson Space Center
KSC  Kennedy Space Center

152 Appendix B: Glossary of Terms (cont’d)

Term  Definition
LCCE  Life Cycle Cost Estimate
LGA  Low Gain Antenna
LM  Lockheed Martin
LOX  Liquid Oxygen
LMA  Lockheed Martin Astronautics
LSP  Launch Service Provider
MDCA  Microgravity Droplet Combustion Apparatus
MES  Main Engine Start (Centaur)
MGS  Mars Global Surveyor
MMH  Monomethylhydrazine
MMT  Mission Management Team (Shuttle)
MO  Mars Observer
MOU  Memorandum of Understanding
MRB  Material Review Board
MS  Meteoroid Shield (Skylab)
MSFC  Marshall Space Flight Center
NAC  NASA Advisory Council
NASA  National Aeronautics and Space Administration
NOAA  National Oceanic & Atmospheric Administration
NRA  NASA Research Announcement
NTO  Nitrogen Tetroxide (N2O4)
OSP  Orbital Space Plane
OTA  Optical Telescope Assembly (Hubble)
P&W  Pratt and Whitney
PDT  Product Development Team
POGO  Longitudinal oscillation (as in POGO stick – not an acronym)
PRA  Probabilistic Risk Assessment
PTU  Port Transducer Unit (B2A)
RCC  Reinforced Carbon Carbon (Orbiter wing TPS)
RLV  Reusable Launch Vehicle
RNC  Reflective Null Corrector (Hubble)
RSRM  Redesigned Solid Rocket Motor
S&MA  Safety and Mission Assurance
S/C  Spacecraft
SAIC  Science Applications International Corporation
SCP  Spacecraft Control Processor (MGS)
SDR  System Design Review
SE  Systems Engineering
SEB  Source Evaluation Board

153 Appendix B: Glossary of Terms (concluded)

Term  Definition
SEMP  Systems Engineering Management Plan
SLI  Space Launch Initiative
SOA  State of the Art
SOX  Solid Oxygen
SPHINX  Space Plasma High Voltage Interaction Experiment
SRB  Solid Rocket Booster
SRM  Solid Rocket Motor
SRR  System Requirements Review
SSME  Space Shuttle Main Engine
SSP  Space Shuttle Program
SSTO  Single Stage to Orbit
STS  Space Transportation System
TOS  Transfer Orbit Stage
TPA  Turbine Pump Assembly
TPS  Thermal Protection System
TWTA  Traveling Wave Tube Amplifier
UAV  Uncrewed Aerial Vehicle
USAF
VSE  Vision for Space Exploration

154 Appendix C: Case History Information Sources

CASE  NOTES* / LINKS TO MISHAP REPORTS OR OTHER AUTHORITATIVE SOURCES
AA Flight 96  http://www.airdisaster.com/reports/ntsb/AAR73-02.pdf
AA Flight 191  http://www.airdisaster.com/reports/ntsb/AAR79-17.pdf
Apollo 1  http://www.hq.nasa.gov/office/pao/History/Apollo204/content.html
Apollo 13 Explosion  http://history.nasa.gov/ap13rb/ap13index.htm
Apollo 13 POGO  http://www.nasa.gov/offices/oce/llis/0334.html
Ariane 501  http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
Atlas Centaur 21  Note 2
Atlas Centaur 24  Note 1
Atlas Centaur 33  Note 1
Atlas Centaur 43  Note 1
Atlas Centaur 5  Note 1
Atlas Centaur 62  Note 1
Atlas Centaur 67  http://nsc.nasa.gov/SFCS/Index/SortBydate/Descending/Page6
Atlas Centaur F1  Note 2
Atlas Centaur Launch Availability  Note 2
B-2A Bomber "Spirit of Kansas"  http://www.acc.af.mil/shared/media/document/AFD-080605-054.pdf
Challenger  http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/rogers-commission/table-of-
Columbia  http://www.nasa.gov/columbia/home/CAIB_Vol1.html
CONTOUR  http://klabs.org/richcontent/Reports/Failure_Reports/contour/contour.pdf
DART  http://www.nasa.gov/pdf/148072main_DART_mishap_overview.pdf
Disneyworld Monorail  Note 2
Galileo  http://trs-new.jpl.nasa.gov/dspace/bitstream/2014/32404/1/94-0141.pdf
Genesis  http://www.nasa.gov/pdf/149414main_Genesis_MIB.pdf

* NOTES: 1. Source is archived at NASA GRC; 2. Source is Subject Matter Expert.

155 Appendix C: Case History Information Sources (concluded)

CASE  NOTES* / LINKS TO MISHAP REPORTS OR OTHER AUTHORITATIVE SOURCES
GPS IIR-3  https://listserv.unb.ca/cgi-bin/wa?A2=canspace;jzM6AA;199907311315400300
Helios  http://www.nasa.gov/pdf/64317main_helios.pdf
Hubble Space Telescope  http://www.ssl.berkeley.edu/~mlampton/AllenReportHST.pdf
Lewis  http://spacese.spacegrant.org/Failure%20Reports/Lewis_MIB_2-98.pdf
Mars Observer  http://klabs.org/richcontent/Reports/Failure_Reports/mars_observer/mars_observer_12_9
MCO  http://klabs.org/richcontent/Reports/MCO_report.pdf
MDCA Experiment  Note 1
MGS  http://wl.filegenie.com/~aea/MGS_Final_Rpt.pdf
MPL  http://klabs.org/richcontent/Reports/NASA_Reports/mpl_report_1.pdf
NOAA N Prime Satellite  http://klabs.org/richcontent/Reports/Failure_Reports/noaa/65776main_noaa_np_mishap.p
Russian Launch Vehicle N-1  http://www.videocosmos.com/n1.shtm
Seasat  http://klabs.org/richcontent/Reports/Failure_Reports/seasat/seasat_full/seasat.html
Skylab  http://history.nasa.gov/skylabrep/SRcover.htm
SOHO  http://sohowww.estec.esa.nl/whatsnew/SOHO_final_report.html
STS-51 (TOS/ACTS)  http://www.nasa.gov/offices/oce/llis/0312.html
Titan Centaur 1  Note 1
Titan Centaur TC-6  Note 1
Titan IVB-32  http://sunnyday.mit.edu/accidents/titan_1999_rpt.doc
TK Flight 981  http://www.aaib.gov.uk/cms_resources.cfm?file=/8-1976%20TC-JAV.pdf
TK Flight 1951  http://www.onderzoeksraad.nl/docs/rapporten/Rapport_TA_ENG_web.pdf
WIRE  http://klabs.org/richcontent/Reports/wiremishap.htm
X-33  Note 2
X-43A  http://www.nasa.gov/pdf/47414main_x43A_mishap.pdf

* NOTES: 1. Source is archived at NASA GRC; 2. Source is Subject Matter Expert.

156 P. O. Box 40448 Bay Village OH 44140 www.aea-llc.com
Joe Nieberding, President  Email: [email protected]  Cell: 440-503-4758
Larry Ross, CEO  Email: [email protected]  Cell: 440-227-7240
