Unopened Gifts: Lessons Learned from The Big Dig and Turkish Flight 981

Registration Management Committee Auditor Workshop July 18 - 19, 2013

Brian Hughitt, NASA Headquarters, Office of Safety and Mission Assurance

A Russian Proton rocket veers out of control seconds after launch from the Baikonur Cosmodrome in Kazakhstan.

Quiz:

When was the last NASA shuttle flight to the International Space Station?

Quiz (cont):

Who is NASA dependent upon to deliver astronauts to the Space Station?

Quiz (cont):

Who will NASA be dependent upon in the future for delivery of cargo and crew to the Space Station?

Quiz (cont):

What do Federal Acquisition Regulations say regarding Government Quality Assurance for commercial products & services?

Quiz (cont)

So (given all this) who are NASA’s eyes & ears to ensure effective implementation of contract quality requirements for its human space flight program?

Bumper to bumper 6 – 8 hours per day

[Map labels: Logan, Downtown, South Boston]

The solution to the overhead highway… Bury it underground

The Big Dig

– 7 1/2 mile corridor
– 161 lane miles
– 5 miles of tunnel
– 6 interchanges
– 200 bridges
– 16 million cubic yards of dirt
– 541,000 truckloads (4,612 miles of trucks lined up end to end)
– 15 stadiums filled to the rim with dirt
– 3.8 million cubic yards of concrete (enough for a sidewalk from Boston to San Francisco and back 3 times)

Accident Synopsis

At 11:00 pm on July 10, 2006, a 1991 Buick passenger car occupied by a 46-year-old male driver and his 38-year-old wife was traveling eastbound in the I-90 connector tunnel in Boston, MA, en route to Logan Airport. As the car approached the end of the connector tunnel, a section of the tunnel’s suspended concrete ceiling (26 tons) detached from the tunnel roof and fell onto the vehicle, crushing its right side. The driver’s wife, occupying the right-front seat, was fatally injured. The driver escaped with minor injuries.

National Transportation Safety Board Accident Report

Proximate Cause

Use of an epoxy anchor adhesive with poor creep resistance

Post-accident testing revealed that Fast Set epoxy had been supplied and that, while both Fast Set and Standard Set epoxy performed similarly in short-term tests, they differed dramatically under long-term load.
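How two adhesives can look alike in a short test yet diverge under sustained load can be illustrated with a simple viscoelastic model. The sketch below uses a Burgers (four-element) creep model. The choice of model and every parameter value are illustrative assumptions, not measured properties of the epoxies in the NTSB report, and the names `standard` and `fast_set` are labels for two hypothetical materials.

```python
import math

def burgers_creep_strain(stress, t, e1, e2, eta1, eta2):
    """Strain at time t under constant stress for a Burgers (four-element) model."""
    elastic = stress / e1                                        # instantaneous response
    delayed = (stress / e2) * (1.0 - math.exp(-e2 * t / eta2))   # delayed elastic response
    viscous = stress * t / eta1                                  # steady flow, grows without bound
    return elastic + delayed + viscous

# Hypothetical parameters in arbitrary consistent units (assumptions for illustration).
STRESS = 10.0
standard = dict(e1=3000.0, e2=6000.0, eta1=1.0e9, eta2=1.0e5)
fast_set = dict(e1=3000.0, e2=6000.0, eta1=1.0e6, eta2=1.0e5)  # lower eta1: flows more readily

for label, t in [("short-term test", 1.0), ("sustained load", 1.0e5)]:
    s = burgers_creep_strain(STRESS, t, **standard)
    f = burgers_creep_strain(STRESS, t, **fast_set)
    print(f"{label} (t={t:g}): standard={s:.5f}, fast_set={f:.5f}, ratio={f / s:.2f}")
```

With these made-up numbers the two materials are nearly indistinguishable at short times, because the elastic term dominates, while the viscous term dominates the long-term strain of the material with the lower `eta1`.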

Creep

Epoxy is a polymer and its stiffness is time and temperature dependent. If a load is applied suddenly, the epoxy responds like a hard solid. But if that load is then held constant, the molecules within the polymer may begin to rearrange and slide past one another, causing the epoxy to gradually deform. As the deformation increases, it becomes irreversible.

Creep: The Unknown Known

ASTM D2990-01

Since the properties of viscoelastic materials are dependent on time…an instantaneous test result cannot be expected to show how a material will behave when subjected to stress or deformation for an extended period of time.

National Transportation Safety Board Accident Report

Contributing Causal Factors

• Design
• System Safety
• Project Management
• Procurement
• Supplier Documentation
• Quality Assurance
• Industry Standards
• Governmental Oversight
• Governmental Regulations

Defenses are never perfect - When Events Line Up, the Consequences Can Be Devastating

[Figure: successive layers of defense (Requirements, Design, Quality Assurance, Manufacturing, Test, Operations) standing between Hazard and Mishap]
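The layered-defense picture can be made concrete with a little arithmetic. This is a minimal sketch: the layer names come from the slide, but the miss probabilities are made-up placeholders, and treating the layers as independent is a simplifying assumption.

```python
from math import prod

# Layer names are taken from the slide. The miss probabilities are made-up
# placeholders, and independence between layers is an assumption.
LAYERS = {
    "Requirements": 0.1,
    "Design": 0.1,
    "Quality Assurance": 0.1,
    "Manufacturing": 0.1,
    "Test": 0.1,
    "Operations": 0.1,
}

def mishap_probability(miss_probabilities):
    """Chance that a hazard passes every layer, assuming independent layers."""
    return prod(miss_probabilities)

def disable(layers, names):
    """Return a copy in which the named layers catch nothing (miss probability 1.0)."""
    return {name: (1.0 if name in names else p) for name, p in layers.items()}

baseline = mishap_probability(LAYERS.values())
# Hypothetical scenario: three layers never engage at all.
degraded = mishap_probability(
    disable(LAYERS, {"Quality Assurance", "Test", "Operations"}).values()
)

print(f"all six layers working: {baseline:.1e}")
print(f"three layers absent:    {degraded:.1e}")
print(f"risk multiplier:        {degraded / baseline:.0f}x")
```

Under these assumptions, each layer that silently drops out multiplies the chance of a mishap by the inverse of its miss probability, which is why several weak or missing layers matter far more than any single one.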

Adapted from: James Reason, Managing the Risks of Organizational Accidents, 1997, p. 12

NTSB Accident Report - Contributing Causal Factors - Design

• Design specifications did not incorporate a provision for attaching a suspended ceiling, even though it was known that one would be needed. Consequently, the tunnel had no embedded ceiling supports.
• The design consultant repeatedly recommended undercut anchors vs the adhesive anchors ultimately chosen.
• In order to save costs/time, a change was made to use heavy precast concrete panels in lieu of custom-engineered laminated lightweight concrete panels.

“The July 10, 2006, accident was a sudden, violent event, but the circumstances leading up to it developed over a period of more than 20 years, beginning with the design of the Ted Williams tunnel in the late 1980s.”

Epoxy Secured Bolts

NTSB Accident Report - Contributing Causal Factors - System Safety

• No redundancy: the majority of U.S. tunnels have continuous ceiling panels that extend into the concrete wall. If the hangers fail, the ceiling is self-supported.
• Incomplete Failure Modes and Effects Analysis: creep was not identified as a potential failure mode. Consequently, risk mitigation measures were not implemented.

Contributing Causal Factors (cont)

Industry Standards

ICC AC58:

Either a design safety factor of 5.33 or a 120-day creep test is required for Fast Set epoxy.

“Given that the ability to sustain a load over a period of time is a typical requirement for almost any type of fastener, the Safety Board is concerned that the ICC has allowed creep testing of epoxy adhesives to be optional. A design engineer should be provided with all of the relevant information about a product before it is used in a safety critical application.”

Consequently…

- To support product qualification, the supplier provided an Evaluation Report (ER) which included bond strength tables specifying a safety factor of 5.33 for Fast Set epoxy, not the results of creep tests*.
- Nothing in the ER tables or footnotes indicated that the Fast Set epoxy should be limited to use with short-term loads regardless of the safety factor employed.

* The Safety Board learned during the investigation that Fast Set epoxy had been tested for creep performance in 1995 and 1996 and had failed to meet the standard.

Contributing Causal Factors (cont)

Product Qualification

No documentation was provided by the supplier which specified which epoxy formulation was supplied, and neither the contractor nor the design consultant questioned which epoxy was used. Both assumed that the epoxy provided by the supplier was suitable.

The Safety Board found fault with the construction contractor and the design consultant for not adequately reviewing the product qualification documentation.

Contributing Causal Factors (cont)

“The supplier should have made a clear distinction in all its literature between the relative capabilities of its Standard and Fast Set formulations. It did not do so, even though before the epoxy was provided, the company had conclusive evidence that its Fast Set epoxy was susceptible to creep.”

“The Safety Board concludes that the information provided by the supplier was inadequate and misleading.”

The Gift…

On September 9, 1999, a construction contractor employee installing ventilation ductwork over the tunnel ceiling noticed that several of the anchors had begun to pull out.

…and the Smoking Gun

On November 12, 1999, a proof load test was performed on one of the anchors that had shown significant displacement (9/16”). The engineer noted that “the bolt held for a few seconds, then began to pull out with almost no resistance”.

The Supplier’s Response

When the supplier was called to examine the anchor displacements, they seemed surprised that the anchors that had been successfully proof tested only a few months before could be failing. Installation problems (e.g., excessive preload) were postulated as the cause.

No evidence was found that the supplier took any follow-up action after the examination.
- No further testing
- No further research

“At least some supplier officials were aware that their Fast Set epoxy was subject to creep, but this information was apparently not considered or was not known by the representatives who evaluated the failed anchors. Even if the information about poor creep resistance was not common knowledge, a reasonable amount of research would likely have revealed it. The Safety Board would have expected the supplier of a safety critical component to have been more proactive in determining why its product was failing.”

The Builder, Design Agent, and Project Manager Reply

Increased proof load testing

The root cause for the hanger displacement was never clearly identified…

…and surveillance monitoring inspections were never implemented

“The project managers apparently accepted at face value the catalog load capacities provided by the supplier, and performed no independent testing to verify that … the anchors would perform similarly in this particular application.”

NTSB Accident Report

[Chart, built up over four slides: anchor load in lb-force (axis 0 to 20,000) for Design Service Load, Post Installation Proof Test, and Finite Element Analysis]

- Calculated design service load: 2,600 lb-force.
- After each bolt was installed, a proof test was conducted at 25% higher than design service load (3,250 lb-force).
- Later, after slippage was noted, bolts were proof tested to the maximum allowable load (6,350 lb-force).
- A finite element analysis determined that the load would be between 2,371 and 2,823 lb-force.
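The load figures in the chart can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted on the slides; the constant and function names are illustrative.

```python
# All loads in lb-force, as quoted on the slides.
DESIGN_SERVICE_LOAD = 2600
INITIAL_PROOF_LOAD = 3250
REVISED_PROOF_LOAD = 6350
FEA_LOAD_RANGE = (2371, 2823)

def ratio_to_design(load):
    """Express a load as a multiple of the calculated design service load."""
    return load / DESIGN_SERVICE_LOAD

initial_ratio = ratio_to_design(INITIAL_PROOF_LOAD)
revised_ratio = ratio_to_design(REVISED_PROOF_LOAD)
fea_low_ratio, fea_high_ratio = (ratio_to_design(x) for x in FEA_LOAD_RANGE)

print(f"initial proof test: {initial_ratio:.2f} x design service load")
print(f"revised proof test: {revised_ratio:.2f} x design service load")
print(f"finite element range: {fea_low_ratio:.2f} to {fea_high_ratio:.2f} x design service load")
```

The first ratio confirms the "25% higher" figure, and the upper end of the finite element range sits slightly above the calculated design service load. None of these ratios says anything about creep, because every proof test was a short-term load.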

“You’ve noted the key piece of information that is missing. That is the cause of the anchor failure and how the repair procedure will overcome that… We are not trying to hold up construction, we are trying to make a determination that the installation is safe…”

Design Manager e-mail concerning response to Deficiency Report

Where is the voice of Quality?

“Glaringly absent from the Deficiency Report is any explanation why the anchors failed and what steps are proposed to ensure that this problem does not reoccur.”

Structural Engineer e-mail reply

The Gift (part 2)

On December 17, 2001, a quality control inspector submitted a Noncompliance Report which stated:

“Several anchors appear to be pulling away from the concrete. The subject anchors were previously tested to the revised value of 6350 lbs, all of which passed…. Reason for failure is unknown.”

“At this point, it should have been obvious... that the remedy that had been developed in response to the anchor displacement in the HOV tunnel in 1999 had not been effective, as anchors that had passed proof testing at higher values were still displacing. This was another opportunity to …inspect all the installed anchors to determine the extent and, more importantly, the cause of the anchor displacement. Instead, the companies apparently considered the continuing failures as isolated instances and took no action to address the problem in a systemic way.”

NTSB Accident Report

Contributing Causal Factors (cont)

Lack of Awareness

“This accident investigation revealed a striking lack of awareness among designers, contractors, managers, and overseers about the nature and performance of polymer adhesives, even as those adhesives were being approved for use in applications where a failure would present an immediate threat to the public.

Even after being presented with evidence of anchor creep, project managers and overseers failed to recognize the inherent weakness in the epoxy adhesive – a weakness that could not be overcome even with the best installation practices or the most rigorous short-term proof testing.”

Cognitive Dissonance

• A psychological term describing the uncomfortable tension that may result from having two conflicting thoughts at the same time, or from engaging in behavior that conflicts with one's beliefs, or from experiencing apparently conflicting phenomena.

• In simple terms, it can be the filtering of information that conflicts with what you already believe, in an effort to ignore that information and reinforce your beliefs.

Wikipedia

Contributing Causal Factors (cont)

Tunnel Inspections

• In November, 2003, the Design Agent published Inspection Manual for Tunnels and Boat Structures. The manual required each ceiling hanger component to be inspected visually or by NDT.

• From the time the tunnel was opened to traffic until the day of the fatal accident, no tunnel inspections were performed.

• Post-accident inspection of the suspended ceiling showed that large numbers of anchors had become displaced (~25%), and that the displacement was so obvious that even a cursory examination would have revealed that structural integrity was threatened.

“Investigators asked MTA officials why the inspection manual was not used. The officials stated that the inspection manual was not used because:

(1) a tunnel inspection database needed to be developed,
(2) the inspection manual was being reviewed by the FHWA and the MTA,
(3) MTA personnel needed time to be trained on the manual.”

NTSB Accident Report

Tunnel Inspections (cont)

The FHWA National Bridge Inspection Program (NBIP) mandates bridge inspections at least once every two years.

There are no similar mandates for tunnel inspections.

The cost of quality….

Destination Disaster

On March 3, 1974, Turkish Airlines Flight 981, on a routine flight from Paris to London, crashed in a dense forest in France, resulting in the loss of all 346 persons aboard. The proximate cause of the accident was determined to be a faulty latch on its aft cargo door.

At 11,500 feet, the differential pressure in the cabin caused the door to open and be blown off by the air stream.

The large hole suddenly appearing in a pressure hull created an outward acceleration of air so rapid as to resemble a bomb explosion.

The explosion destroyed the flooring above the cargo hold, severing the control cables for the rudder, the elevators, and the number two engine.

72 seconds later, Flight 981 slammed into a forest floor at 430 knots.

What really broke?

• System safety

• Personnel competency

• Quality assurance

• Open communication

• Ethical behavior

“The worst accident in the history of aviation was not necessarily the most terrible event that overtook the human race in 1974. There was war in Indochina, famine in sub-Saharan Africa, and disease and violence in a dozen countries. But it was, for all that, something especially horrible- because it was something that need not have happened at all.”

Paul Eddy Destination Disaster

The DC-10 was advertised as “the crowning achievement of the DC line, representing an investment of $1.5B and the product of the most expert, the most sophisticated, and the most rigorous industrial system in existence”.

Contributing Causal Factors

Design Choices

• Door configuration
• Routing of cables, hydraulic lines & wire
• Floor strength
• Latch design
• Cockpit indicator light
• Vent door design

Door Configuration

The traditional approach to eliminate the hazards of door openings is to make each door a “plug”, with the pressure differential making it seat more tightly. Plug doors are heavy, though, and lessen both the passenger and cargo carrying capacity of the plane.

The DC-10 employed an outward opening door design.

Routing of Cables, Wires & Hydraulic Lines

With an outward opening cargo door and the possibility of explosive decompression, the routing of cable, wires, and hydraulic lines from the control surfaces to the flight deck becomes vitally important. The traditional route is along the ceiling of the hull, which is largely protected against such explosions.

DC-10 cabling, hydraulic lines and wires were routed through the floor.

Floor Strength

McDonnell Douglas ignored repeated recommendations to increase the floor strength so that it could withstand an explosive decompression.

Latch Design

“There were multiple complex linkages between the external handle and the locking pin bar which, in aggregate, were far too weak and flexible. Rather than encountering an irresistible force if the locking pins hit the lugs of an unclosed latch, a handler of normal strength could push the handle fully down, thinking that he had thus insured the closing of the door when all he had done was bend the internal bars and rods out of shape.”

https://dl.dropbox.com/u/1176203/DC-10%20Latch%20Animation.wmv

“Unglamorous as the work sounds - and indeed is – the whole business of maintaining human life in the air comes down to thinking and rethinking about curious and fiddlesome problems of this order.”

Paul Eddy Destination Disaster

Cockpit Indicator Light

The cockpit indicator light was operated by the handle, not the locking pins, so when the handle was stowed, the cockpit warning system indicated that the door was properly closed.

Vent Door

The vent door should be the last line of defense indicating faulty door locking. If the pins are not driven home, the vent is supposed to remain open, preventing pressurization of the fuselage.

The DC-10 vent door closure design was driven straight off the locking handle, therefore providing no check at all on the working of the locking mechanism.
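The indicator-light and vent-door logic described above can be sketched as a small state model. This is an illustrative simplification: the class and function names are hypothetical, and the pin-sensed variant is included only as a contrast, not as a description of any actual DC-10 modification.

```python
from dataclasses import dataclass

@dataclass
class CargoDoor:
    handle_stowed: bool   # external locking handle pushed fully down
    pins_engaged: bool    # locking pins actually driven home

def light_shows_closed_handle_driven(door: CargoDoor) -> bool:
    """As described on the slide: the indicator follows the handle position."""
    return door.handle_stowed

def vent_closes_handle_driven(door: CargoDoor) -> bool:
    """As described on the slide: vent door closure is driven off the handle."""
    return door.handle_stowed

def light_shows_closed_pin_sensed(door: CargoDoor) -> bool:
    """Hypothetical contrast: sense the locking pins themselves."""
    return door.pins_engaged

# The failure case: linkage bends, handle goes down, pins never engage.
bent_linkage = CargoDoor(handle_stowed=True, pins_engaged=False)

print("handle-driven light says closed:", light_shows_closed_handle_driven(bent_linkage))
print("handle-driven vent closes:      ", vent_closes_handle_driven(bent_linkage))
print("pin-sensed light says closed:   ", light_shows_closed_pin_sensed(bent_linkage))
```

In this model both handle-driven checks report a safe door in exactly the case they exist to catch, because they observe the operator's input rather than the state of the locking pins.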

“It was, by any sense of safety engineering, a gimcrack piece of design. Yet, because of decisions taken about floor strength and control-cable routes, the safety of every man, woman and child who went aboard the DC-10 was dependent upon the efficacy of the linkages from the moment the plane went into service.”

Paul Eddy Destination Disaster

Failure Modes Analysis

In the summer of 1969, Douglas asked Convair to draft an FMEA for the lower cargo door system of the DC-10. Convair produced a document which accurately foresaw the deadly consequences of a cargo-door latch failure. But neither Convair’s draft FMEA, nor anything closely resembling it, was ever shown to the FAA.

The Applegate Memorandum:

“The potential for long term Convair liability has been causing me increasing concern for several reasons… the airplane demonstrated an inherent susceptibility to catastrophic failure when exposed to explosive decompression of the cargo compartment in 1970 ground tests … It seems to me inevitable that in the twenty years ahead of us, DC-10 cargo doors will come open and cargo compartments will experience decompression and I would expect this to usually result in the loss of the airplane.”

F.D. Applegate, Director of Product Engineering, Convair

The Boss’s Reply:

• Concur
• Sympathize
• Do nothing

1. It would increase Convair’s liability
2. Douglas was actively making engineering decisions to improve the latch
3. The current design satisfied FAA requirements

Besides, he later explained, there was no point in approaching Douglas with information that was already well-known to them.

The Gift:

On May 29, 1970, during ground testing of Ship 1 to prepare it for its upcoming maiden flight, the air conditioning system was being exercised to build up a pressure differential of 4-5 pounds per square inch. Suddenly, the forward lower cargo door blew open causing a large section of the cabin floor to collapse into the hold.

The incident was blamed on human failure.

The Gift (Part 2):

On June 12, 1972, American Flight 96 departed Detroit and was climbing through 11,750 feet when the rear cargo door blew out causing an explosive decompression and loss of flight controls. The crew managed to regain control of the plane and return to Detroit.

“The design characteristics of the latching mechanism permitted the door to be apparently closed, when, in fact, the latches were not fully engaged and the latch lockpins were not in place.”

NTSB Accident Report, 28 February, 1973

McDonnell Douglas attributed the incident almost solely to human failure on the part of the baggage handler and not to any failure on the part of its designers and engineers.

The Captain Speaks:

After his near escape, the captain of AA 96 made a strong recommendation to McDonnell Douglas that every DC-10 pilot be told in detail of the consequences of an explosive decompression and the techniques to recover from such an event. McDonnell Douglas did no such thing.

The NTSB Speaks:

NTSB Accident Report PB-219 370

• The cargo door should be modified to make it physically impossible for the door to be improperly closed.

• The cabin floor should be modified and strengthened to prevent its collapsing after sudden decompression.

The FAA Responds:

The FAA Western Region Office drafted an Airworthiness Directive implementing the NTSB’s recommendations.

If Airworthiness Directives are not complied with, it becomes illegal to fly the airplane.

The Midnight Gentlemen’s Agreement:

The president of Douglas persuaded the FAA Administrator that corrective measures could be undertaken as a result of a gentlemen’s agreement, thereby not requiring the issuance of an FAA Airworthiness Directive.

“When you have a well-constructed state with a well-framed legal code, to put incompetent officials in charge of administering the code is a waste of good laws, and the whole business degenerates into farce.”

Plato Laws (Book IV)

Per the Gentlemen’s Agreement, Douglas issued two service bulletins:

1. Install a peephole and a decal showing diagrammatically what the handler would see if the locking pin was safely home. Issued as a Safety Alert.

2. Install a support plate to hold up the torque tube just inside the handle. Issued as a routine service bulletin.

“Because history is an unrepeatable experiment, we cannot prove that the extra urgency, legal weight, and publicity which go with Airworthiness Directives would necessarily have made the difference. But the crucial point is the determination on the FAA Administrator’s part that the Douglas company itself could be left to handle the matter in its own way.”

Paul Eddy Destination Disaster

Subterfuge:

The day after the Windsor incident, the FAA contacted Douglas to find out if there had been previous problems with the cargo doors. The company failed to hand over relevant airline operating reports and stated that there had only been a few “minor problems”. Later, upon the FAA’s insistence, Douglas handed over records showing that during the 10 months of DC-10 service, there had been approximately 100 reports of doors failing to close properly.

Meanwhile at the Douglas Plant:

• Orphan plane

• No independent Government oversight

• A lack of urgency

• Lack of personal accountability

• Fraud

Lack of urgency and accountability:

Ship 29 emerged from the production line on April 5, 1972. In the period between completion and its eventual sale to Turkish Airlines, some 300 routine service changes were made in the DC-10. These were the results of the operating experience of airlines which had the plane in service and, for the most part, were trivial adjustments.

Amid this flood of minutiae, there were changes listed in Bulletin 52-37 to honor the “gentlemen’s agreement”.

Service Bulletin 52-37:

1. The locking pins of the cargo door were to be adjusted for greater travel.

2. A support plate was to be installed to prevent the upper torque tube from bending under pressure.

3. The tube carrying the locking pins was to be modified so that its new and longer travel would not make it give a false signal.

Of these three modifications, only the third one was accomplished, and on its own it was meaningless. But the aircraft’s records at the plant said that all three jobs had been done, inspected, and passed as adequate.

A Clear Case of Fraud:

Planning Department records clearly show that on July 18, 1972, three inspectors seemingly applied stamps indicating that the support plate had been installed and the lock tube had been modified. These three men were brought forward and examined under oath. It emerged that not one of these three could recall having worked on the cargo door of any DC-10 at any time. Nor could they recall on any occasion whatever on which they had worked together.

Douglas maintained to the end that human error must account for the falsity of the records.

“And so, because of the interrelated failures of McDonnell Douglas, Convair, and the FAA, a fundamentally defective airplane continued on its way through the stream of commerce.”

Paul Eddy Destination Disaster

Problems in Istanbul:

• Neither the flight engineer nor a ground engineer checked to ensure that the cargo door was properly closed.

• The man who closed Flight 981’s doors had not been advised on the meaning, use, or importance of the indicator window.

• The man was Algerian and the placards on the window were in French and English - he could read neither.

A glance in the window would have shown that the latches were not fully stowed.

Current & Emerging Quality Threats - The Enemy Knocking -

• Electrostatic Discharge
• Water Soluble Flux
• Metal Whiskers
• REACH Legislation
• Suspect Titanium
• Counterfeit Parts
• Commercial Space

Electrostatic Discharge - Failure Concerns -

• A controlled ESD event was generated to show a failure in process on a film resistor.
• Open shutter photography was used to capture the arc location of the ESD event as it happened.
• After the event, an SEM photograph shows “cracking” at the tip of the laser kerf and internal arcing damage between the bond pad and laser kerf.

Actual ESD Failure

• A hybrid electronic module failed during ground level testing in a Hubble gyro.

• Failure Analysis traced the failure to a single 1500 ohm thin-film resistor damaged by ESD.

• Relaxed ESD requirements at any handling level (manufacturer or distributor) can cause latent failures.

[Figure labels: Resistor Network, Failure Site]

Water Soluble Flux - Voiding Concerns -

Use of Water Soluble Flux is rapidly increasing due to cost of cleaning rosin flux in “green” business environment. WSF can result in solder voiding, potentially reducing joint strength.

Known Unknowns

What attributes differentiate problematic voids from no-impact voids?
How can solder joints be screened for macro-voids?
What are the process parameters to prevent them?

Water Soluble Flux - Cleanliness Concerns -

• Un-reacted flux constituents can corrode solder & plating

• Ions + Water + Potential difference (V)
  → Electromigration of metal causing shorts
  → Conduction through electrolyte
  → Conduction through formed metallic salts

• Un-reacted WSF is source of water
• Multiple sources of ions (“dirty” boards/parts)

Nondestructive cleanliness screening tests not currently available. Must use process-based quality controls.

Metal Whiskers

Shorts/Foreign Material Concerns

[Photos: Tin Whiskers on Variable Air Capacitor; Zinc Whiskers on Galvanized Steel Pipe; Tin Whiskers on Non-Conformal Coated Card Rail]

- Increasing use of lead-free materials driven by EU green legislation
- Solder joints & materials containing lead do not develop whiskers
- There is no single, effective whisker mitigation approach

Lead-Free Solder

New lead-free solders are introduced continuously …

Known Unknowns

– How consistent/reliable are these complex alloys?
– Alloys have large grains, so each solder joint is different.
– What are the impacts of mixing with other alloys during repair?
– Performance is application dependent.

Registration, Evaluation, Authorization and Restriction of Chemicals… REACH

REACH is European Union (EU) legislation enacted in 2007 to improve the protection of the environment and human health by better regulation of chemicals.
- Based on the “precautionary principle”: harm is presumed until evidence demonstrates otherwise.
- All chemicals manufactured or imported into the EU are required to be registered and assessed.
- Substances shown to pose an unacceptable risk can be restricted in their manufacture or use.

REACH Concerns

REACH legislation will result in significant material unavailability/ substitutions.

Material changes potentially harmful to product performance.

Many Unknowns- Solutions have not yet been fully explored.

Need for rigorous product qualification, notification of change, and testing requirements.

Suspect Titanium

Federal investigators reported in 2009 that suppliers were supplying titanium with degraded mechanical properties.

Titanium bar and plate were “sliced” from raw material billets and sold as finished material, without being mechanically strengthened by forging or rolling and heat treatment operations.

Degraded properties included tensile strength, elastic modulus, and crack growth rate.

Counterfeit Electronic Parts

Demonstrated Reliability vs Insight/Oversight