PQM201B Student Book

Case Studies In Quality Manufacturing

Given case study examples of PQM processes with troublesome outcomes, evaluate facets of existing QMS elements. Relate discussed PQM topic areas to each case and identify opportunities to apply them.

• What failures of the system’s quality management system are evident?

• Was the failure (A) Design related, (B) Workmanship related, (C) Material related, or (D) a combination?

• What processes / components were Key to the item? Critical to the item?

• What corrective actions (if any) were incorporated as a result of the situation?

• How might the following PQM topic areas apply?
    • Integrated Manufacturing Planning
    • Continuous Process Improvement
    • Lean Manufacturing

Teams should consider their assigned cases with these questions in mind. You are encouraged to consult other references and information when exploring these cases. After the lecture portion of the Quality Management lesson, your team will be given a discussion period to share impressions of your assigned case among yourselves. Each team will then present a summary of its findings to the class to conclude the lesson.

v16.2 195 PQM201B Student Book

The Loss of the USS THRESHER

On April 10, 1963, the nuclear submarine USS THRESHER failed to surface from a test dive and was lost at sea.

On the morning of April 10, the THRESHER proceeded to conduct sea trials about 200 miles off the coast of Cape Cod, MA. At 9:13am, the USS SKYLARK received a signal indicating that the submarine was experiencing “minor difficulties.” Shortly afterward, the SKYLARK received a series of garbled, undecipherable message fragments from the THRESHER. At 9:18am, the SKYLARK’s sonar picked up sounds of the submarine breaking apart. All hands were lost: 129 lives.

The Investigation

The subsequent investigation of the disaster by the Navy identified a leak in an engine room seawater system as the most probable cause of the tragedy. Further, both the Navy’s investigation and a Congressional inquiry identified several additional probable causes linked to management, communication, and the practices and procedures employed by the Navy and the shipyards.

The THRESHER was the first of a new class of nuclear submarine designed to dive significantly deeper than its predecessors. After nearly a year of record-breaking operations, the submarine underwent a scheduled shipyard overhaul that entailed significant alterations to its hydraulic power plant. Because of Fleet operational requirements and competition for resources with four other submarines under construction in the same shipyard, the overhaul was conducted under tight schedule constraints.

The Navy’s investigation concluded that while the THRESHER was operating at test depth, a leak had developed at a silver-brazed joint in an engine room seawater system, and water from the leak may have short-circuited electrical equipment, causing a reactor shutdown and leaving the submarine without primary and secondary propulsion systems. The submarine was unable to blow its main ballast tanks, and because of the boat’s weight and depth, the power available from the emergency propulsion motor was insufficient to propel the submarine to the surface.

Practices and Procedures

After the investigation, the Navy embarked on an extensive review of practices and procedures in effect during the THRESHER’s overhaul. The reviewers determined that the standards existing at the time were not followed throughout the re-fit to ensure safe operation of the submarine. Five issues were of particular concern:

Design and Construction: The submarine was designed and built to meet two sets of standards. Because the submarine’s nuclear power plant was the focus of the engineers, the standards used for the nuclear power plant were more stringent than those for the rest of the submarine. As a result of the emphasis placed on nuclear-related aspects of the design, builders assigned less importance to the steam and saltwater systems, even though those systems were crucial to the operation and safety of the vessel.

Brazing: Two standards for silver-brazing pipe joints were used during the THRESHER’s construction and overhaul. Brazing is a process that joins metal parts by heating them to a temperature sufficient to melt a filler material, which then flows into the space between the closely fitted parts by capillary action. Induction heating, which provides better joint integrity, was used for easily accessible joints. Where accessibility was restricted, hand-held torches were used. Reviewers determined that hand-held torches were used to heat many of the THRESHER’s crucial, but less accessible, pipe joints.

Quality Assurance: A newly accepted nondestructive testing technology for quality assurance was not implemented for the THRESHER’s overhaul. The Navy had experienced a series of failures with silver-brazing, which resulted in several near misses, indicating that the traditional quality assurance method, hydrostatic testing, was inadequate. Therefore, the Navy directed the shipyard to use ultrasonic testing, a method newly accepted by industry, on the THRESHER’s silver-brazed joints. However, the Navy failed to specify the extent of the testing required and did not confirm that the testing program was properly implemented. When ultrasonic testing proved burdensome and time consuming, and when the pressures of the schedule became significant, the shipyard discontinued its use in favor of the traditional method. This action was taken despite the fact that 20 of 145 joints passing hydrostatic testing failed to meet minimum bonding specifications when subjected to ultrasonic testing.
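The 20-of-145 figure above can be put in perspective with a little arithmetic. The following is a minimal Python sketch, not part of the Navy's analysis: it computes the observed fraction of joints that passed hydrostatic testing but failed ultrasonic testing, along with a Wilson score interval for that fraction. The 95% confidence level, the helper name `wilson_interval`, and the treatment of the 145 joints as a random sample are assumptions made only for illustration.

```python
import math

def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = failures / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Figures from the case study: joints that passed hydrostatic testing
# but failed minimum bonding specifications under ultrasonic testing.
failed_ut = 20
tested = 145

rate = failed_ut / tested
low, high = wilson_interval(failed_ut, tested)  # assumes a random sample

print(f"Observed substandard fraction: {rate:.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions, roughly one joint in seven that the traditional method accepted did not meet the bonding specification, which is the kind of evidence that should have weighed against dropping the newer test.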

Records and Documentation: Records were found to be incomplete or nonexistent for a substantial amount of the work, including critical practices and critical methods.

Procurement: Finally, specifications for Government procurement were not strictly enforced. The Navy found that the reducing valve components installed in the pressurized air systems used to blow the main ballast tanks of the submarine did not meet design specifications. Because of the magnitude of the pressures anticipated, the valve manufacturer had added a strainer feature upstream of the reducing valves to protect the sensitive valves from particulate matter. When the Navy conducted tests on another THRESHER-class vessel, it found that the pressure drop across the component at high flow rates caused entrained moisture to accumulate on the strainers and form enough ice to block the air flow. Venturi cooling, as this phenomenon is called, was thought to be the reason that the THRESHER’s attempts to blow its main ballast tanks were ineffective.

After an extensive underwater search utilizing the bathyscaph TRIESTE, oceanographic ship MIZAR and other ships, THRESHER’s scattered remains were located on the sea floor, some 8400 feet below the surface. Deep sea photography, recovered artifacts and an evaluation of her design and operations permitted a Court of Inquiry to determine that she probably sank due to a piping failure, subsequent loss of power and inability to blow ballast tanks rapidly enough to avoid sinking. Over the next several years, a massive program was undertaken to correct design and construction problems on the Navy’s existing nuclear submarines, and on those under construction and in planning. Following completion of this “SubSafe” effort, the Navy has suffered no further losses of the kind that so tragically ended the THRESHER’s brief service career.

Lessons Learned:

(1) Engineering, design and construction must place equal weight on nuclear and non-nuclear systems when the operation of either system can affect the safety or integrity of the overall system.

(2) In selecting the standard to which a task is performed, the pressures of time and resources should not override the safe and continued performance of the result. Selecting the easy standard to save time and money increased the probability of a failed joint.

(3) Communication of near-miss events by management to the various departments, along with feedback, helps resolve weaknesses or flaws that could prove tragic in future events.

(4) Procured equipment and components must be checked upon receipt and tested under operating conditions to verify their suitability. Valves or other parts could be assembled with counterfeit bolts, which fail when stressed.

The M-16 Rifle and Ammunition System

In 1957, Eugene M. Stoner, a skilled civilian engineer, was commissioned to develop a shoulder-fired weapon that weighed no more than seven pounds and was capable of automatic as well as semi-automatic firing. In less than a year, he delivered a prototype of the weapon to the Army at Fort Benning, Georgia, where it was tested thoroughly. The Army found the rifle, which was named the AR-15, to be equal to its own M-14 in firing at distances of up to five hundred yards. The AR-15 was found to be superior to the M-14 in weight, in ease of automatic firing without climbing, and in the weight of its ammunition, which allowed a soldier to carry more rounds without a weight increase. After months of testing, the United States Continental Army Command Board recommended that the AR-15 rifle be adopted to replace the M-1 rifle, of World War II fame, as the Army's standard basic infantry weapon.

The recommendation was not adopted, and it was not until 1962 that 1,000 of the rifles were sent to Vietnam for months of testing in the hands of United States Advisors and Vietnamese soldiers. This was accomplished over the objections of the Department of the Army by the direct intervention of Robert McNamara, the Secretary of Defense. These tests in Vietnam proved to be the publicity needed to persuade the Air Force and the Navy to ask for initial purchases of the weapon, in order to equip their personnel serving in Vietnam. Following the Air Force and Navy requests, Army General Paul Harkins was so impressed with the test results that he placed an order, in the summer of 1962, requesting 20,000 of the rifles for use by United States and Vietnamese troops.

The Army Staff resisted the change and was reluctant to adopt the AR-15 in lieu of the more conventional M-14 rifle. Again, Secretary of Defense McNamara, who was also impressed by the test results showing the AR-15 to be superior to the M-14, intervened and forced the services into a compromise. This compromise resulted in the Army placing an order in 1963 for 85,000 AR-15 rifles to be used by its troops in Vietnam, while keeping the M-14 rifle as the standard weapon for its other troops stationed in the United States and in Europe.

From its first introduction into Vietnam in 1962 until 1966 the rifle, now termed the M-16, enjoyed a reputation as an extremely lethal and dependable weapon among the soldiers using it in combat. In 1966 a jamming malfunction with the M-16 rifle became commonplace. This malfunction consisted of the failure of the rifle to extract a fired cartridge case. The extractor would grip the rim of an expended cartridge, but instead of pulling the cartridge from the chamber of the barrel as the bolt moved to the rear, it would tear a portion of the rim from the cartridge, leaving the cartridge in the chamber. The soldier then had to insert a cleaning rod into the muzzle end of the barrel and force the fired cartridge from the chamber, thus clearing the weapon so that it could be fired again. Confidence among combat troops soon reached such a low level that 1/5th Mech. combat troops began arming themselves with whatever other weapons were available. These included rifles, pistols, shotguns, sub-machineguns and whatever else could be scrounged.

Why did a weapon that enjoyed a reputation of reliability in combat suddenly begin to malfunction? Almost as perplexing is the question of why in late 1967, the rifle again began to live up to its old reputation of reliability and the malfunctions ceased.

In May of 1967, after numerous complaints had been received by members of the United States Congress regarding the malfunctioning of the M-16 rifle in Vietnam, a special subcommittee of the Congressional Armed Forces Committee began to investigate the allegations.

In Vietnam, when the malfunction started to make its appearance and the combat soldier started asking why, he was told that it was his fault because he was not keeping his weapon clean. A further complication at the time was that there were two types of ammunition available. The IMR and the Ball Propellant became mixed as the Ball Propellant was being introduced and the IMR was being used up. Then it was said that the weapon needed a new buffer and that would cure the problem. With the new buffer the malfunction continued, and again the soldier was told it was his fault because he was not properly cleaning his rifle. The Army tried to blame him and the rifle. As it turns out, the blame for the malfunction rested with neither the soldier nor the M-16 rifle. It rested with the manufacture of 5.56 mm. ammunition with ball propellant, because it was cheaper than using IMR extruded propellant, and there was a huge surplus of old artillery powder from which ball propellant was manufactured.

Excerpts from the “M-16 Controversies” by Thomas L. McNaugher ©1984:

Testing the rifle using ammunition loaded with the powder Stoner and Remington Arms had agreed upon in 1957, test personnel discovered a reliability rate of .91 malfunctions per 1000 rounds fired. Switching to the newer powder used to load cartridges since 1963, they encountered 5.60 malfunctions per 1000 rounds fired. Though neither malfunction rate was terribly high, the correlation between powder and malfunctions was unmistakable.
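The size of the difference between the two quoted malfunction rates is easier to judge as a ratio. The short Python sketch below is only an illustration built on the two figures in the excerpt; the function name `expected_malfunctions` and the 300-round quantity are hypothetical values chosen for the example, not data from the tests.

```python
# Malfunction rates quoted in the excerpt (malfunctions per 1000 rounds fired).
STICK_RATE = 0.91   # original powder agreed upon in 1957
BALL_RATE = 5.60    # newer powder used since 1963

def expected_malfunctions(rate_per_1000, rounds):
    """Expected number of malfunctions for a given number of rounds fired."""
    return rate_per_1000 * rounds / 1000

ratio = BALL_RATE / STICK_RATE
print(f"Ball powder malfunction rate is about {ratio:.1f} times the stick powder rate")

# Hypothetical round count, chosen only to make the rates concrete.
rounds_fired = 300
for name, rate in (("stick", STICK_RATE), ("ball", BALL_RATE)):
    expected = expected_malfunctions(rate, rounds_fired)
    print(f"{name} powder: about {expected:.2f} expected malfunctions in {rounds_fired} rounds")
```

Both rates are low in absolute terms, as the excerpt notes, but the roughly sixfold increase is what made the correlation between powder and malfunctions hard to dismiss.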

Test personnel related the higher malfunction rate to two characteristics of the new propellant. On the one hand, the new powder, called “ball” powder because of the spherical shape of the granules, caused more visible fouling (build up of sooty dirt) in the rifle’s chamber than the Dupont IMR4475 “stick” powder (so called because its granules took the form of tiny sticks) originally used in the cartridge. On the other hand, ball powder offered different pressure characteristics than did stick powder. It thus tended to drive the rifle’s cyclic rate (the rate at which it reloaded and fired rounds automatically) over the limits Stoner had originally prescribed for it. There was no way to separate the effects of the increased cyclic rate from those caused by fouling in the chamber. Test personnel could only conclude that the two characteristics of ball powder together “caused, complicated and multiplied” a malfunction syndrome consisting largely of feed and ejection failures plus the failure of the bolt to remain locked to the rear after the last round of each magazine had been fired.

Though the formal test results were not published until the summer of 1966, test personnel telephoned their findings to the Army’s Technical Coordination Committee (TCC) in November 1965. The test activity also checked with Colt’s to see why these malfunctions had not surfaced in the reliability portion of that firm’s factory acceptance tests. They were surprised to discover that Colt’s conducted these tests using the ammunition loaded with the original stick powder, even though most of the ammunition being shipped to troop units – including those entering Vietnam – contained the ball propellant. Colt’s did so with the full concurrence of the TCC.

In October 1966 the TCC sent a high-level team of Army and Colt’s personnel to Vietnam to document this new turn in the rifle’s fortunes. Going directly to the units which had complained about jamming, the team found the rifles in these units in unbelievably shabby condition. One expert’s comment:

“I had never seen equipment in such poor maintenance…on some rifles you could not see daylight through the barrel. The barrels were rusted, and the chambers were rusty and pitted.”

Most of the jamming reports came from units issued the rifle after arriving in Vietnam. Units issued their M16s in the United States and trained with them there before heading into combat never experienced significant problems with the rifle. By contrast, soldiers issued their M16s at virtually the same time they entered combat hardly knew how to clean the piece. One expert commented, “Many of them said they had never been taught the maintenance of the rifle.” In addition, though sufficient cleaning rods and patches were in the theater, they had not been distributed to the individual level.

Though principal blame for the rifle’s problems was assigned to the powder switch, [the DoD] could hardly ignore the role maintenance failures had played. “Proper care and cleaning [were clearly] of utmost importance to the effective operation of the rifle.” The Committee found that “shortages of cleaning equipment, lack of proper training and instructions contributed to the excessive malfunction rate of the M16 rifle in Vietnam.”

Steam Valve Failure Aboard USS IWO JIMA

Anatomy of a Catastrophic Boiler Accident

By David G. Peterson, Loss Control Inspector, Indiana Insurance Company

©2002, The National Board of Boiler and Pressure Vessel Inspectors. All rights reserved.

In October 1990, as the United States prepared for war with Iraq, the majority of our U.S. Naval fleet was in the Persian Gulf. Among the ships was the USS IWO JIMA. The IWO JIMA, built in the mid 1950s, is an aircraft carrier that carries helicopters, in addition to a large number of Marine ground force troops.

Routine Repairs Turn Tragic

By mid October, the IWO JIMA had been operating in the Persian Gulf for approximately two months and had developed some leaks and other repair needs in the ship's 600 psi steam propulsion plant.

The ship was granted permission to dock in Bahrain (a country on a group of islands in the Persian Gulf, between Qatar and Saudi Arabia) and conduct repairs. A variety of maintenance items were planned, including overhauling the main steam valve that supplies steam to one of the ship's turbine-driven electrical generators. This valve, incidentally, could also be considered a boiler-boundary stop valve. The overhaul of this large, rising-stem, bolted-bonnet gate valve was contracted to a local ship repair company, under the supervision of U.S. Government inspectors. Towards the end of October all repairs had been completed and the ship was ready to rejoin the fleet. Following a couple of days of steaming in port and equipment testing, an early morning light-off of the second boiler was planned in preparation for an 8:00 a.m. underway time.

All seemed to be going well, and shortly before getting underway the engineers started their second turbine generator as a normal part of the ship's restrictive maneuvering policy. (Note: when a ship is leaving or entering port, or in an otherwise hazardous navigation situation, additional equipment is operated and additional control is assumed by the ship's bridge.) However, because of its location, this stand-by generator was rarely operated, thus the testing of its steam supply valve may have been overlooked. Consequently, very shortly after getting underway, and while the ship was still in the harbor, the recently overhauled valve began to leak.

Within moments the valve was leaking badly. The engineers reported their concern to the ship's captain on the bridge.

Before the captain could stop and anchor the ship, the bonnet completely blew off the valve, dumping the steam from two large power boilers into the boiler room. All 10 of the engineers in the boiler room were killed. A few somehow managed to escape the steam-filled boiler room, only to suffer for a few additional hours before finally succumbing. Amazingly, the ship managed to drop its anchor and safely stop. Later that day, tugboats pushed the IWO JIMA back into port.

The Role of Inspector: A Life-Changing Event

At the time of the accident, I was assigned as boiler inspector of the Commander of Amphibious Squadron Twelve. The IWO JIMA was one of the ships that made up our squadron.

Called upon to help oversee the enormous repair efforts that would be necessary, I was part of a team which helped train the new, all-volunteer crew to replace those who had perished. Witnessing this physical devastation and how emotionally affected the surviving crew members were made this assignment, by far, the most difficult of my naval career.

It's virtually impossible to describe the scene upon entering the boiler room for the first time. It was a very eerie white color. This occurred because the velocity of the steam had sandblasted the insulation off of nearby pipes, and evenly spread the insulation onto every surface, and into every crack or crevice virtually throughout the boiler room. Glancing around the room from corner to corner, it stimulated a very scary, weary feeling in me and others, as we could picture the terror the operators might have felt as they desperately tried to shut off the boilers and escape the heat.

Reaching beyond emotional factors, we knew the clean-up alone was to be a mammoth undertaking. There was no question we would have to disassemble and clean almost every piece of major machinery in the boiler room. However, the more we thought about the incident, the more difficult it was to believe that something this devastating could occur. There was no doubt that aboard that ship there were qualified contractors, inspectors, and operating personnel.

Speculating the Cause of the Incident

Ultimately, the cause of this catastrophic accident was the installation of nuts made of an improper material for the job. These nuts were used on the bolts fastening the bonnet to the valve's body.

When the second turbine generator was started, this valve was opened to allow steam at 600 psi and 850°F to pass. As the valve got hot, the nuts expanded at a much greater rate than the bolts, and they lost the strength to contain the steam's force under the bonnet.

This tragic accident should not have happened. The job specifications required the contractor to provide and verify all materials. Additionally, hold-point inspections were to be conducted by government inspectors and the ship's engineers. Obviously, something went very wrong.

In our society when something goes wrong it's usually followed by an investigation to determine what went wrong. At the center of the investigation concerning those directly involved were: the mechanic who performed the repairs, for not following his work specifications; the authorized inspection agency, for not conducting hold-point inspections; and the boiler engineer in charge of the boiler room, for not conducting inspections.

Beyond this, it should also be mentioned that the initial charges being considered were involuntary manslaughter. However, no willful neglect was ever found, and these charges were never filed. But this does show how serious the ramifications can become. Ultimately, what was found was that a series of mistakes and misunderstandings led to this accident.

First, the mechanic wanted to replace the fasteners, but he did not have any. He also did not speak English very well. Allegedly, the mechanic asked one of the boiler room personnel for new nuts and bolts, and was given permission to look through the boiler room's spare parts bins. He selected parts that he thought would work.

Secondly, with the case of the authorized inspection agency, it must be taken into consideration that there was a massive build-up of troops and ships in the Persian Gulf. It was assumed that the office was not staffed for the increased workload. Consequently, many inspections were not being conducted.

And finally, the boiler room supervisor thought that one of his subordinates had conducted an inspection. Sadly, it was never concluded that an inspection had been made. If someone had conducted an inspection, possibly he was not familiar with fastener markings, nor the job specifications.

The end result was that incorrect fasteners were chosen, and controlled inspections and testing were not fully accomplished.

**************************************************************************************************************

NavShipsTechManual (http://www.fas.org/man/dod-101/sys/ship/nstm/ch075.pdf) comments: "... Black Oxide Coated Brass Threaded Fasteners. Most of the brass fasteners in the supply system are black oxide coated. This presents a potential for improper installation, particularly in place of steel fasteners which may also be black oxide coated. Not only are the brass fasteners of significantly lower strength, but they decrease rapidly in strength at temperatures over 250°F. In October 1990, black oxide coated brass nuts were incorrectly used to repair a steam valve, resulting in a casualty which killed a number of sailors. As a preventive measure, NSN’s have been established for shiny brass nuts of the sizes of black oxide coated nuts most likely to pose a hazard due to incorrect substitution aboard ships."

(c) re: high temps, NSTM ch075.pdf states: "Use ASTM A 193 grade B16 alloy steel externally threaded fasteners and ASTM A 194 grade 7 nuts at temperatures up to 1,000°F. If corrosion is a problem, ASTM A 453 grade 660 stainless steel fasteners provide corrosion resistance up to 1,200°F. If coated fasteners are unavoidable in high temperatures, take into account the temperature resistance of the coating. See Table 075-3-1 for temperature limitations on specific fasteners."

(c) fastener marking charts (incl. nstm ch 075.pdf):
- http://www.hudsonfasteners.com/fast_guide/fg_grade_markings.htm
- http://www.textronfasteningsystems.com/eng_tools_f/grades.html
- http://www.americanfastener.com/techref/grade.htm

(d) re: counterfeit fastener detection:
- www.nsls.bnl.gov/organization/ESH/QA/documents/SCI/SCIAwarenessTrainingManual20040512.pdf
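The temperature figures in the NSTM excerpt can be set against the steam conditions from the accident narrative. The Python sketch below is an illustration only, not an engineering tool: the table and the function `suitable_for` are hypothetical, the narrative's "850 degrees" is taken here as degrees Fahrenheit to match the NSTM units, and the 250°F brass entry is the temperature above which the excerpt says strength decreases rapidly rather than a formal rating. Actual fastener selection would follow Table 075-3-1 and the job specifications.

```python
# Temperature guidance taken from the NSTM excerpt (degrees Fahrenheit).
# The brass value is the point above which strength "decreases rapidly",
# used here as a conservative stand-in for a limit (an assumption).
FASTENER_TEMP_GUIDANCE_F = {
    "ASTM A 193 grade B16 bolts with ASTM A 194 grade 7 nuts": 1000,
    "ASTM A 453 grade 660 stainless steel": 1200,
    "brass (black oxide coated or shiny)": 250,
}

def suitable_for(material, service_temp_f):
    """Return True if the service temperature is within the quoted guidance."""
    return service_temp_f <= FASTENER_TEMP_GUIDANCE_F[material]

# Steam temperature from the accident narrative, assumed to be Fahrenheit.
STEAM_TEMP_F = 850

for material, limit_f in FASTENER_TEMP_GUIDANCE_F.items():
    verdict = "within guidance" if suitable_for(material, STEAM_TEMP_F) else "NOT suitable"
    print(f"{material}: guidance {limit_f} F, service {STEAM_TEMP_F} F -> {verdict}")
```

Even this crude comparison flags brass nuts as unsuitable for the valve's service conditions, which is why a check of fastener markings at a hold-point inspection mattered.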

The F-22 Raptor Software

Flight Control Software

Embedded software is used in nearly every subsystem of every major weapon system in use in the United States Air Force today. The use of embedded software has increased drastically over the past twenty years. The F-111 accomplished 20% of its functions using software in its original design, in contrast to the B-2, which accomplishes 80% of its functions with software (Cannan, 1986:49). The importance of embedded computers and embedded software has also increased over the years to the point where mission critical decisions have to be made instantaneously with little room for error. This situation is especially critical for aircraft which rely on embedded computers for the basic performance not only of their weapons and avionics, but of flight control as well. Since many aircraft in the Air Force inventory are designed to be aerodynamically unstable in order to maximize their performance, they cannot be operated without the use of "fly-by-wire" flight control systems. The criticality of these embedded computer systems has led to the establishment of rigid acquisition and development requirements.

However, in spite of the regulations currently in existence for the Engineering and Manufacturing Development (EMD) phase, it is increasingly difficult to develop and acquire software which meets the user requirements within the original estimates of both cost and schedule. This problem is so serious that software often ends up as the pacing factor in the development of major systems. In other cases, software developed during the earlier Demonstration and Validation (Dem/Val) phase does not meet the standards required of software developed in the EMD phase, resulting in other problems.

The system engineering process places little emphasis on software development until the EMD phase. For some major weapons systems such as aircraft, some software modules such as those required by the flight control system must be finished in order to complete the Demonstration and Validation phase. These prototyped software modules were produced without the restrictions placed on the software development process found in the EMD phase. This can lead to problems in the future since the contractor most likely eliminated many of these restrictions in order to minimize cost. It is also unlikely that the contractor will re-complete the work correctly, since the module has already proven to be successful. This lack of restrictions found in the Dem/Val phase can result in poorly documented code, causing it to be unmaintainable. Other problems, such as cost overruns and schedule delays, are similar to those encountered with software developed during the EMD phase.

The software development process is relatively new compared to the hardware development process and is not as mature. Procedures and standards are well established for hardware development, but are highly dynamic in the software area. The Software Engineering Institute (SEI) as well as AFMC are developing methods to evaluate contractors' processes for software development. Watts Humphrey, the main architect of the SEI's Capability Maturity Model (CMM), based the development of the CMM on Deming's principle that the product produced will only be as good as the process used to produce it (Humphrey, 1989:3). It is estimated that over 80% of all software organizations have an immature software development process. This lack of process has resulted in fragmentation and non-standardized development environments throughout the defense industry (Hall, 1991:614). This immaturity and fragmentation accounts for a portion of many of the other problems encountered in the software development process.

In the current government procurement standards, "The acquiring agency also has few review points or measurable milestones during the development process" (Simmons, 1990:665). In fact, the review points are not well distributed throughout development: "the major review points in the development phase occur before development is 50% complete, with almost no review milestones occurring until program completion" (Simmons, 1990:665). Even though most of the reviews are early in the program, "generally the most significant or difficult problems occur early in the development cycle but are not found until integration and test have begun" (Simmons, 1990:665). Many times these problems can be traced back to unfeasible or even nonexistent performance requirements, indicating improper allocation of system requirements as well as inadequate requirements analysis.

F-22 Flight Control System

The flight control system of the F-22 prototype in the Dem/Val phase was a triplex digital flight control system. However, since this system was not developed under the same rigid standards applying to the EMD contract of the F-22, as well as airframe modifications, a new quadruple redundant flight control system is being developed during the EMD phase of development. The process used to develop the flight control system for the F-22 is similar to that used on the B-2 and the F-16. The key difference is that management is focusing on the process used to develop the flight control system on the F-22 instead of the product itself. Automation of many of the procedures like regression testing is another item the F-22 team is accomplishing. The F-22 has also formalized the team relationship that was found in both the F-16 and B-2 flight control areas, but not necessarily in their avionics areas. The F-22 has what appears to be the best process to develop software for flight control systems of the four programs reviewed, and it is believed that this improved process will also provide the best product, but only time will tell.

Significant effort was put forth at the outset of the EMD contract to define a software process to be used across all organizations developing software. This process is documented in the Software Development Plan (SDP) in accordance with MIL-STD 2167A. The evolution of the SDP was well thought out by the program office as well as the prime contractor. At the outset of the EMD contract, the SPO conducted a capability/capacity review, in accordance with ASCP 800-5, of the prime contractor as well as all of the subcontractors that would be contributing a significant amount of software on the contract. This review was used to choose a process model and software development plan which is consistent with the composite maturity of all the software organizations working on the contract. The basic process chosen was the waterfall model with a common methodology for real-time applications known as the Ada Design and Requirements Transformation System (ADARTS). Obviously, Ada was the language chosen for all applications where it was practical to use (approximately 95% of the code is Ada).

The philosophy for maintaining development process discipline is simply to take no shortcuts. Hardware and software should be developed together with an integrated work breakdown structure and integrated master plans that define key system milestones. Thorough testing at each stage of development, strict configuration control, and adherence to the software development plan are key to maintaining overall process discipline. These lofty goals can be met only through close interface with the contractor. The F-22 SPO achieves this through Integrated Process Teams. This Government/contractor teaming arrangement ensures that everyone knows how the process works, and management responsibility is assumed by the co-team leaders. Probably the most effective means of ensuring software development process discipline is the calculation of the six-month award fee. The contractor's award fee is partially based on the program office's evaluation of how well the contractor is conforming to the SDP. This award fee arrangement is a significant incentive, since upper management will surely be concerned with the bottom line.

F-22 SPO personnel were surveyed in 1990 to identify what they felt were the Top 10 software-related problems for the aircraft. The only significant cost or schedule impacts identified are for the top two problems. Both of these problems could be the result of the rephasing effort, which has reduced the resources available to the program. Most likely, this reduction in resources has not changed Congress's schedule or cost expectations for the program.

1. Unrealistic program schedules/budgets/manpower estimates
2. Unrealistic user requirements/performance goals
3. Changes in user requirements
4. Requirements creep/gold plating
5. Excessive memory/throughput requirements
6. Shortfalls in hardware
7. Underestimate of time required for SW testing and debugging
8. Incorrect requirements/specifications
9. Underestimate of time required for SW analysis/design/coding
10. Lack of training/SW awareness of contractor management

Software Glitch Forced F-22 Back to Hawaii

Posted on 14 February 2007

HICKAM AIR FORCE BASE, Hawaii (AFNEWS) -- While en route to Kadena Air Base, Japan, Feb. 10, a software issue affecting the F-22 Raptor’s navigation system was discovered.

All aircraft, which departed Hickam AFB earlier that day, returned safely. F-22 engineers and maintainers are working to update the software. After successful testing, the aircraft will continue their planned first overseas deployment to Kadena. Officials expect the aircraft will depart Hickam AFB within the next several days.

"This is a minor issue, and, since our focus is always on safety, the aircraft will not depart until we are confident there are no further issues with the navigation system," said Lt. Gen. Loyd S. Utterback, the 13th Air Force commander.

The Air Force is deploying 12 F-22 Raptors and more than 250 members from the 27th Fighter Squadron at Langley Air Force Base, Va., to Kadena AB as part of a regularly scheduled U.S. Pacific Command rotation of aircraft to the Pacific.

Jets can now cross Pacific, Far East safe for democracy again

By Lewis Page - Published Wednesday 28th February 2007 10:59 GMT

Significant new capabilities have been added to the US Air Force's latest superfighter, the F-22 "Raptor". The USAF's Raptors cost more than $300m each, and are generally thought to be the most advanced combat jets in service worldwide. However, until recently they were unable to cross the international date line owing to a software bug in their navigation systems.

A group of F-22s heading across the Pacific for exercises in Japan earlier this month suffered simultaneous total nav-console crashes as their longitude shifted from 180 degrees West to 180 East.

Luckily, the superjets were accompanied by tanker planes, whose navigation kit was somewhat less bleeding-edge and remained functional. The tanker drivers were able to guide the lost top-guns back to Hawaii and the exercises were postponed.

"Every time we fly this jet we learn something new," Raptor squadron commanding officer Lt-Col Wade Tolliver said.

But enemies of democracy who may have been planning an opportunistic attack on Hawaii followed by a retreat to safety across the date line shouldn't get their hopes up. The software bug has been rectified, and the Raptors have now successfully traveled to Kadena Air Base in Japan, where air-combat exercises are now well underway.

"This is history in the making," said Brigadier Punch Moulton, commanding the Kadena-based 18th Wing. The deployment is expected to last more than three months. ®
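Neither article states what the navigation software actually did wrong, so the following Python sketch is not a reconstruction of the F-22 defect. It only illustrates, under that stated assumption, why the jump in longitude at the date line is a well-known edge case: a naive subtraction of two longitudes produces a value near 360 degrees when the true change is about one degree. The function names are hypothetical.

```python
def wrap_longitude(deg: float) -> float:
    """Map any longitude in degrees onto the half-open interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def longitude_delta(from_deg: float, to_deg: float) -> float:
    """Shortest signed change in longitude, valid across the date line."""
    return wrap_longitude(to_deg - from_deg)


# West longitudes are negative, east longitudes are positive.
# Flying west from 179.5 W (-179.5) to 179.5 E (+179.5) crosses the date line.
naive = 179.5 - (-179.5)                # 359.0, which looks like a huge jump
safe = longitude_delta(-179.5, 179.5)   # -1.0, one degree westward

print("naive difference:", naive)
print("wrapped difference:", safe)
```

A test case that flies a simulated track across 180 degrees in both directions is the kind of boundary check that regression testing can run automatically on every software build.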

The loss of Space Shuttle COLUMBIA

COLUMBIA was launched from Kennedy Space Center, Florida on January 16, 2003 on the 113th shuttle mission (STS-107), a scientific research mission lasting 16 days. This was COLUMBIA’s 28th flight. On the morning of February 1, as COLUMBIA was entering the earth’s atmosphere, the Orbiter burned up. The physical cause of the loss of COLUMBIA and its crew was a breach in the Thermal Protection System on the leading edge of the left wing, caused by a piece of insulating foam which separated from the left bipod ramp section of the External Tank at 81.7 seconds after launch, and struck the wing in the vicinity of the lower half of Reinforced Carbon-Carbon panel number 8. During re-entry this breach in the Thermal Protection System allowed superheated air to penetrate through the leading edge insulation and progressively melt the aluminum structure of the left wing, resulting in a weakening of the structure until increasing aerodynamic forces caused loss of control, failure of the wing, and break-up of the Orbiter. This breakup occurred in a flight regime in which, given the current design of the Orbiter, there was no possibility for the crew to survive.

Solid Rocket Booster Bolt Catchers

The fault tree review brought to light a significant problem with the Solid Rocket Booster bolt catchers. Each Solid Rocket Booster is connected to the External Tank by four separation bolts: three at the bottom plus a larger one at the top that weighs approximately 65 pounds. These larger upper (or “forward”) separation bolts (one on each Solid Rocket Booster) and their associated bolt catchers on the External Tank provoked a great deal of Board scrutiny.

About two minutes after launch, the firing of pyrotechnic charges breaks each forward separation bolt into two pieces, allowing the spent Solid Rocket Boosters to separate from the External Tank (see Figure 4.2-1). Two “bolt catchers” on the External Tank each trap the upper half of a fired separation bolt, while the lower half stays attached to the Solid Rocket Booster. As a result, both halves are kept from flying free of the assembly and potentially hitting the Orbiter. Bolt catchers have a domed aluminum cover containing an aluminum honeycomb matrix that absorbs the fired bolt’s energy. The two upper bolt halves and their respective catchers subsequently remain connected to the External Tank, which burns up on re-entry, while the lower halves stay with the Solid Rocket Boosters that are recovered from the ocean.

If one of the bolt catchers failed during STS-107, the resulting debris could have damaged COLUMBIA’s wing leading edge. Concerns that the bolt catchers may have failed, causing metal debris to ricochet toward the Orbiter, arose because the configuration of the bolt catchers used on Shuttle missions differs in important ways from the design used in initial qualification tests. First, the attachments that currently hold bolt catchers in place use bolts threaded into inserts rather than through-bolts. Second, the test design included neither the Super Lightweight Ablative material applied to the bolt catcher apparatus for thermal protection, nor the aluminum honeycomb configuration currently used. Also, during these initial tests, temperature and pressure readings for the bolt firings were not recorded. Instead of conducting additional tests to correct for these discrepancies, NASA engineers qualified the flight design configuration using a process called “analysis and similarity.”

The flight configuration was validated using extrapolated test data and redesign specifications rather than direct testing. This means that NASA’s rationale for considering bolt catchers to be safe for flight is based on limited data from testing 24 years ago on a model that differs significantly from the current design.

Due to these testing deficiencies, the Board recognized that bolt catchers could have played a role in damaging COLUMBIA’s left wing. The aluminum dome could have failed catastrophically, ablative coating could have come off in large quantities, or the device could have failed to hold to its mount point on the External Tank. To determine whether bolt catchers should be eliminated as a source of debris, investigators conducted tests to establish a performance baseline for bolt catchers in their current configuration and also reviewed radar data to see whether bolt catcher failure could be observed. The results had serious implications: Every bolt catcher tested failed well below the expected load range of 68,000 pounds. In one test, a bolt catcher failed at 44,000 pounds, which was about four percent below the 46,000 pounds generated by a fired separation bolt. This means that the force at which a separation bolt is predicted to come apart during flight could exceed the bolt catcher’s ability to safely capture the bolt. If these results are consistent with further tests, the factor of safety for the bolt catcher system would be 0.956, far below the design requirement of 1.4 (that is, able to withstand 1.4 times the maximum load ever expected in operation).
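The factor-of-safety arithmetic above can be checked with a short calculation. The following Python sketch is illustrative only: the load figures are taken from the text, while the function and variable names are hypothetical and are not drawn from the Board's report.

```python
def factor_of_safety(failure_load_lb: float, applied_load_lb: float) -> float:
    """Ratio of the load at which a part failed to the load it must carry."""
    if applied_load_lb <= 0:
        raise ValueError("applied load must be positive")
    return failure_load_lb / applied_load_lb


REQUIRED_FACTOR = 1.4        # design requirement quoted in the text
BOLT_LOAD_LB = 46_000        # load generated by a fired separation bolt
LOWEST_FAILURE_LB = 44_000   # failure load observed in one test

observed = factor_of_safety(LOWEST_FAILURE_LB, BOLT_LOAD_LB)
shortfall_pct = (1 - observed) * 100            # how far below the bolt load
required_capacity_lb = REQUIRED_FACTOR * BOLT_LOAD_LB

print(f"observed factor of safety: {observed:.3f}")
print(f"shortfall versus bolt load: {shortfall_pct:.1f}%")
print(f"capacity needed for a 1.4 factor: {required_capacity_lb:,.0f} lb")
print("meets requirement:", observed >= REQUIRED_FACTOR)
```

Dividing 44,000 by 46,000 gives roughly 0.957 (the 0.956 in the text is the same ratio truncated), so this catcher failed about 4.3 percent below the bolt load. To satisfy a 1.4 factor against a 46,000-pound load, a catcher would need to hold about 64,400 pounds.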

Every bolt catcher must be inspected (via X-ray) as a final step in the manufacturing process to ensure specification compliance. There are specific requirements for film type/quality to allow sufficient visibility of weld quality (where the dome is mated to the mounting flange) and reveal any flaws. There is also a requirement to archive the film for several years after the hardware has been used. The manufacturer is required to evaluate the film, and a Defense Contract Management Agency representative certifies that requirements have been met. The substandard performance of the Summa bolt catchers tested by NASA at Marshall Space Flight Center and subsequent investigation revealed that the contractor’s use of film failed to meet quality requirements and, because of this questionable quality, there were “probable” weld defects in most of the archived film. Film of STS-107’s bolt catchers (serial numbers 1 and 19, both Summa-manufactured) was also determined to be substandard with “probable” weld defects (cracks, porosity, lack of penetration) on number 1 (left Solid Rocket Booster to External Tank attach point). Number 19 appeared adequate, though the substandard film quality leaves some doubt.

Further investigation revealed that a lack of qualified non-destructive inspection technicians and differing interpretations of inspection requirements contributed to this oversight. United Space Alliance, NASA’s agent in procuring bolt catchers, exercises limited process oversight and delegates actual contract compliance verification to the Defense Contract Management Agency. The Defense Contract Management Agency interpreted its responsibility as limited to certifying compliance with the requirement for X-ray inspections. Since neither the Defense Contract Management Agency nor United Space Alliance had a resident non-destructive inspection specialist, they could not read the X-ray film or certify the weld. Consequently, the required inspections of weld quality and end-item certification were not properly performed. Inadequate oversight and confusion over the requirement on the parts of NASA, United Space Alliance, and the Defense Contract Management Agency all contributed to this problem.

In addition, STS-107 radar data from the U.S. Air Force Eastern Range tracking system identified an object with a radar cross-section consistent with a bolt catcher departing the Shuttle stack at the time of Solid Rocket Booster separation. The resolution of the radar return was not sufficient to definitively identify the object. However, an object that has about the same radar signature as a bolt catcher was seen on at least five other Shuttle missions. Debris shedding during Solid Rocket Booster separation is not an unusual event. However, the size of this object indicated that it could be a potential threat if it came close to the Orbiter after coming off the stack.

Although bolt catchers can be neither definitively excluded nor included as a potential cause of left wing damage to COLUMBIA, the impact of such a large object would likely have registered on the Shuttle stack’s sensors. The indefinite data at the time of Solid Rocket Booster separation, in tandem with overwhelming evidence related to the foam debris strike, leads the Board to conclude that bolt catchers are unlikely to have been involved in the accident.

Findings:

F4.2-1 The certification of the bolt catchers flown on STS-107 was accomplished by extrapolating analysis done on similar but not identical bolt catchers in original testing. No testing of flight hardware was performed.

F4.2-2 Board-directed testing of a small sample size demonstrated that the “as-flown” bolt catchers do not have the required 1.4 margin of safety.

F4.2-3 Quality assurance processes for bolt catchers (a Criticality 1 subsystem) were not adequate to assure contract compliance or product adequacy.

F4.2-4 An unknown metal object was seen separating from the stack during Solid Rocket Booster separation during six Space Shuttle missions. These objects were not identified, but were characterized as of little to no concern.

Recommendations:

R4.2-1 Test and qualify the flight hardware bolt catchers.
