Analyzing Past Trends to Predict Future ACQUISITION OUTCOMES

July 2017 Vol. 24 No. 3 | ISSUE 82 Analyzing Cost Growth at Program Stages for DoD Aircraft Capt Scott J. Kozlak, USAF, Edward D. White, Jonathan D. Ritschel, Lt Col Brandon Lucas, USAF, and Michael J. Seibel

Estimating Firm-Anticipated Defense Acquisition Costs with a Value-Maximizing Framework LTJG Sean Lavelle, USN

Informing Policy through Quantification of the Intellectual Property Lock-in Associated with DoD Acquisition Maj Christopher Berardi, USAF, Bruce Cameron, and Ed Crawley

The Impact of a Big Data Decision Support Tool on Military Logistics: Medical Analytics Meets the Mission Felix K. Chang, Christopher J. Dente, and CAPT Eric A. Elster, USN

Online-only Article Beyond Integration Readiness Level (IRL): A Multidimensional Framework to Facilitate the Integration of System of Systems Maj Clarence Eder, USAF (Ret.), Thomas A. Mazzuchi, and Shahram Sarkani

Online-only Article Effectiveness Test and Evaluation of Non-lethal Weapons in Crowd Scenarios: Metrics, Measures, and Design of Experiments Elizabeth Mezzacappa, Gordon Cooke, Robert M. DeMarco, Gladstone V. Reid, Kevin Tevis, Charles Sheridan, Kenneth R. Short, Nasir Jaffery, and John B. Riedener Article List

The Defense Acquisition Professional Reading List Destructive Creation: American Business and ARJ Extra the Winning of World War II Written by Mark R. Wilson and reviewed by Benjamin Franklin Cooling

Mr. James A. MacStravic Performing the duties of Under Secretary of Defense for Acquisition, Technology, and Logistics Mr. James P. Woolsey President, Defense Acquisition University

Editorial Board Dr. Larrie D. Ferreiro Chairman and Executive Editor

Mr. Richard Altieri Dr. J. Ronald Fox Dr. Yvette Rodriguez Dwight D. Eisenhower School for Harvard Business School Defense Acquisition University NationalSecurity and Resource Strategy Mr. David Gallop Dr. Richard Shipe Dr. Michelle Bailey Defense Acquisition University Dwight D. Eisenhower School for Defense Acquisition University National Security and Resource Strategy Dr. Jacques Gansler Dr. Don Birchler University of Maryland Dr. Keith Snider Center for Naval Analyses Corporation Naval Postgraduate School RADM James Greene, USN Mr. Kevin Buck (Ret.) Dr. John Snoderly The MITRE Corporation Naval Postgraduate School Defense Acquisition University

Mr. John Cannaday Dr. Mike Kotzian Ms. Dana Stewart Defense Acquisition University Defense Acquisition University Defense Acquisition University

Dr. John M. Colombi Dr. Craig Lush Dr. David M. Tate Air Force Institute of Technology Defense Acquisition University Institute for Defense Analyses

Dr. Richard Donnelly Dr. Troy J. Mueller Dr. Trevor Taylor The George Washington University The MITRE Corporation Cranfield University (UK)

Dr. William T. Eliason Dr. Christopher G. Pernin Mr. Jerry Vandewiele Dwight D. Eisenhower School for RAND Corporation Defense Acquisition University National Security and Resource Strategy Dr. Mary C. Redshaw Dwight D. Eisenhower School for National Security and Resource Strategy

ISSN 2156-8391 (print) ISSN 2156-8405 (online) DOI: https://dx.doi.org/10.22594/dau.072017-82.24.03 The Defense Acquisition Research Journal, formerly the Defense Acquisition Review Journal, is published quarterly by the Defense Acquisition University (DAU) Press and is an official publication of the Department of Defense. Postage is paid at the U.S. Postal facility, Fort Belvoir, VA, and at additional U.S. Postal facilities. Postmaster, send address changes to: Editor, Defense Acquisition Research Journal, DAU Press, 9820 Belvoir Road, Suite 3, Fort Belvoir, VA 22060-5565. The journal-level DOI is: https://dx.doi.org/10.22594/ dauARJ.issn.2156-8391. Some photos appearing in this publication may be digitally enhanced.

Articles represent the views of the authors and do not necessarily reflect the opinion of DAU or the Department of Defense. Director, Visual Arts & Press Randy Weekes

Managing Editor Deputy Director, Norene L. Taylor Visual Arts & Press

Assistant Editor Emily Beliles

Production Manager, Frances Battle Visual Arts & Press

Lead Graphic Designer Michael Krukowski

Graphic Designer, Digital Nina Austin Publications

Technical Editor Collie J. Johnson

Associate Editor Michael Shoemaker

Copy Editor/Circulation Debbie Gonzalez Manager

Multimedia Assistant Noelia Gamboa

The C3 Group & Editing, Design, and Layout Schatz Publishing Group A Publication of the Defense Acquisition University July 2017 Vol. 24 No. 3 ISSUE 82 CONTENTS | Featured Research

p. 386 Analyzing Cost Growth at Program Stages for DoD Aircraft Capt Scott J. Kozlak, USAF, Edward D. White, Jonathan D. Ritschel, Lt Col Brandon Lucas, USAF, and Michael J. Seibel

This research examines cost growth factors from Milestone B to various program stages for 30 Department of Defense aircraft programs.

p. 408 Estimating Firm-Anticipated Defense Acquisition Costs with a Value-Maximizing Framework LTJG Sean Lavelle, USN

The author uses a value-maximizing framework to predict how firms will bid under varying levels of risk sharing, allowing the government to estimate future costs more accurately. Featured Research

p. 432 Informing Policy through Quantification of the Intellectual Property Lock-in Associated with DoD Acquisition Maj Christopher Berardi, USAF, Bruce Cameron, and Ed Crawley

This article introduces a quantitative analysis of intellectual property lock-in trends in DoD acqui- sition. The analysis ultimately quantifies the magnitude of the problem, illustrates trends across 8 fiscal years, and correlates lock-in to internal research and development funding.

p. 468 The Impact of a Big Data Decision Support Tool on Military Logistics: Medical Analytics Meets the Mission Felix K. Chang, Christopher J. Dente, and CAPT Eric A. Elster, USN

This study uses a combat simulation to demon- strate how military organizations not directly involved in logistics can use decision support tools to streamline their logistics operations. It further quantifies the benefits that one such tool designed for the military medical community might generate. A Publication of the Defense Acquisition University July 2017 Vol. 24 No. 3 ISSUE 82 CONTENTS | Featured Research

Beyond Integration Readiness Level (IRL): A Multidimensional Framework to Facilitate the Integration of System of Systems Maj Clarence Eder, USAF (Ret.), Thomas A. Mazzuchi, and Shahram Sarkani

Data analyses of research aimed at understanding major integration issues of DoD Space Systems has been validated by experts, resulting in development of an integration assessment framework to help assess Integration Readiness Level of System of Systems.

http://www.dau.mil/library/arj

Effectiveness Test and Evaluation of Non-lethal Weapons in Crowd Scenarios: Metrics, Measures, and Design of Experiments Elizabeth Mezzacappa, Gordon Cooke, Robert M. DeMarco, Gladstone V. Reid, Kevin Tevis, Charles Sheridan, Kenneth R. Short, Nasir Jaffery, and John B. Riedener

This article describes methods for quantitative metrics and analyses for test and evaluation of non-lethal weapons. Results from human testing are also presented.

http://www.dau.mil/library/arj Featured Research CONTENTS | Featured Research p. viii 2018 Edward Hirsch Aqcuisition and Writing Award Competition p. x From the Chairman and Executive Editor p. xii Research Agenda 2017–2018 p. xvii DAU Alumni Association p. 574 Professional Reading List

Destructive Creation: American Business and the Winning of World War II Written by Mark R. Wilson and reviewed by Benjamin Franklin Cooling p. 578 New Research in Defense Acquisition

A selection of new research curated by the DAU Research Center and the Knowledge Repository. p. 586 Defense ARJ Guidelines for Contributors

The Defense Acquisition Research Journal (ARJ) is a scholarly peer-reviewed journal published by the Defense Acquisition University. All submissions receive a blind review to ensure impartial evaluation. p. 594 Call for Authors

We are currently soliciting articles and subject matter experts for the 2017–2018 Defense ARJ print years. p. 596 Defense Acquisition University Website

Your online access to acquisition research, consulting, information, and course offerings.

DAU ALUMNI ASSOCIATION 2018 EDWARD HIRSCH ACQUISITION AND WRITING AWARD COMPETITION

CALL FOR PAPERS Research topics may include: trolling Costs Throughout the • Improving Professionalism of the Product Life Cycle Total Acquisition Workforce • Acquisition of Defense • Career Path and Incentives Business Systems • Agile Program • Emerging changes to EVM • Agile Software Development • System Cyber Hardness • Incorporating Foreign Military • Cyber Training and Concepts Sales and Direct Contractor Sales • Services Management Strategies into ProgramsCon- • Should Cost Management

GROUND RULES • The competition is open to anyone • The format of the paper must be interested in the DoD acquisition sys- in accordance with guidelines for tem and is not limited to government articles submitted for publication in or contractor personnel. the Defense Acquisition Research Journal. • Employees of the federal government (including military personnel) are en- • Papers are to be submitted to couraged to compete and are eligible the DAU Director of Research: for cash awards unless the paper was [email protected]. researched or written as part of the employee’s official duties or was done • Papers will be evaluated by a pan- on government time. If the research el selected by the DAUAA Board of effort is performed as part of official Directors and the DAU Director of duties or on government time, the Research. employee is eligible for a non-cash prize, i.e., certificate and donation • Award winners will present their of cash prize to a Combined Federal papers at the DAU Acquisition Campaign-registered charity of win- Community Training Symposium, ner’s choice. Tuesday, April 3, 2018, at the DAU Fort Belvoir campus. • First prize is $1,000. Second and third prizes, if awarded, are each $500. • Papers must be submitted by De- cember 15, 2017, and awards will be announced in January 2018. A Publication of the Defense Acquisition University http://www.dau.mil

FROM THE CHAIRMAN AND EXECUTIVE EDITOR Dr. Larrie D. Ferreiro

The theme for this edition of Defense Acquisition Research Journal is “Using Past Trends to Predict Future Acquisition Outcomes.” The first article is “Analyzing Cost Growth at Program Stages for DoD Aircraft” by Scott J. Kozlak et al. They analyzed 30 military aircraft programs to determine when cost growth occurred during the acquisition and development cycle, and developed some unique and use- ful insights that future programs can use for prediction. The next article by Sean Lavelle, “Estimating Firm- Anticipated Defense Acquisition Costs with a Value-Maximizing Framework,” uses a value-maximizing framework to predict how firms will bid under varying levels of risk sharing, allowing the gov- ernment to estimate future costs more accurately.

Following this is “Informing Policy through Quantification of the Intellectual Property Lock-in Associated with DoD Acquisition,” by Christopher Berardi, Bruce Cameron, and Ed Crawley, which quantitatively analyzes intellectual property lock-in trends in DoD acquisition and their correlation to internal research and devel- opment funding. Then, Felix K. Chang, Christopher J. Dente, and Eric A. Elster, in “The Impact of a Big Data Decision Support Tool on Military Logistics: Medical Analytics Meets the Mission,” describe a combat simulation tool that showed how to reduce the logistical footprint for blood resupply in a military theatre of operations. xii July 2017

This issue has two online-only papers. First, “Beyond Integration Readiness Level (IRL): A Multidimensional Framework to Facilitate the Integration of System of Systems” by Clarence Eder, Thomas A. Mazzuchi, and Shahram Sarkani, looks at expand- ing the current acquisition practice of characterizing systems by their Technology Readiness Level (TRLs) by using the concept of Integration Readiness Level (IRLs) to address growing inte- gration challenges of System-of-Systems acquisition programs. The second, by Elizabeth Mezzacappa and her co-authors, is titled “Effectiveness Testing and Evaluation of Non-lethal Weapons in Crowd Scenarios: Metrics, Measures, and Design of Experiments.” As the name implies, it discusses test and evaluation methods for benchmarking and comparison of non-lethal weapons intended for use in crowd management situations, the results of which can be used for Analysis of Alternatives and trade-space studies.

The featured book in this issue’s Defense Acquisition Professional Reading List is Destructive Creation: American Business and the Winning of World War II by Mark R. Wilson, reviewed by Dr. Benjamin Franklin Cooling of the National Defense University.

Finally, there are several changes to the Defense ARJ masthead. Sharp-eyed readers will have noticed over the past year that the Research Advisory Board, which had been established to review and provide direction for the research agenda and publications, had been steadily diminishing in size as many of the Board mem- bers departed their senior-level positions during the last year of the previous administration. At the same time, the responsibilities for providing direction to defense acquisition research has been increasingly borne by the Editorial Board, whose makeup is now two-thirds non-Defense Acquisition University members (including international representation). This Editorial Board arrangement now brings the same type of outside experience and perspectives as did the Research Advisory Board. This has led us to disestablish the Research Advisory Board, with its former functions now subsumed by the Editorial Board.

Dr. Mary Redshaw, the last “surviving” member of the Research Advisory Board, has now joined the Editorial Board. Dr. Yvette Rodriguez has also joined, while Dr. Andre Murphy has departed. We thank the former members for their service, and welcome the new ones to continue the strong tradition of advancing the state of knowledge in the defense acquisition community.

xiii A Publication of the Defense Acquisition University http://www.dau.mil

DAU CENTER FOR DEFENSE ACQUISITION RESEARCH AGENDA 2017–2018

This Research Agenda is intended to make researchers aware of the topics that are, or should be, of particular concern to the broader defense acquisition community within the federal government, academia, and defense industrial sectors. The center compiles the agenda annually, using inputs from subject matter experts across those sectors. Topics are periodically vetted and updated by the DAU Center’s Research Advisory Board to ensure they address current areas of strategic interest.

The purpose of conducting research in these areas is to provide solid, empirically based findings to create a broad body of knowl- edge that can inform the development of policies, procedures, and processes in defense acquisition, and to help shape the thought lead- ership for the acquisition community. Most of these research topics were selected to support the DoD’s Better Buying Power Initiative (see http://bbp.dau.mil). Some questions may cross topics and thus appear in multiple research areas.

Potential researchers are encouraged to contact the DAU Director of Research ([email protected]) to suggest additional research questions and topics. They are also encouraged to contact the listed Points of Contact (POC), who may be able to provide general guidance as to current areas of interest, potential sources of infor- mation, etc. xiv July 2017

Competition POCs • John Cannaday, DAU: [email protected]

• Salvatore Cianci, DAU: [email protected]

• Frank Kenlon (global market outreach), DAU: frank. [email protected]

Measuring the Effects of Competition • What means are there (or can be developed) to measure the effect on defense acquisition costs of maintaining the defense industrial base in various sectors?

• What means are there (or can be developed) of mea- suring the effect of utilizing defense industrial infrastructure for commercial manufacture, and in particular, in growth industries? In other words, can we measure the effect of using defense manufacturing to expand the buyer base?

• What means are there (or can be developed) to deter- mine the degree of openness that exists in competitive awards?

• What are the different effects of the two best value source selection processes (trade-off vs. lowest price technically acceptable) on program cost, schedule, and performance?

Strategic Competition • Is there evidence that competition between system portfolios is an effective means of controlling price and costs?

• Does lack of competition automatically mean higher prices? For example, is there evidence that sole source can result in lower overall administrative costs at both the government and industry levels, to the effect of lowering total costs?

• What are the long-term historical trends for compe- tition guidance and practice in defense acquisition policies and practices?

xv A Publication of the Defense Acquisition University http://www.dau.mil

• To what extent are contracts being awarded non- competitively by congressional mandate for policy interest reasons? What is the effect on contract price and performance?

• What means are there (or can be developed) to deter- mine the degree to which competitive program costs are negatively affected by laws and regulations such as the Berry Amendment, Buy American Act, etc.?

• The DoD should have enormous buying power and the ability to influence supplier prices. Is this the case? Examine the potential change in cost performance due to greater centralization of buying organizations or strategies.

Effects of Industrial Base • What are the effects on program cost, schedule, and performance of having more or fewer competitors? What measures are there to determine these effects?

• What means are there (or can be developed) to measure the breadth and depth of the industrial base in various sectors that go beyond simple head-count of providers?

• Has change in the defense industrial base resulted in actual change in output? How is that measured?

Competitive Contracting • Commercial industry often cultivates long-term, exclu- sive (noncompetitive) supply chain relationships. Does this model have any application to defense acquisition? Under what conditions/circumstances?

• What is the effect on program cost, schedule, and performance of awards based on varying levels of competition: (a) “Effective” competition (two or more offers); (b) “Ineffective” competition (only one offer received in response to competitive solicitation); (c) split awards vs. winner take all; and (d) sole source.

xvi July 2017

Improve DoD Outreach for Technology and Products from Global Markets • How have militaries in the past benefited from global technology development?

• How/why have militaries missed the largest techno- logical advances?

• What are the key areas that require the DoD’s focus and attention in the coming years to maintain or enhance the technological advantage of its weapon systems and equipment?

• What types of efforts should the DoD consider pursu- ing to increase the breadth and depth of technology push efforts in DoD acquisition programs?

• How effectively are the DoD’s global science and tech- nology investments transitioned into DoD acquisition programs?

• Are the DoD’s applied research and development (i.e., acquisition program) investments effectively pursuing and using sources of global technology to affordably meet current and future DoD acquisition program requirements? If not, what steps could the DoD take to improve its performance in these two areas?

• What are the strengths and weaknesses of the DoD’s global defense technology investment approach as compared to the approaches used by other nations?

• What are the strengths and weaknesses of the DoD’s global defense technology investment approach as compared to the approaches used by the private sector—both domestic and foreign entities (compa- nies, universities, private-public partnerships, think tanks, etc.)?

• How does the DoD currently assess the relative benefits and risks associated with global versus U.S. sourcing of key technologies used in DoD acquisition programs? How could the DoD improve its policies and procedures in this area to enhance the benefits of global technology sourcing while minimizing potential risks?

xvii A Publication of the Defense Acquisition University http://www.dau.mil

• How could current DoD/U.S. Technology Security and Foreign Disclosure (TSFD) decision-making policies and processes be improved to help the DoD better bal- ance the benefits and risks associated with potential global sourcing of key technologies used in current and future DoD acquisition programs?

• How do DoD primes and key subcontractors currently assess the relative benefits and risks associated with global versus U.S. sourcing of key technologies used in DoD acquisition programs? How could they improve their contractor policies and procedures in this area to enhance the benefits of global technology sourcing while minimizing potential risks?

• How could current U.S. Export Control System deci- sion-making policies and processes be improved to help the DoD better balance the benefits and risks associated with potential global sourcing of key tech- nologies used in current and future DoD acquisition programs?

Comparative Studies • Compare the industrial policies of military acquisition in different nations and the policy impacts on acquisi- tion outcomes.

• Compare the cost and contract performance of highly regulated public utilities with nonregulated “natu- ral monopolies,” e.g., military satellites, warship building, etc.

• Compare contracting/competition practices between the DoD and complex, custom-built commercial prod- ucts (e.g., offshore oil platforms).

• Compare program cost performance in various market sectors: highly competitive (multiple offerors), limited (two or three offerors), monopoly?

• Compare the cost and contract performance of mil- itary acquisition programs in nations having single “purple” acquisition organizations with those having Service-level acquisition agencies. xviii DAU ALUMNI ASSOCIATION Join the Success Network! The DAU Alumni Association opens the door to a worldwide network of Defense Acquisition University graduates, faculty, staff members, and defense industry representatives—all ready to share their expertise with you and benefit from yours. Be part of a two-way exchange of information with other acquisition professionals. • Stay connected to DAU and link to other professional organizations. • Keep up to date on evolving defense acquisition policies and developments through DAUAA newsletters and the DAUAA LinkedIn Group. • Attend the DAU Annual Acquisition Training Symposium and bi-monthly hot topic training forums—both supported by the DAUAA—and earn Continuous Learning Points toward DoD continuing education requirements.

Membership is open to all DAU graduates, faculty, staff, and defense industry members. It’s easy to join right from the DAUAA Website at www.dauaa.org, or scan the following QR code:

For more information call 703-960-6802 or 800-755-8805, or e-mail [email protected]. ISSUE 82 JULY 2017 VOL. 24 NO. 3 We’re on the Web at: http://www.dau.mil/library/arj 385 Analyzing Cost Growth at PROGRAM STAGES FOR DOD AIRCRAFT

Capt Scott J. Kozlak, USAF, Edward D. White, Jonathan D. Ritschel, Lt Col Brandon Lucas, USAF, and Michael J. Seibel

This research examines Cost Growth Factors (CGF) at various program stages for 30 Department of Defense aircraft programs. From Milestone (MS) B, the authors determine CGFs at the Critical Design Review (CDR), First Flight (FF), Development Test and Evaluation End, Initial Operational Capability (IOC), and Full Operational Capability. They find development CGFs are significantly larger than CGFs. Additionally, cost growth primarily occurs early in the program. At CDR, which occurs on average at the 12 percent completion point of a program, aircraft programs had already experienced on average 15 percent of their total program cost growth. The first spike in percent of total cost growth occurs at FF, 35 months or ~3 years after MS B. Lastly, the analysis shows that by IOC (approximately 6.5 years after MS B or 48 percent of program completion), an aircraft program realizes 91 percent of its total cost growth.

DOI: https://doi.org/10.22594/dau.16-763.24.03 Keywords: Cost Growth, Development, Procurement, Analysis, Program Costs

 Image designed by Michael Krukowski Analyzing Cost Growth at Program Stages http://www.dau.mil

Cost growth in weapon systems creates challenges for the Department of Defense (DoD) and often forces difficult decisions regarding acquisition funding. [As noted by Cancian (2010), cost professionals sometimes make a distinction between cost growth and cost overrun; cost growth is more gen- eral and is the term used here.] Such choices could include removing funding from smaller programs, postponing program development, or eliminating programs altogether. To better understand cost growth, the cost community needs to know when cost growth is most likely to occur. This knowledge might allow for better planning of advanced contingencies when a program will need to draw additional funds from another source, more robust baseline estimates to reflect a truly more realistic depiction of the weapon system’s future, or proactive measures in place to mitigate the anticipated spike in cost growth for a program. Supporting this need for knowledge, previous research finds that government decisions account for more than two-thirds of the cost growth experienced in major defense programs (Bolten, Leonard, Arena, Younossi, & Sollinger, 2008). With a total portfolio consisting of 78 programs totaling $1.4 trillion (in Fiscal Year 2015 dollars), even small cost growth percentages have large dollar impacts (Government Accountability Office, 2015). Thus, insight into when cost growth occurs provides valuable data for better informed government decisions that hopefully result in reduced future program costs.

All Major Defense Acquisition Programs (MDAP) are required to pass specific reviews and milestones. Five common stages all major aircraft pro- grams proceed through are: Critical Design Review (CDR), First Flight (FF), Development Test & Evaluation End (DTE), Initial Operational Capability (IOC), and Full Operational Capability (FOC).

388 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

Additionally, the Defense Acquisition System has formal milestones (MS A, MS B, and MS C) that all MDAPs must complete. Although previ- ous research has extensively studied cost growth in MDAPs, no one to our knowledge has analyzed cost growth utilizing program stages as bench- marks. Thus, we assess program cost growth from MS B (official program initiation) to CDR, FF, DTE, IOC, and FOC for the MDAP’s development, procurement, and total acquisition phases. The analysis is limited to DoD aircraft systems as reported in Selected Acquisition Reports (SAR, 2015).

Background Since 1969, MDAPs are required to annually submit SARs to Congress (Drezner, Jarvaise, Hess, Hough, & Norton, 1993; Porter et al., 2009). SARs outline a weapon system’s status and report current funding estimates as well as actual expenses incurred. Our research utilizes SAR data to evalu- ate program cost estimates and actual costs incurred. Three cost estimates exist within SARs: Planning Estimate (PE), Development Estimate (DE), and Current Estimate (CE) (Calcutt, 1993). PEs are the DoD estimate made during the Concept Exploration and Definition stage (current DoD Instruction 5000.02 now refers to this as Materiel Solution Analysis and Technology Maturation and Risk Reduction phases) of the program life cycle. DEs occur at MS B or the start of and Manufacturing Development phase of the program life cycle. If a program is complete, the final CE is the actual cost of the program (Calcutt, 1993).

Analysts calculate cost growth from a baseline estimate: the PE, DE, or CE. Typically, the DE at MS B is the baseline estimate for cost growth because an MS B decision results in formal program initiation. As formal cost reports materialize, cost growth becomes easier to track, and it is for this reason the estimator measures cost growth from the DE when possible. Our research investigates cost growth as defined as the increase in cost from the DE to the CE or final estimate of the DoD program

Researchers primarily use two common methods for calculating cost growth. The first method (Equation 1) calculates cost growth as a per- centage of the original cost estimate. In the first method, the estimated cost is subtracted from the actual cost and then divided by the estimated cost (McNichols & McKinney, 1981). The second method (Equation 2) calculates cost growth as a cost growth factor (CGF). The CGF method divides the cost variance (actual) by the estimate (Arena, Leonard, Murray, & Younossi, 2006). A CGF of 1.0 indicates the program did not go over or under the cost estimate, and the actual cost matched the estimated cost.

Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 389 Analyzing Cost Growth at Program Stages http://www.dau.mil

If the CGF is greater than 1.0, the program sustained growth, calculated as the CGF – 1.0 to determine the percent cost growth. Conversely, if the CGF is less than 1.0, the program did not sustain cost growth; rather, the program cost less than the estimate. For this article, we use CGF (2) to assess cost growth for DoD aircraft programs.

(Actual—Estimated) (1) Estimated

Actual (2) Estimated

Previous researchers on DoD MDAP cost growth include Drezner et al. (1993), Christensen (1994), and Arena et al. (2006). Drezner et al. (1993) studied 128 weapon systems utilizing SAR DEs as a baseline. Their research studied CGFs of weapon systems during development, procurement, and total program duration. After accounting for inflation and quantity, they determined individual weapon system cost growth increases on average 2.2 percent per year or about 20 percent through the life of a program. Drezner et al. (1993) also discovered development CGFs were 7 percent greater than procurement CGFs.

Christensen (1994) used (EVM) data to deter- mine the difference between the original budgeted amount and the estimate at completion. Using EVM in his analysis of cost overrun in DoD weapon systems, Christensen states that cost overruns begin appearing at the 10 percent program completion point. Examining aircraft specific pro- grams, he also discovered that approximately 75 percent of cost overrun occurs by the 50 percent program completion point.

Lastly, the research of Arena et al. (2006) provides information on CGFs for 68 completed programs with similar complexities to programs acquired by the U.S. Air Force. They defined completed weapon systems as systems that have greater than 90 percent production complete. Using SAR reports, Arena et al. divided the data into funding categories, milestones, and commodity type to account for possible changes in correlation with CGFs. The funding categories focused on development and procurement, while the MS category primarily focused on MS B and MS C. The major findings from them include significant cost growth at MS B and MS C, and that completed

390 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

programs reported 46 percent and 16 percent growth, respectively. The two reported CGFs illustrate cost growth bias decreases as a program moves toward completion.

Methodology Our research uses SAR data to analyze cost growth at five program stages: CDR, FF, DTE, IOC, and FOC. Specifically, we focus on DoD aircraft programs, which is defined as a fixed-wing, manned aircraft developed for one or more of the U.S. DoD Service branches. Furthermore, our analysis includes only Acquisition Category I (ACAT I) aircraft programs. All ACAT I programs are MDAPs. An MDAP is a program that is not a highly sensitive classified program and that is designated by the Under Secretary of Defense for Acquisition, Technology, and Logistics (USD AT&L) as an MDAP; or that is estimated to require eventual expenditure for research, development, test and evaluation, including all planned increments, of more than $480 million (Fiscal Year 2014 constant dollars) or procurement, including all planned increments, of more than $2.79 billion (Fiscal Year 2014 constant dollars) (Acquisition Category, 2015).

Our research utilizes a database originally built by the RAND Corporation for the Air Force Cost Analysis Agency (AFCAA). This database is populated with SAR data on approximately 330 defense acquisition programs dating back to the 1950s and provides annual funding reports by appropriation as well as calculated cost growth measures (Arena et al., 2006). In addition to the SARs, the AFCAA database also includes significant program dates for DoD aircraft programs: MS B, CDR, FF, DTE, IOC, and FOC. In conjunc- tion with the SAR and AFCAA information, we also use Deagel. Deagel is a nongovernment database that tracks civilian and military aircraft data. If the SAR and AFCAA database lacked a particular stage date for a pro- gram, we reference Deagel. For the B-1B, F-15E, and T-45, we use Deagel’s listed dates for IOC, while for the C-17 we use the listed CDR date. Our final dataset includes 30 DoD aircraft platforms.

We normalize all cost data in order to ensure valid comparisons. The two variables with the biggest effect on cost growth are inflation and order quan- tity (Drezner et al., 1993). The standard approach to account for inflation is to convert all dollars to a single base year value. For each aircraft program, we use the MS B year as the base year for the inflation rates to standard- ize CGFs. We utilize the Office of the Secretary of Defense Comptroller Appropriation (APN) inflation rates to perform these conversions.

391 Analyzing Cost Growth at Program Stages http://www.dau.mil

Next we normalize for order quantity. SARs list the quantities estimated and produced for each aircraft program. The quantities each aircraft pro- gram produces typically shift throughout a program’s life cycle. In order to standardize the units produced for each aircraft program, the units are standardized to the final production amount. The method used in this article is the same method RAND adopted (Arena et al., 2006). The standardiza- tion process uses learning curves and first unit cost, which are derived from annual funding data provided in each program SAR. If the quantity reported in the baseline estimate is less than the final quantity, we calculate the cost of units not produced and add that value to the baseline estimate. Likewise, if the final quantity produced is less than the baseline estimate, we calculated the estimated cost of additional baseline units and subtract that value from the baseline estimate (Arena et al., 2006)

Lastly, we compute percentage completion of each program. In order to calculate the percentage completion of a program, it is necessary to first identify program completion dates. We use the final reported SAR to sig- nal program completion. The final SAR identifies when all production is complete. However, the Office of the USD (AT&L) can consider terminating the requirement to report SARs when 90 percent of production units are complete or when a program is no longer considered an ACAT I program (SAR, 2015). Because it is uncertain if termination of SAR reports occurs at 90 percent completion or at final production completion, we use the anticipated date of the last production unit completion (in the last SAR report submitted for the program) as the FOC and calculate the percent of completion based off that date.

Analysis Table 1 displays the complete CGFs for the 30 aircraft weapon systems in the analysis. The table outlines the acquisition phase: development (DEV), procurement (PROC), or total (TOT), and the program stages: CDR, FF, DTE, IOC, and FOC for each CGF. The blank fields in Table 1 are attributable to either a program not completing a specific stage at the time of this analysis, the program fell below a SAR reporting threshold and no longer required annual reports, or we were unable to find a recorded date for that stage. For example, the F-35 has yet to complete Development Test & Evaluation and the B-1A fell below a reporting threshold in 1978 and was no longer required to make annual SAR reports; therefore, these fields are blank in Table 1.

392 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 Analyzing Cost Growth at Program Stages http://www.dau.mil July 2017

In addition to the CGFs, we calculate the percent of program completion FIGURE 3. TOTAL COST GROWTH FACTORS BY PERCENT COMPLETE at each stage for all 30 aircraft programs in our database. We plot these FOR 30 AIRCRAFT PROGRAMS CGFs and associated percent program completion in Figures 1–3. Despite 2.4 the large variability, the points are generally clustered in order of CDR, FF, and finally FOC. 2.2 2

FIGURE 1. DEVELOPMENT COST GROWTH FACTORS BY PERCENT 1.8

COMPLETE FOR 30 AIRCRAFT PROGRAMS 1.6 2.6 1.4 Cost Growth Factor

2.4 1.2

2.2 1 0 20 40 60 80 100

2 Percent Complete — Category = CDR — Category = FF — Category = DTE — Category = IOC — Category = FOC

1.8

1.6 To draw more macro trends, we derive summary descriptive statistics using Cost Growth Factor a statistical discovery software called JMP® (Version 11.2). We examine 1.4 how many programs and the percentage of these that sustain cost growth; 1.2 the mean, median, standard deviation, and interquartile range (the differ-

1 ence between the 75 percentile and the 25 percentile); and minimum and 0 20 40 60 80 100 maximum CGFs at each stage. Table 2 displays these statistics. Percent Complete — Category = CDR — Category = FF — Category = DTE — Category = IOC — Category = FOC As shown in Table 2, nearly 50 per- cent of the aircraft programs in our database indicate cost growth at FIGURE 2. PROCUREMENT COST GROWTH FACTORS BY PERCENT CDR. This percent increases to COMPLETE FOR 30 AIRCRAFT PROGRAMS approximately 75 percent or 2.2 more experiencing cost growth

2 at FF, DTE, IOC, or FOC. These trends are similar regardless 1.8 of acquisition phase. Given this preponderance of cost 1.6 growth, we focus the remain- 1.4 ing analysis on those CGFs Cost Growth Factor greater than 1.0, where cost 1.2 growth does occur, to further

1 identify macro trends. 0 20 40 60 80 100 Percent Complete — Category = CDR — Category = FF — Category = DTE — Category = IOC — Category = FOC

393 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 393 Analyzing Cost Growth at Program Stages http://www.dau.mil July 2017

TABLE 1. COST GROWTH FACTORS (CGF) FOR 30 AIRCRAFT WEAPON SYSTEMS IN THE ANALYSIS

DEV DEV DEV DEV DEV PROC PROC PROC PROC PROC TOT TOT TOT TOT TOT Aircraft Program CDR FF DTE IOC FOC CDR FF DTE IOC FOC CDR FF DTE IOC FOC A-10 1.09 1.19 1.18 1.19 1.27 1.03 1.22 1.28 1.22 1.34 1.03 1.22 1.28 1.22 1.33 AV-8B 1.00 1.01 1.21 1.20 1.30 1.00 1.03 0.98 0.86 0.92 1.00 1.03 1.02 0.91 0.98 B-1A 0.96 1.15 1.10 1.00 1.11 1.21 0.99 1.12 1.20 B-1B 1.05 1.05 1.17 1.16 1.31 0.99 0.98 0.96 0.96 0.98 1.00 0.99 0.99 0.98 1.02 B-1B CMUP Comp 0.98 0.97 1.02 1.00 0.95 1.00 0.84 1.16 1.02 0.95 0.99 1.07 1.16 1.08 1.07 B-1B CMUP JDAM 0.85 0.80 0.77 0.77 0.77 1.05 1.09 1.05 1.05 1.02 0.94 0.90 0.88 0.88 0.87 B-2 RMP 0.88 0.81 1.02 0.93 0.93 1.00 0.99 1.14 1.06 1.04 0.93 0.89 1.07 0.99 0.98 C-5 RERP 0.87 0.97 1.00 1.02 1.04 1.03 1.00 1.22 1.01 0.99 1.00 1.21 C-17 1.22 1.36 1.41 1.54 1.81 1.08 1.31 1.29 1.45 1.72 1.05 1.33 1.47 1.47 1.75 E-2D 1.00 1.06 1.26 1.50 1.00 1.09 1.31 1.27 1.00 1.08 1.30 1.33 E-3A AWACS 1.52 1.55 1.49 1.71 1.31 1.33 1.32 1.28 1.38 1.41 1.28 1.43 E-3 AWACS RSIP 1.02 1.07 1.07 1.07 1.07 1.00 1.41 1.57 2.05 2.06 1.01 1.21 1.27 1.45 1.46 E-6A 1.11 1.12 1.11 1.12 1.12 0.97 0.78 0.81 0.82 0.90 1.02 0.87 0.90 0.91 0.98 E-8 JSTARS 0.98 1.22 2.13 2.12 2.41 1.92 1.92 1.87 1.90 1.86 1.06 1.55 2.01 2.02 2.15 EA-18G 1.05 1.08 1.04 1.04 1.05 1.04 1.04 1.02 1.05 1.05 1.05 1.05 EF-111A 0.97 1.38 1.48 2.10 2.10 0.92 1.53 1.62 1.62 1.62 0.93 1.48 1.60 1.79 1.79 F-14A 1.32 1.32 1.47 1.48 1.83 1.03 1.03 1.19 0.92 1.18 1.08 1.08 1.24 1.02 1.29 F-15 0.98 0.98 1.09 1.09 1.37 1.05 1.04 1.32 1.23 1.28 1.03 1.03 1.19 1.19 1.30 F-15E 1.07 1.07 1.09 1.09 1.48 1.00 1.00 1.01 1.01 1.01 1.02 1.03 1.04 1.04 1.34 F-16A/B 1.00 1.25 1.28 1.31 2.51 1.00 1.12 1.10 1.13 1.08 1.00 1.13 1.12 1.15 1.27 F-18A/B 1.08 1.11 1.15 1.15 1.36 1.02 1.11 1.33 1.35 1.45 1.03 1.11 1.29 1.31 1.43 F-18E/F 0.99 0.95 0.98 0.98 0.98 1.00 1.02 0.96 0.95 1.01 1.00 1.01 0.96 0.96 1.01 F-22 1.12 1.19 1.50 1.47 1.64 1.03 1.10 1.61 1.46 1.62 1.05 1.13 1.58 1.47 1.63 F-22 Inc 3.2B 0.99 0.99 0.99 F-35 (CTOL) 1.25 1.24 1.53 1.36 1.36 1.82 1.26 1.30 1.69 F-35 (CV) 1.24 1.50 1.53 1.36 1.66 1.70 1.34 1.63 1.62 P-8A 0.96 0.99 1.11 1.12 1.00 1.01 0.95 0.95 0.99 1.01 1.00 1.00 S-3A 1.08 1.10 1.08 1.09 1.00 1.02 1.00 1.06 1.02 1.04 1.02 1.07 T-6 1.02 1.02 0.84 0.86 0.90 1.00 1.00 1.42 1.44 1.47 0.99 0.99 1.13 1.36 1.41 T-45 1.07 1.09 1.31 1.31 1.53 1.10 1.21 1.50 1.48 1.70 1.10 1.20 1.48 1.48 1.68 Note. Table 1 outlines the Acquisition Phase: Development (DEV), Procurement (PROC), or Total (TOT), and the Program Stages: Critical Design Review (CDR), First Flight (FF), Development Test and Evaluation End (DTE), Initial Operational Capability (IOC), and Full Operational Capability (FOC) for each CGF. AWACS = Air Warning and Control System; CMUP = Conventional Mission Upgrade Program; Comp = Computer Upgrade; CTOL = Conventional Take-off and Landing; CV = Carrier Variant; Inc = Increment; JDAM = Joint Direct Attack Munition; JSTARS = Joint Surveillance Target Attack Radar System; RERP = Reliability Enhancement and Re-Engining Program; RMP = Radar Modernization Program; RSIP = Radar System Improvement Program.

394 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 394 Analyzing Cost Growth at Program Stages http://www.dau.milJuly 2017 1.32 1.52 2.13 2.12 2.51 1.92 1.92 1.87 2.05 2.06 1.34 1.63 2.01 2.02 2.15 Max 0.85 0.80 0.77 0.77 0.77 0.92 0.78 0.81 0.82 0.90 0.93 0.87 0.88 0.88 0.87 Min 0.11 0.23 0.30 0.44 0.64 0.05 0.26 0.35 0.46 0.61 0.06 0.21 0.31 0.46 0.44 IQR 0.11 0.18 0.27 0.32 0.47 0.19 0.24 0.26 0.33 0.33 0.09 0.19 0.26 0.29 0.32 Standard Standard Deviation 1.01 1.08 1.13 1.16 1.31 1.00 1.09 1.18 1.22 1.21 1.01 1.08 1.15 1.20 1.30 Median 1.04 1.12 1.21 1.26 1.41 1.07 1.15 1.22 1.26 1.29 1.03 1.13 1.21 1.25 1.32 Mean 50% 76% 85% 82% 78% 46% 76% 77% 75% 83% 54% 79% 77% 75% 83% % Programs w/ Cost Growth 14 22 22 23 18 13 22 20 21 19 15 23 20 21 19 Programs w/ Cost Growth TABLE 2.TABLE DESCRIPTIVE ACQUISITION BY STATISTICS PHASE 28 29 26 28 23 28 29 26 28 23 28 29 26 28 23 Sample Size CDR FF DTE IOC FOC CDR FF DTE IOC FOC CDR FF DTE IOC FOC Category The Interquartile Range (IQR) represents the difference between the 75 percentile and the 25 percentile. 25 the and percentile 75 the between difference the represents (IQR) Range Interquartile The DEV DEV DEV DEV DEV PROC PROC PROC PROC PROC TOTAL TOTAL TOTAL TOTAL TOTAL Note. Phase

395 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 395 Analyzing Cost Growth at Program Stages http://www.dau.mil

Table 3 shows the mean and median CGFs for the five program stages (CDR through FOC) and their associated acquisition phases. Table 4 lists the mean and median percent complete and associated percent of total cost growth for each stage. (Note: 100 percent at FOC does not mean that the program has realized 100 percent cost growth; instead, it indicates that whatever cost growth the program will achieve, 100 percent of that cost growth is realized by FOC.) Equation (3) displays the formula necessary to calculate percent of total cost growth, where Stage serves as either the CDR, FF, DTE, IOC, or FOC stage. Table 4 also lists the average months from MS B and associated percent of total cost growth at each stage, again for both the means and medians.

(Stage—1) (3) (FOC—1)

TABLE 3. MEAN AND MEDIAN COST GROWTH FACTOR (CGF) AT TOTAL ACQUISITION PROGRAM PHASES

Program Stage Mean CGF Median CGF

DEV CDR 1.12 1.09

DEV FF 1.19 1.14

DEV DTE 1.26 1.18

DEV IOC 1.34 1.20

DEV FOC 1.56 1.43

PROC CDR 1.16 1.05

PROC FF 1.22 1.11

PROC DTE 1.31 1.30

PROC IOC 1.37 1.32

PROC FOC 1.37 1.28

TOTAL CDR 1.08 1.05

TOTAL FF 1.18 1.12

TOTAL DTE 1.29 1.26

TOTAL IOC 1.35 1.31

TOTAL FOC 1.40 1.34

Note. CDR = Critical Design Review; DEV = Development; FF = First Flight; DTE = Development Test and Evaluation End; FOC = Full Operational Capability; IOC = Initial Operational Capability; PROC = Procurement.

396 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

TABLE 4. AVERAGE TIME COMPLETE (PERCENT AND MONTHS) AND AVERAGE PERCENT COST GROWTH AT TOTAL ACQUISITION PROGRAM PHASES

Mean Median Mean Median Program Mean % Median % Months Months % Cost % Cost Stage Complete Complete Complete Complete Growth Growth

DEV CDR 13 24.1 12 17.2 22 20

DEV FF 26 43.6 25 34.5 33 32

DEV DTE 49 81.3 44 74.1 47 41

DEV IOC 51 88.9 48 78.1 60 47

DEV FOC 100 185.8 100 176.0 100 100

PROC CDR 13 24.1 12 17.2 44 18

PROC FF 26 43.6 25 34.5 59 39

PROC DTE 49 81.3 44 74.1 83 107

PROC IOC 51 88.9 48 78.1 101 114

PROC FOC 100 185.8 100 176.0 100 100

TOTAL CDR 13 24.1 12 17.2 19 15

TOTAL FF 26 43.6 25 34.5 45 35

TOTAL DTE 49 81.3 44 74.1 72 75

TOTAL IOC 51 88.9 48 78.1 86 91

TOTAL FOC 100 185.8 100 176.0 100 100

Note. CDR = Critical Design Review; DEV = Development; FF = First Flight; DTE = Development Test and Evaluation End; FOC = Full Operational Capability; IOC = Initial Operational Capability; PROC = Procurement.

397 Analyzing Cost Growth at Program Stages http://www.dau.mil

FIGURE 4. MEDIAN PERCENT TOTAL COST GROWTH VERSUS MEDIAN PERCENT PROGRAM COMPLETION

1.2 1 0.8 0.6 0.4 0.2 0 Percent Total Cost Growth 0 20 40 60 80 100 Percent Program Completion Procurement Total Development

Note. Applies to development, procurement, and total acquisition phases for 30 aircraft programs included in the database.

FIGURE 5. MEDIAN PERCENT TOTAL COST GROWTH VERSUS MEDIAN MONTHS FROM MS B

1.2 1 0.8 0.6 0.4 0.2 0

Median Percent Cost Growth Cost Median Percent 0 50 100 150 200 Median Months Procurement Total Development

Note. Applies to the development, procurement, and total acquisition phases for the 30 aircraft programs in the database.

398 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

Due to some large CGFs affecting the means as seen in Table 2, we primar- ily address the median values from Tables 3–4 and macro trends of those medians illustrated in Figures 4–5 in the following analysis. Investigating just the FOC stage in Table 3, the median CGFs for the development, pro- curement, and total acquisition phases are 1.43, 1.28, and 1.34, respectively. Therefore, we find that the median CGF for the development phase is sig- nificantly greater than the CGF for either the procurement phase or total acquisition phase. With respect to program cost growth, these CGFs cor- respond to 43 percent, 28 percent, and 34 percent total cost growth for the development, procurement, and total acquisition phases.

According to Table 4, the four program stages of CDR, FF, DTE, and IOC all occur before 50 percent schedule completion. Since IOC is typically the last stage (sometimes DTE has a later date) with a median percent comple- tion of 48 percent, we further analyze these results. For the procurement phase, we find the median CGF represents 114 percent of total realized cost growth at IOC. Thus, the procurement phase realizes all of its cost growth by IOC, despite IOC representing only the 48 percent program completion point. This is not all that surprising given development is mainly complete, and most actual production costs are collected and understood by the time full rate production begins. Additionally, we find that the median CGF at IOC (114 percent) is greater than the median CGF at FOC (100 percent). Visually, this peak is seen in Figure 4. A similar trend holds for the total acquisition phase, whereby 91 percent of total realized cost growth occurs at IOC. This trend for total is not as pronounced as procurement since it also accounts for development costs, which has a much smaller percent of total realized cost growth of 47 percent at IOC. With respect to actual time at IOC, Table 4 shows the median time from MS B to IOC is 78.1 months or 6.5 years. Overall, the descriptive analysis suggests that by IOC, which typically occurs 6.5 years after MS B, a program realizes 91 percent of its total cost growth.

Besides investigating trends seen at IOC, we also assess descriptive trends at CDR, FF, and DTE. For both CDR and FF, the median percent of total realized cost growth for development, procurement, and total acquisition are relatively similar, unlike those at IOC. The percentages for CDR are 20 percent, 18 percent, and 15 percent, while for FF the percentages are 32 percent, 39 percent, and 35 percent, respectively. For DTE, these per- centages start to diverge with median development, procurement, and total acquisition program cost growth at 41 percent, 107 percent, and 75 percent, respectively. Thus, DTE is found to be a pivot point for cost growth.

399 Analyzing Cost Growth at Program Stages http://www.dau.mil

Investigating the association of median cost growth percentage to median schedule completion percentage at CDR, we find they are relatively com- parable (Table 4). For CDR, the median schedule completion percentage is 12 percent, while the median percentages of total realized cost growth for development, procurement, and total acquisition are 20 percent, 18 percent, and 15 percent, respectively. For FF, this association is again somewhat comparable, but starts to weaken. At FF, the median schedule completion percentage is 25 percent, while the median percentages of total realized cost growth for development, procurement, and total acquisition are 32 percent, 39 percent, and 35 percent, respectively. At DTE, this association further weakens. The median percent of total cost growth is 75 percent, while the total percent of program completion is 44 percent. This is primarily because of the rapid increase of procurement cost. Why this occurs, we do not know; indeed, such rapid increase of procurement cost might suggest an area for future studies and investigation. As discussed previously regarding IOC, 91 percent of total cost growth occurs at 48 percent schedule completion. Overall, we see a steep rise in percent of total cost growth between FF and IOC, primarily attributable to procurement cost.

Investigating procurement costs further, we find the following trends. Median percent of total cost growth at CDR is 18 percent, while median percent of program com- pletion is 12 percent. At FF, the percent of total cost growth is 39 percent and median percent of program completion is 25 percent. It is here at FF that the percent of total cost growth begins to increase more rap- idly than percent of program completion. At DTE, percent of total cost growth is 107 percent and percent of program completion is 44 per- cent. At IOC, percent of total cost growth is 114 percent at 48 percent program com- pletion. As seen in either Figure 4 or 5, procurement experiences a large increase in percent of total cost growth around DTE and IOC.

Development percent of total cost growth does not behave the same way as procurement cost growth (Table 4). At CDR, median percent of total cost

400 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

growth is 20 percent at 12 percent program completion. FF percent total cost growth is 32 percent at a program completion percentage of 25 percent. For both CDR and FF, the percent of total cost growth compared to percent of program completion is not too different, ~7–8 percent. However, at DTE the percent of total cost growth is 41 percent and at IOC the percent of total cost growth is 47 percent. Both of these percentages of total cost growth are less than the percent of program completion and far less than the percent of total cost growth experienced with procurement at the same reviews.

Overall, the descriptive analysis highlights some macro trends. Procurement and total program cost growth display similar trends. Both experience the majority of their cost growth prior to the program being 50 percent com- plete. At IOC, median percent of total realized cost growth is 91 percent at 48 percent program completion with respect to overall total acquisition cost. For procurement acquisition only, 114 percent of total realized cost growth occurs at 48 percent program completion. The development phase is different. For development acquisition, only 47 percent of total program cost growth occurs at 48 percent program completion. Additionally, for procurement and total acquisition costs, a large spike in median percent of total program cost growth occurs around FF whereas development cost growth follows a steadier, more linear path.

Conclusions Building a database of 30 aircraft programs comprised of informa- tion gathered from the SARs, we investigated how CGFs change from CDR to FOC for development, procurement, and total acquisition phases. Despite there being much variability from program to program, as evi- dent in Figures 1–3, noticeable trends soon emerge as we aggregate the data to means and medians. As seen in Tables 2 and 3, over half to three- fourths of the aircraft programs experience median cost growth ranging from 28 percent for procurement to 43 percent for development. Thus, we find that development CGFs are significantly larger than procurement CGFs. These results are comparable to Drezner et al. (1993), who discov- ered development CGFs were 7 percent greater than procurement CGFs. For our database, the average difference was approximately 15 percent.

As previously discussed, the median percent of program completion at IOC is 48 percent and the median percent of total cost growth for total acquisition is 91 percent. Therefore, we identify the CGF of an aircraft program at IOC to be very close to the CGF at program completion.

401 Analyzing Cost Growth at Program Stages http://www.dau.mil

When comparing development to procurement, procurement is the primary contributor to overall cost growth. With respect to when this cost growth occurs, the major spike in cost growth occurs between FF and DTE. At FF, the median percent of total cost growth is 35 percent at 25 percent program completion. When looking at DTE, median total cost growth is 75 percent at 44 percent program completion. Thus, at DTE, there is a major spike in percent of total cost growth, which could be attributed to a program actu- ally needing to display some capability for the aircraft. From Figure 3, we see DTE does not necessarily occur before IOC for every program. Because of this, DTE can occur after IOC depending on where IOC is identified in a program’s Capability Development Document (CDD). Due to shifts in IOC, the point of greatest CGF could occur at DTE or IOC.

Investigating further into development and procurement, we find signifi- cantly different results for percent of total cost growth versus percent of program completion. For development, median percent of total cost growth at IOC is 47 percent whereas median percent total cost growth for procure- ment is 114 percent. With this information, we are likely to see development cost growth after IOC, but do not expect to see any procurement cost growth after IOC. Although not always, as first noted in Figure 1, IOC typically occurs after DTE. Therefore, this would indicate minimal development costs accrued after IOC. That is not what the data indicated. This is there- fore another area for future research to delve into this counterintuitive finding. The median CGFs for procurement costs spike at DTE and IOC, then slowly decrease at FOC. A similar trend holds for total acquisition costs, but the FOC CGFs are slightly higher due to development costs being more linear during a program’s schedule.

402 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

With respect to possible limitations, we noticed some discrepancies when identifying IOC dates. This is because programs are not required to report IOC at a certain point in the program’s schedule. The Defense Acquisition University’s ACQuipedia (Initial Operational Capability, 2015) defines IOC:

In general, attained when some units and/or organizations in the force structure scheduled to receive a system 1) have received it, and 2) have the ability to employ and maintain it. The specifics for any particular system IOC are defined in that system’s CDD and Capability Production Document.

IOC dates reported by aircraft programs in this research are consistent with the DAU definition, where some aircraft programs report IOC earlier in the schedule than other aircraft programs. This inconsistency of report- ing could affect our findings. However, given the magnitude of median cost growth at IOC, we doubt our findings would change drastically.

Overall, our research quantified the CGFs for 30 aircraft programs at CDR, FF, DTE, IOC, and FOC. We determined the median CGFs at FOC for development, procurement, and total acquisition to be 1.43, 1.28, and 1.34, respectively. These results are comparable to previous findings. Arena et al. (2006) found total CGFs for development, procurement, and total to be 1.58, 1.44, and 1.46 respectively, while Drezner et al. (1993) found total CGFs for development, procurement, and total to be 1.25, 1.18, and 1.20, respectively. Additionally, Christensen (1994) uses EVM data to identify cost overrun beginning as early as 10 percent of program completion. Consistent with Christensen’s findings, we identify a median cost growth of 15 percent for total acquisition costs at CDR, which occurs at a median program completion of 12 percent. Additionally, the first spike in percent of total cost growth occurs at FF—35 months or approximately 3 years after MS B. Lastly, our analysis identifies the amount of time from MS B to each program stage. The median time from MS B to IOC is 78 months or 6.5 years. Therefore, approximately 6.5 years after MS B, a program sustains about 91 percent of its total program cost growth. We submit that understanding a typical cost growth pattern of an aircraft program allows for better timing of program initiation to mitigate funding poaching from other systems when or if cost growth spikes.

403 Analyzing Cost Growth at Program Stages http://www.dau.mil

References Acquisition Category. (2015). In Defense Acquisition University ACQuipedia. Retrieved from https://dap.dau.mil/acquipedia/Pages/ArticleDetails.aspx?aid=a896cb8a- 92ad-41f1-b85a-dd1cb4abdc82 Arena, M. V., Leonard, R. S., Murray, S. E., & Younossi, O. (2006). Historical cost growth of completed weapon system programs (Report No. TR-343-AF). Santa Monica, CA: RAND. Bolten, J. G., Leonard, R. S., Arena, M. V., Younossi, O., & Sollinger, J. M. (2008). Sources of weapon system cost growth (Report No. MG-670-AF). Santa Monica, CA: RAND. Calcutt, H. M., Jr. (1993). Cost growth in DoD major programs: A historical perspective. (Executive Research Project, Report No. NDU-ICAF-93-F31). Washington, DC: Industrial College of the Armed Forces, National Defense University. Cancian, M. F. (2010). Cost growth: Perception and reality. Defense Acquisition Review Journal, 17(3), 389–404. Christensen, D. S. (1994, Winter). Cost overrun optimism: Fact or fiction? Acquisition Review Quarterly, 1, 25–38. Drezner, J. A., Jarvaise, J. M., Hess, R. W., Hough, P. G., & Norton, D. (1993). An analysis of weapon system cost growth (Report No. MR-291-AF). Santa Monica, CA: RAND. Government Accountability Office (2015). Defense acquisitions: Assessment of selected weapon programs (Report No. GAO-15-342SP). Washington, DC: U.S. Government Printing Office. Initial Operational Capability. (2015). In Defense Acquisition University ACQuipedia. Retrieved from https://dap.dau.mil/acquipedia/Pages/ArticleDetails.aspx?aid= 87a753b2-99cf-4e63-94dc-9ceab06fc96c McNichols, G. R., & McKinney, B. J. (1981, October). Analysis of DoD weapon system cost growth using selected acquisition reports. Paper presented at the Sixteenth Annual Department of Defense Cost Analysis Symposium, Arlington, VA. Porter, G., Gladstone, B., Gordon, C. V., Karvonides, N., Kneece Jr., R. R., Mandelbaum, J., & O’Neil, W. D. (2009). The major causes of cost growth in defense acquisitions (IDA Report No. P-4531). Alexandria, VA: Institute for Defense Analyses. Selected Acquisition Report. (2015). In AcqNotes [Online encyclopedia]. Retrieved from http://acqnotes.com/acqnote/acquisitions/selected-acquisition-report-sar

404 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

Author Biographies

Capt Scott J. Kozlak, USAF, is a cost analyst at the Air Force Cost Analysis Agency in the Washington, DC metropolitan area. Capt Kozlak holds a BS in Management from the United States Air Force Academy and an MS in Cost Analysis from the Air Force Institute of Technology.

(E-mail address: [email protected])

Dr. Edward D. White is a professor of statistics in the Department of Mathematics and Statistics at the Air Force Institute of Technology. He received his MAS from The Ohio State University and his PhD in Statistics from Texas A&M University. Dr. White’s primary research inter- ests include statistical modeling, simulation, and data analytics.

(E-mail address: [email protected])

405 Analyzing Cost Growth at Program Stages http://www.dau.mil

Dr. Jonathan D. Ritschel is an assistant pro- fessor of cost analysis in the Department of and Management at the Air Force Institute of Technology (AFIT). He received his BBA in Accountancy from the University of Notre Dame, his MS in Cost Analysis from AFIT, and his PhD in Economics from George Mason University. Dr. Ritschel’s research interests include public choice, the effects of acquisition reforms on cost growth in DoD weapon systems, research and development cost estimation, and economic institutional analysis.

(E-mail address: [email protected])

Lt Col Brandon Lucas, USAF, currently serves as an assistant professor and director of the Graduate Cost Analysis Program at the Air Force Institute of Technology, Wright-Patterson AFB, Ohio. He holds an MS in Cost Analysis and a PhD in Economics. Lt Col Lucas has served in the budget, cost, and finance communities at base, center, and Air Staff levels.

(E-mail address: [email protected] )

406 Defense ARJ, July 2017, Vol. 24 No. 3 : 386-407 July 2017

Mr. Michael J. Seibel retired as a senior cost analyst in the Cost Division of the Air Force Life Cycle Management Center at Wright-Patterson AFB, Ohio. During his 42-year career, he led or participated in numerous independent cost esti- mates, source selection cost panels, special studies, and program office support exercises. He was also very active in the cost research area. He holds an MS in Social and Applied Economics from Wright State University and is an International Cost Estimating and Analysis Association member.

(Can be reached at: [email protected])

407 Estimating Firm-Anticipated DEFENSE ACQUISITION COSTS WITH A VALUE-MAXIMIZING FRAMEWORK

LTJG Sean Lavelle, USN

The author develops a value-maximizing framework to analyze defense contractor behavior within both risk-sharing and cost-plus contracts. He empirically observes that as development costs rise above estimates, the government tends to cut its order of the final product. Because the firm loses money from order cuts, irrespective of contract type, the author theorizes that cost-plus contracts will behave similarly to risk-sharing contracts if the government reliably cuts orders as development costs rise. For purposes of this article, the author uses a larger than typical dataset to empirically estimate the effects of contract incentives (final order size cuts in the case of this dataset) on firm value. With these effects estimated, he establishes a methodology for determining firm expectations of research and develop- ment costs based on the anticipated impact of cost overruns on firm value.

DOI: https://doi.org/10.22594/dau.16-762.24.03 Keywords: Cost Estimating, Defense Contract, Information Asymmetry, Value-Maximizing, Bidding  Image designed by Michael Krukowski Estimating Firm-Anticipated Defense Acquisition Costs http://www.dau.mil

Publicly traded defense contractors bid and work on projects for one overriding reason: to increase their value to shareholders. When a new contract is awarded, investors look at the estimated value of the contract and decide whether to buy, sell, or hold their shares in the company. A higher valued contract might increase the stock price more greatly than a lower val- ued contract. During the life of the contract, as prices for research vary and quantities ordered increase or decrease, the estimated value of the contract will change as well. Rational investors should observe these fluctuations and make decisions about their holdings. Likewise, rational firms should only pursue those projects that will result in an increase in net value.

If we assume that the cost of research and development of a contract is predetermined and only known by the contractor, fluctuations during the contract are determined simply by the initial bid. If the initial bid is lower than actual cost, there will be cost over- runs. Alternatively, if we assume that contrac- tors have a great amount of control over their costs of development, they should fine-tune their cost performance throughout the project to maximize their value. As we develop our theory, only one of these assump- tions must be true.

For cost-plus contracts, the government pays all cost overruns and the contractor is virtually unaffected. The only way that a contractor might be immediately impacted is if there is specific language inserted into the contract detailing a risk-sharing scheme, or if the govern- ment decides to cut the number of end-product units ordered in response to overruns. While risk-sharing contracts are difficult to empirically eval- uate due to a dearth of available data, changes in quantity ordered are well documented for any Major Defense Acquisition Program (MDAP). This reduction in quantity ordered will hurt the firm as most contracts have a research phase, resulting in a predetermined, negotiated rate for the cost of each manufactured unit. Production revenue can often be a significant portion of a government contractor’s cash flow in an MDAP.

410 Defense ARJ, July 2017, Vol. 24 No. 3 : 408-431 July 2017

Whether we assume that the contractor can adjust its costs or that it bids with a fixed final cost in mind, the market will respond to new projects and fluctuations in value in the same manner. It should also not matter if the diminished contract value is due to an incentive scheme or changes in quantity ordered. This article will estimate the effect that new contracts and later amendments have on a company’s value. Once the effects are estimated, government contractors can then use them to determine optimal incentives for cost control and cost predictability.

We developed the model to take into account the market’s reaction to the ini- tial award amount, changes in development costs, and changes in the value of quantities ordered after development. Intuition tells us that the stock price will go up with a new contract award and will go down as less quantity is ordered. Our theory states that as development costs go up, the govern- ment will cut quantity ordered. Given this relationship, we expect that as development costs increase, firm value will decrease. We will compile evidence that development cost increases in a predominantly “cost-plus” environment will only affect firm value if the government changes quantity ordered in response.

After we estimate our model’s coefficients, we will develop a framework for designing optimal contract incentives. Since firms will not purposely lower their own value, we can find the ratio of quantity cuts to development cost overruns that forces the firm to either bid more closely to their estimations of cost or control costs below a certain desired level. We can understand this ratio, E1, as an elasticity where we measure the percentage change in quantity ordered for a percentage change in development costs.

Risk-Sharing Contracts Cost overruns in defense acquisition have been a problem for as long as the military has looked to industry to deliver innovative products. Most of the relevant literature views contracting for major, innovative defense projects through an insurance framework with moral hazard.

Cummins (1977) observes that prior to the 1960s, the government’s primary focus was on limiting contractor profits. To this end, cost-plus contracts, where the government bore all of the risk of cost overruns, were the pri- mary contract types. As awareness of the risk-sharing problem became better understood, the government shifted to greater use of contracts with incentives built in to share the risk between both parties. This shift of risk to the firm controlled cost overruns, but required a greater fixed fee to firms.

Defense ARJ, July 2017, Vol. 24 No. 3 : 408-431 411 The basic framework of the risk-sharing contract is π = α - (1-y) C, where π is firm profit,α is a fixed fee, y is the fraction of cost overruns borne by the government, and C is the cost overruns. To ascertain a pareto-optimal contract, Cummins built upon a profit maximizing framework with contin- gent contracts and positive externalities gained by the firm from contract completion. In this article, we will use a value-maximizing framework. The effective difference is that a firm will likely need some market-desig- nated positive profit to undertake a project in our framework, whereas a firm would accept a contract with zero accounting profit in Cummins’ model.

Cummins showed that if we take the government’s objective to be lowering total cost vice cost overruns and firm profit, risk-sharing contracts can be ineffective. Hiller and Tollison (1978) provide empirical evidence to support this theory. As we increase the risk sharing in a contract, the firm requires a greater fixed fee. Any savings from increased firm attention to cost may be offset by the increased fixed fee. We could see lower cost overruns, but the same—or even greater—total cost. It is therefore much more important to focus on controlling total cost vice cost overruns. Our model will allow the government to calculate expected total cost prior to the awarding of a contract.

Weitzman (1980) takes the analysis of risk-sharing contracts in another direction. His research presents the risk-sharing contract equation and a utility maximization framework to determine an efficient sharing ratio between the principal (government) and the agent (contractor). Weitzman’s chief result is simple: if the sharing rate is high, the firm will bid higher and in turn make bids a more reliable indicator of final cost. We will obtain a similar result, but our “sharing rate” will be between development costs and quantity cuts, vice a contracted agreement on risk sharing. Additionally, we will show our sharing rate to impact total cost, not just cost overruns.

Goel (1995) builds upon Weitzman by including an auction framework. His research develops a model where the principal designates a sharing rate, y, and then allows various agents to bid on the contract. Under the model, the principal is able to coax agents to bid closer to their expectations of cost by increasing the amount of risk borne by the agent. Some differences between Goel and our article are a lack of bidding framework in this article, but the addition of a firm value perspective along with empirical investigation. Because the bidding framework does not affect the underlying result shared between Goel and Weitzman—that a higher sharing rate results in a bid that is more reflective of cost estimations—this article’s omission of a bidding framework should not impact its results.

412 The Model Intuition tells us that markets will generally respond positively to any mention of additional profit flowing to a publicly traded firm. Conversely, any news that suggests the possibility of a loss of profit should result in a nega- tive reaction. Our model defines how markets reward and punish defense contractors for different kinds of news. We will then test our model to see if the market encompasses new information in the manner we predict.

Rational markets should reward firms for securing a new contract. The firm should enjoy a larger increase in value if the contract is larger. Our initial

model will look just at time period zero, T0, to determine how much of an effect the award has on stock price. Equation 1 shows this where P0 is the percentage change in stock price and IAt is the initially awarded contract amount divided by the value of the firm.

P0 = β1*IAt (1) Next, we need an equation that explains the change in price for all further time periods. Firm value fluctuation during the life of the contract should primarily be affected by changes in the value of the contract. Defense con- tracts adjust through two main avenues: changes in quantity ordered and changes in development costs. Many MDAPs are of the cost-plus variety, meaning that an overrun in cost will be borne by the government, not the firm. Assuming our model’s firm is undertaking a cost-plus contract, its value should only be affected by a change in quantity ordered. Equation 2 shows this where Qt is the percentage change in quantity ordered.

Pt>0 = β2*Qt (2)

We can account for all time periods with Equation 3. In T0 of Equation 3, IAt will be equal to the initial amount of the contract whereas Qt will be zero.

In all other time periods, IA will be zero whereas Qt will likely be nonzero numbers as long as the contract is still active.

Pt = β1*IAt + β2* Qt (3)

In our model, the government can decide Qt at any given time. If it requires less of the final product, Qt will be negative. If it requires more, Qt will be positive. We suspect that the primary reason that the government will

change Qt is a change in Dt, or the percentage change in development costs. As development costs go up, the government must save costs elsewhere and cut Qt. Equation 4 gives us the relationship between Qt and Dt, where E1 is

413 the amount of Qt that the government will change for a given amount of Dt.

(Note that E1 can be thought of as the elasticity of changes in quantity with respect to development cost fluctuations.) We will use this equation to empir-

ically estimate E1 in the Results section of this article.

Qt = E1*Dt (4) So that we can express our change in stock price equation only in terms controllable or estimable by the firm, we substitute Equation 4 into Equation 3 to get Equation 5.

Pt = β1*IAt + β2* E1*Dt (5) The point at which our firm will either not bid on a project given known costs (or will avoid reaching if costs are controllable) is the point where the net change in value of the firm across the project is zero. Condition 1 gives us this mathematically, where i is the number of time periods.

i-1 Condition 1: (1+P0)* ∏t=1 (1+P t) = 1 If we assume a constant change in development costs over the life of the project, we can sum Equation 5 across all time periods. This allows us to calculate the total value change across the entire project to determine the optimal government reaction to a development cost increase. We use nota- tion P to denote the net percentage change in firm value for the duration of the project. If we take D to be the total development cost growth for the duration of the project and assume constant cost growth, we get each annual

cost growth to be D1/i. The variable “i” is the number of years required for project completion.

1/i i P = [1+β1*IA] * [1+β2* E1* D ] – 1 (6)

By solving Equation 6 for E1 where P = 0 (from condition 1), we can estimate the appropriate amount that quantity should be reduced if we know the gov- ernment’s opinion of the acceptable development cost growth of the program.

1/i 1/i E1 = [D(- )]*[ -1+(1/(β1*IA+1)) ]/β2 (7)

If the government sets E1 equal to its result from Equation 7, the firm would lose net value from the project if it has greater cost growth than D. Because the firm will avoid losing net value, it will avoid having development cost growth greater than D. It can avoid such cost growth either by imple- menting cost controls where possible, or by bidding closer to the actual cost if it has a reasonable initial estimate.

414 Our predicted coefficient for all variables in all equations is displayed in Table 1. We will test Equations 3, 4, 5, and 6 empirically to estimate all relevant coefficients. We will then use those coefficients to develop our methodology for incentive creation and cost estimation from Equation 7.

Table 1 displays all of our empirically evaluated variables and our expected coefficients for each variable in each equation. The only vari- able that we have not previously discussed is SP. It represents the change in the Standard & Poor’s 500 Index (S&P 500) for the given time period. This variable will control for exogenous market variation.

TABLE 1. VARIABLE DESCRIPTION TABLE 1: VARIABLE DESCRIPTIONS

Variable Symbol Description Expected Coefficient

The Initially Awarded Contract Amount Divided by Equation 3: Positive the Market Cap of the Firm. Equation 5: Positive IA β1 Only nonzero during time Equation 6: Positive period of initial award.

Change in Cost of Quantity Divided by Initial Contract Equation 3: Positive Q β2 Amount

Change in Cost of Eq 4: β Equation 4: Negative D 3 Development Divided by Equation 5: Negative Eq 5: E1* β2 Initial Contract Amount

Change in Cost of Development Divided by Initial Contract Amount Equation 6: Negative Dq β3 if Q Changed in Same Time Period

Change in Cost of Development Divided by Initial Contract Amount if Q Equation 6: Insignificant Dn β4 did not Change in Same Time Period

Percentage Change in Positive SP β5 S&P 500

415 Data We collected data from Selected Acquisition Reports (SARs) on 20 MDAPs that are published at least annually, sometimes quarterly. The 20 MDAPs were awarded to seven separate contractors between 2000 and 2014. The SARs give top-level summaries of changes in costs categorized by Support, Quantity, Engineering, Estimation, Economic, and Schedule. We categorize Engineering, Estimation, and Schedule cost changes as developmental costs. Quantity changes will help us deter- mine our Q variable, whereas Support and Economic changes will be their own category that we suspect will have little impact on the model.

Hough (1992) discusses the difficulties associ- ated with using SARs to study cost overruns. The most significant problem is that when incentive contracts are used, the costs borne by the contractor are not identified in the SAR. The SAR only identifies costs to the government, though it is intuitive that as the development cost grows in the SAR, a firm engaged in a risk-sharing contract will bear a corresponding cost. It is therefore difficult to dissect the effect that the SAR-measured devel- opment cost overruns have on firm value and the effects of the presumed collinearity that the SAR measurement has with firm-borne costs. Since our data do not consist of completely cost-plus contracts, we sanitize our results from this effect by testing for this collinearity in three ways.

Because our data will not exclusively consist of cost-plus contracts,

we need to test whether or not Dt is impacting the stock price directly or

indirectly, as hypothesized. To do this, we can compare the value of β2* E1, as estimated from Equations 3 and 4, and compare it to our empirical esti- mation from Equation 5. If the values are close to each other, then our data conform to our theoretical model. If there is a significant difference, then a

change in Dt is having an effect on firm value independent of its impact on Qt.

The second test will be to simply regress both Qt and Dt against Pt. This will allow us to control for either variable and determine which one

is primarily driving Pt.

416 For a third way to test whether Dt is directly or indirectly changing firm value, we will look to Equation 6. We break Dt into Dqt and Dnt, where Dqt

is Dt for all cases where Qt is nonzero and Dnt is Dt for all cases where Qt is

zero. This way we break up the variable into two categories: one where Dt

plausibly could have impacted Qt and one where it could have no impact.

Pt = β1*IAt + β2*Dqt + β3*Dnt (8) We combined data from the last SAR for each year (usually from December) with the last stock quote of the corresponding company for the same year as reported by Google Finance. While it is unusual to rely on annual stock data for an event study, it is not possible in this instance to use daily, weekly, or monthly data. This is because the information included in a SAR flows to the market at varying times, while the SAR simply summa- rizes it. It is therefore problematic to decide when to designate the event when using more frequent market return data. This hurts the accuracy of this study, but does not negate its findings. For instance, Holthausen and Leftwich (1986) successfully used 300 days to define their event horizon in another event study. We additionally seek to mitigate this problem by including a larger dataset than is typical for this topic.

The predominant firm awarded each MDAP contract was used as the cor- responding company. All cost growths are noted in real dollars with the specified program’s start year as the baseline. Hough notes that simply using the real dollar values from the SAR is inadequate since initial cost estimates are based on certain inflation assumptions. To account for this, we discount the cost growth associated with economic factors.

To calculate production cost growth, we normalize to the contemporarily approved quantity, vice normalizing to the baseline-approved quantity. The selected calculation method is as reported in SARs.

In several instances, a contract was ongoing for several years and then was transformed into a new contract. Because the SAR data are very high-level and provide no amplifying information to ascertain estimated changes in quantity and development costs, we treated the change as the final year of the project. This slightly reduces the number of observations, but should not impair the validity of our findings.

Our data include 321 data points at the project level and, when aggre- gated into firm-level data, 113 data points. We perform our analysis with project-level data.

417 The following MDAPs appeared in our data:

• AESA – Active Electronically Scanned Array (Radar)

• SM-6 – Standard Missile-6 (Rocket Intercept Missile [RIM]-174 Standard Extended Range Active Missile)

• SDB II – Small Diameter Bomb II

• MH-60 – Military Helicopter-60 (Seahawk)

• Patriot – Missile

• AEHF – Advanced Extremely High Frequency (Satellite)

• ACS – Aegis Combat System (Integrated Command and Control/ Weapons Control System)

• JSF – Joint Strike Fighter (Fighter, Strike, and Ground Attack Aircraft)

• Land Warrior – Integrated Soldier System (Weapon, Helmet, Computer, Digital and Voice Communications, Positional and Navigation System, Protective Clothing, Individual Equipment)

• EFV – Expeditionary Fighting Vehicle

• Stryker – Interim Armored Vehicle

• T-AKE – Auxiliary Cargo (K) and Ammunition (E) Ship, Military Ship Classification (MSC) Manned

• Bradley Upgrade – Infantry Fighting Vehicle

• Comanche – RAH-66 Helicopter

• F/A-18E/F – Aircraft Variants (Based on McDonnell Douglas F/A- 18 Hornet)

• EA-18G – Boeing Growler (Electronic Attack Aircraft)

• CH-47F – Boeing Chinook (Twin-Engine, Tandem Rotor Heavy- Lift Helicopter)

• P-8A – Boeing Poseidon (Navy Maritime, Patrol, Reconnaissance Aircraft)

418 • FBCB2 – Force XXI Battle Command Brigade and Below (Communications Platform to Track Friendly/Hostile Forces on Battlefield)

• Global Hawk – Unmanned Aircraft System Summary statistics for our project-level data are summarized in Table 2.

TABLE 2. SUMMARY STATISTICS Project-Level P SP IA D Q Data Mean 0.144073 0.043694 0.013118 0.017552 -0.00057 Standard Error 0.013847 0.010968 0.004444 0.011839 0.002507 Median 0.145455 0.105877 0 0 0 Mode 0.535274 -0.10139 0 0 0 Standard Deviation 0.240243 0.189972 0.079493 0.211774 0.044844 Sample Variance 0.057717 0.036089 0.006319 0.044848 0.002011 Kurtosis 0.543315 0.128455 148.0833 264.9717 161.3609 Skewness 0.011312 -0.81439 10.99353 15.69839 -10.383 Range 1.314126 0.722675 1.175556 3.846937 0.912608 Minimum -0.59603 -0.40967 0 -0.22689 -0.67047 Maximum 0.718095 0.313007 1.175556 3.62005 0.242139

The key variables in our project-level data are summarized in Table 3. Strongly correlated variables are in bold and include Q with D, Q with Dq, and D with Dq. Only Q and D are both used in the same regression, and we should expect them to be negatively correlated as observed.

TABLE 3. PROJECT-LEVEL DATA CORRELATIONS

Variables IA Q D Dq Dn SP

IA 1

Q 0.0015 1

D -0.0147 -0.8073 1

Dq -0.0153 -0.8113 0.9953 1

Dn 0.0057 -0.0005 0.0996 0.0031 1

SP -0.1124 0.038 0.0527 0.056 -0.0232 1

419 Validity Test of Data Since our theory relies on the premise that development cost changes impact firm value, primarily through their impact on order quantities, we must test our data to see that they conform to this assumption. We do this through three regressions, two of which will be discussed in this section, while the third will be discussed in the Results section. For the first test,

we will split Dt into two variables: one that has value when Qt moves in the

same time period, Dqt, and one that has value when Qt does not move in

the same time period, Dnt, as in Equation 8. Table 4 defines and shows the results of this test regression.

TABLE 4. CATEGORIZED Dt REGRESSION

Pt =β 1*IA t+β 3*Dq t+β 4*Dn t+β 5*SP t+Constant With Constant No Constant .6584133 .7914403 SP t (.000) (.000) .4269925 .675944 IA t (.004) (.000) -.1238065 -.0868289 Dq t (.024) (.158) -.26261614 -.3981645 Dn t (.641) (.530) .1432938 N/A Constant (.000) R2 .2825 .3647 Observations 300 300 Note: Parentheses contain P-Values.

Dqt is significant at the 95 percent level in our first model and significant at the 90 percent level in the second. The more interesting result is that

Dnt is completely insignificant in both models, suggesting that a change in development costs without a change in quantity ordered has no impact on firm value. In other words, without a corresponding change in quantity ordered, variations in development costs are not associated with a change in firm value.

Our second test for data validity will be to regress both Dt and Qt against Pt to

ascertain whether Dt becomes insignificant when controlled for Qt. We will use firm-level data for this regression, as project-level data result in insig- nificance for both variables. Table 5 shows the results of this regression.

420 TABLE 5. REGRESSION WITH Qt AS CONTROL VARIABLE

Pt=β1*IAt+β2*Qt+β3*Dt+β5*SPt+Constant With Constant No Constant .0006096 .0006593 SP t (.000) (.000) .2742662 .4170445 IA t (.010) (.000) -.0101116 .0363141 D t (.918) (.275) .4029882 .4397591 Q t (.003) (.002) .072123 N/A Constant (.000) R2 .4487 .4975 Observations 112 112 Note: Parentheses contain P-Values.

As we can see, Dt, is completely insignificant when controlledfor

Qt. This indicates that our data are valid. We will discuss our final test in the Results section as we need to complete our primary analysis prior to completing the discussion on our final test. The test will also indicate that our data conform to our theoretical assumptions. Results We tested all regressions herein with OLS (Ordinary Least Squares) and random effects and fixed effects models, with the firm as the panel variable and observed minimal variation from the basic regression. To simplify the reading of these results, we have only included the OLS version of each regression.

Our first OLS model (Table 2) looks at Equation 3, with the addi- tion of the SP variable (percent change in S&P 500 Index) and a constant in one of the models. Data are at the project level, with each year being a data point. Table 6 shows the results of this regression.

421 TABLE 6. EQUATION 3 REGRESSION

Pt=β1*IAt+β2*Qt+β5*SPt+Constant With Constant No Constant

SPt .6468396 .7822163 (.000) (.000)

IAt .428198 .6746299 (.004) (.000)

Qt .4995067 .4489935 (.052) (.121) Constant .1106027 N/A (.000) R2 .2786 .3280 Observations 300 300 Note: Parentheses contain P-Values.

We can see that only in the model with a constant is our primary variable of

interest, Qt, significant at the 95 percent level, though the point estimate is within 20 percent in either model. Its coefficient is positive as predicted, meaning that as the government cuts quantity ordered, the firm’s value decreases as well. We can also see that an increase in the S&P 500 and the initial contract award are correlated to an increase in firm value. With our next model, we try to tease out the relationship between a change in development costs and a change in quantity costs. Equation 4 from the

model section of this article was Qt = E1*Dt. Our model in Table 7 is simply

Dt regressed against Qt to find 1E . With the assumption that the government

adjusts Qt as a reaction to Dt, we understand E1 to be the ratio of Qt changes

for a given Dt. In other words, the elasticity of Qt with respect to Dt.

422 TABLE 7. EQUATION 4 REGRESSION

Qt=E1*Dt+ Constant With Constant No Constant

Dt -.1709295 -.1699817 (.000) (.000) Constant .0024306 N/A (.103) R2 .6516 .6487

Observations 320 320

Note: Parentheses contain P-Values.

Accepting the no constant model as our primary model because the con- stant is not significant at even the 90 percent level, we can understand the coefficient of Dt to mean that for every percent that an MDAP’s development costs increase from initial estimates, the government will order about 0.17 percent less quantity. Our 95 percent confidence interval for 1E lies between -0.1838 and -0.1562.

We also tested the Pearson correlation between the two variables and found it to be 0.8072, indicating a significant inverse relationship between them.

We can also analyze the relationship between Q and D by viewing them on a scatterplot. As we can see in Figure 1, which contains project-level data, the correlation appears negative.

FIGURE 1. PROJECT-LEVEL Q VS. D SCATTERPLOT

0.2 0.15 0.1 0.05 0 -0.25 -0.2 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 0.2 -0.05 -0.1 -0.15 -0.2

Percentage Change in Development Costs Change in Development Percentage -0.25 -0.3 Percentage Change in Quantity Ordered

Note: Omits one outlier data point from Joint Strike Fighter program.

423 Shifting our initial model from Table 6 to have Dt replace Qt, we would

expect that the coefficient for Dt would simply be Qt coefficient multiplied

by E1 whereas all other coefficients remain the same. The OLS model for Equation 5 is depicted in Table 8, and the far right column is our theory’s prediction of the model.

TABLE 8. EQUATION 5 REGRESSION

Pt=β1*IAt+β2*E1*D1+β5*SPt + Constant With Constant No Constant Predicted .6589882 .7930062 SP .6468396 t (.000) (.000) .4268835 .676214 IA .428198 t (.004) (.000) -.1251678 -.0898093 D -0.08538 t (.022) (.142) .1121303 Constant N/A .1430959 (.000) R2 .2823 .3275 N/A

Observations 300 300 300

Note: Parentheses contain P-Values.

Whereas the no constant model has a higher R2, our constant model has all variables significant at the 95 percent level. For this reason, and because our model from Table 2 used the constant model as its primary one, we will use the constant model. In the far right, we can see our theoretical predictions for the OLS model with a constant and that all values are close to actual esti- mations (well within the 95 percent confidence interval). Our empirically

estimated coefficient for Dt is -0.125 with a 95 percent confidence interval of -0.232 to -0.018. This places our theoretical prediction well within the limits of our actual estimations and lends significant credence to the valid- ity of both our data and theory.

All regressions including a constant were tested for heteroscedasticity using the Breusch-Pagan/Cook-Weisberg test. We were not able to reject the null hypothesis of homoscedasticity for any regression. We also observed the residual plot for every regression and found no clear evidence of any specifi- cation errors. Figure 2 shows the residual plot for equation 5. The plot is not uniformly distributed, but there is no clearly identifiable pattern suggesting omitted variable bias or another specification error. Various specifications all yield qualitatively similar plots.

424 FIGURE 2. EQUATION 5 RESIDUAL PLOT

.5

0 Residuals

-.5

-.2 0 .2 .4 .6 Fitted Values

Conclusions Our theory, that even in a cost-plus contract, defense contractors’ firm value fluctuates indirectly with development cost changes through the government’s quantity cutting response, presents a clear framework for building effective incentives to mitigate potential cost overruns. Our data show that as development costs rise from initial estimates, the quantity ordered by the government decreases. Because our theory and model show that a decrease in quantity ordered leads to a lowered firm value, develop- ment cost overruns that lead to less quantity ordered should have a similar effect. The only difference in a dollar of cost overrun and a dollar cut from the final order is the ratio between the two. If the government cuts 25 cents of the final order for every dollar of development cost overruns, the harm to the firm from development cost overruns will not be as strong as a 1:1 ratio. The government is the decider of this elasticity and can therefore determine how great the disincentive is for a firm to allow development costs to climb. It is hard to imagine the government determining at the outset of a contract

425 how much quantity they will cut based on development costs, but by estab-

lishing a reputational E1, the government can effectively achieve the same

objective. E1 will simply not be as flexible from contract to contract. Our models all support the theory as we would expect. Changes in quantity ordered are positively correlated and changes in development costs are negatively correlated with changes in firm value. More convincingly, the magnitudes of these correlations are roughly the same as the ratio that the government chooses. Firm value increases about 0.51 percent per percent- age increase in end product purchases. Firm value decreases about 0.12 percent for every percentage increase in development costs. If our theory proved exactly correct, given our estimated ratio of quantity cost changes to development cost changes (~ -.17), firm value should decrease approximately 0.9 percent for every percentage increase in development costs. This value is only 25 percent away from our point estimate and is well within reason- able confidence intervals. Further, when we look at changes in development costs that occurred concurrently to changes in quantity ordered versus those that did not, our theory is further supported. Development costs with no quantity changes have no effect on firm value while those with quantity changes associated do have an effect.

If we take our theory and solve for the ratio of changes in quantity ordered -1/i 1/i to changes in development costs E1 = [D( )]*[ -1+(1/(β1*IA+1)) ]/β2, the government can determine the optimal ratio to incentivize firms as desired. If the government wants the end cost to be below a certain amount and the firm can control costs, it must create incentives such that the firm will lose value on the project if it exceeds that amount. For a $70 billion firm bidding $20 billion on a project that will last 5 years, we might create a graphic as in Figure 3.

426 FIGURE 3. DEVELOPMENT COSTS VS. STRENGTH OF FIRM PENALTIES

$60

$50

$40

$30

$20 Development Costs (Billions) Costs Development $10

$0 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0

E1

If the government does not want total cost to exceed $40 billion, it should set the ratio of quantity ordered cut for a development cost overrun at approximately -0.19. Given our estimated values and the firm’s aversion to losing value, the firm will allow development costs to only grow an accept- able amount.

We create Figure 1 by calculating E1 for all reasonable values of D1.

We can then calculate total cost for each value of E1 by adding the devel- opment cost effects and quantity-ordered effects to the initial bid. The government could also use Figure 3 to better predict final costs of

research and development. All it needs to know is E1, and it can then ascer- tain a firm’s incentives to control costs.

If the government alternatively believes that the firm knows its costs, but has no control over them, it can seek to incentivize a realistic bid.

The harsher the incentives (lower E1), the more closely the bid will reflect firm expectation of cost. If we assume a $70 billion firm that has known costs of $20 billion for development for a 5-year project, we can build Figure 4.

427 FIGURE 4. BID ACCURACY VS. STRENGTH OF FIRM PENALTIES

E1 vs Bid 120%

100%

80%

60%

40% Bid (% of Total Final Cost) Bid (% of Total 20%

0% -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0

E1

As we can see, the harsher the incentive, the closer the firm’s minimum bid gets to its actual estimate. We can rearrange our chart to give us the desired percentage of the total estimate that the government can reasonably expect all bids to at least reach.

We create Figure 4 by calculating E1 for every reasonable D1, given the firm’s

expected development costs. We then infer the bid from D1. This framework has several important implications for policy makers. If the government seeks to control costs or at least obtain an accurate estimation of the firm’s expectations of cost, it can use Equation 7 and our estimated coefficients to design an optimal contract. While this article looks at the ratio of quantity cuts to development cost overruns, we could easily calculate the profit lost from the quantity cuts and determine a more straightforward cost-sharing ratio with the same coefficients. A dollar of profit lost from quantity cuts should not impact the firm differently than a dollar of profit lost from a cost-sharing scheme.

The government might be encouraged to cancel projects at a lower threshold than is current policy. It might also be encouraged to establish firmer top- line budgets for projects, from development to production. The firm would then understand that as more funds were used in development, a smaller

428 share of the funds would go toward production. When a higher deterrent to cost overruns is established through a demonstrated willingness to cut quantity ordered, we should see more reasonable bidding, less cost overruns, and lower total cost of future projects.

Development costs with no quantity changes have no effect on firm value while those with quantity changes associated do have an effect.

With an E1 established by the government, the firm should bid in a predict- able manner, given its own expectations of the final cost of development. The government can then use the bid it receives to estimate the firm’s true cost expectations. For instance, if we know that a firm should bid 50 per-

cent of its expected cost for an established E1, then the government should budget twice what is called for in the contract bid. This insight could allow the government to more accurately forecast expenses and improve contract stability. This stability could lead to lower costs to the taxpayer.

429 References Cummins, J. M. (1977). Incentive contracting for national defense: A problem of optimal risk sharing. The Bell Journal of Economics, 8(1), 168–185. Goel, R. K. (1995). Choosing the sharing rate for incentive contracts. The American Economist, 39(2), 68–72. Hiller, J. R., & Tollison, R. D. Incentive versus cost-plus contracts in defense procurement. The Journal of Industrial Economics, 26(3), 239–248. Holthausen, R., & Leftwich, R. (1986). The effect of bond rating changes on common stock prices. Journal of Financial Economics, 17(1), 57–89. Hough, P. G. (1992). Pitfalls in calculating cost growth from Selected Acquisition Reports (Report No. N-3136-AF). Santa Monica, CA: RAND. Weitzman, M. L. (1980). Efficient incentive contracts. The Quarterly Journal of Economics, 94(4), 719–730.

430 Author Biography

LTJG Sean Lavelle, USN, is a naval flight officer serving with VP-26 on the P-8A Poseidon in Jacksonville, Florida. He holds a BS in Economics from the U.S. Naval Academy and a Master’s in Finance from Johns Hopkins University.

(E-mail address: : [email protected])

431 INFORMING POLICY through Quantification of the INTELLECTUAL PROPERTY LOCK-IN Associated with DOD ACQUISITION

Maj Christopher Berardi, USAF, Bruce Cameron, and Ed Crawley

This article highlights a general lack of quantitative understanding surrounding trends in intellectual property lock-in associated with Depart- ment of Defense (DoD) acquisition. Important questions like, “How much does lacking rights in intellectual property cost DoD each year in lost competition?” remain unanswered. The analysis herein seeks to interpret DoD contracting trends in Federal Acquisition Regulation (FAR) 6.302- 1(b)(2) exceptions to full and open competition as an indicator variable for intellectual property lock-in. By using data publicly available from USASpending.gov, this article revealed that approximately $6 billion in contracts used FAR 6.302-1(b)(2) exceptions between FY08–FY15. Further analysis identified trends in the decreasing use of FAR 6.302-1(b)(2) exceptions, but no material trend in the amount of money obligated using exceptions. A key finding was prevalent use of FAR 6.302-1(b)(2) exceptions for service contracts as well as research and development contracts. This finding countermands conventional understanding of what is acquired using 6.302-1(b)(2) exceptions.

DOI: https://doi.org/10.22594/dau.16-767.24.03 Keywords: Intellectual Property, Data Rights, Technical Data, Competition, FAR 6.302-1(b)(2) Exceptions

 Image designed by Michael Krukowski Informing Policy through Quantification of the Intellectual Property Lock-In http://www.dau.mil

First and foremost, this article endeavors to analyze Department of Defense (DoD) acquisition data to capture metrics on either the increase or decrease in use of contracts awarded using Federal Acquisition Regulation (FAR) 6.302-1(b)(2) exceptions to other than full and open competition. Such contracts are used herein as an indicator variable for intellectual property lock-in. Intellectual property lock-in is defined as any contract in which the lack of rights to intellectual property is the basis for other than full and open competition. The choice of using FAR 6.302-1(b)(2) as an indicator variable comes with limitations, which are discussed at length. However, given the data publicly available, it remains the best available vari- able. The methods utilized herein look for broad changes in DoD competition patterns with a specific focus on contracts awarded using noncompetitive procedures due to FAR 6.302-1(b)(2) exceptions. These metrics and resulting data are instrumental to inform future legislative and policy decisions.

Research Questions None of the previous literature, with the exception of Berardi, Cameron, Sturtevant, Baldwin, and Crawley (2016) as well as Liedke and Simonis (2014), endeavored to either quan- tify the magnitude of intellectual property impact on DoD acquisition or assert quantitatively whether there were any distinguishable trends. These avenues of investigation are imperative because policy mak- ers must understand the magnitude of the problem before making any cost-benefit decision about adopting or directing changes. Additionally, it is possible that recent Better Buying Power initiatives (discussed in greater detail in the next section) are having a posi- tive impact on the intellectual property trends within DoD acquisition. To address these fundamental questions, this analysis turns to the Federal Procurement Data System (FPDS), which has been used by previous authors to successfully analyze trends in DoD acquisition (Hunter et al., 2016; Hunter, Sanders, McCormick, Ellman, & Riley, 2015; Liebman & Mahoney, 2013). Although the FPDS database has limitations, it remains a sought-after data source because it contains up to 225 variables for each contract. This analysis uses two research questions to guide the investigation:

434 Defense ARJ, July 2017, Vol. 24 No. 3 : 432-467 July 2017

Research Question 1 What is the magnitude of FAR 6.302-1(b)(2) exceptions on DoD acquisition?

Research Question 2 What trends, if any, are there in FAR 6.302-1(b)(2) exceptions on DoD acquisition?

Research Question 2(a) What trends exist in the frequency of FAR 6.302-1(b) (2) J&As?

Research Question 2(b) What trends exist in the products or services acquired using FAR 6.302-1(b)(2) J&As?

Background: Policy and Statute Intellectual property regulations and statutes within the DoD are among the most central issues in current policy discussions. This is evi- denced by the appearance of intellectual property initiatives in all three versions of Better Buying Power (BBP) (Carter, 2010; Kendall, 2013, 2015). The most recent iteration, BBP 3.0, outlines three strategies that confront the issue of intellectual property in DoD acquisition.

• The first, Remove Barriers to Commercial Technology Utilization, argues that the DoD should capture private sector innovation by using commercially available tech- nologies and products, but directs further analysis of the implications on intellectual property.

• The second strategy, Increase the Productivity of Corporate Independent Research and Development (IRAD), targets the misuse of IRAD funds by defense contractors on “de mini- mis investments primarily intended to create intellectual property” (Kendall, 2015) to secure a competitive advantage in future DoD contracts.

Defense ARJ, July 2017, Vol. 24 No. 3 : 432-467 435 • The third strategy, Use Modular Open Systems Architecture to Stimulate Innovation, argues that the DoD must control relevant interfaces to ensure competitors with superior products are not occluded from competition due to intellec- tual property-restricted interfaces.

Additionally, in five1 of the last nine National Defense Authorization Acts (NDAA), Congress addressed rights to intellectual property in DoD acquisi- tion. Most recently, in the NDAA 2016, Congress directed the establishment of a Government-Industry Advisory Panel on rights in technical data for the purpose of “ensuring that such statutory and regulatory requirements are best structured to serve the interests of the taxpayers and the national defense,” which suggests further changes are imminent. However, with all the attention on statute and policy, no research is available that analyzes trends in DoD acquisition as a result of changes to intellectual property pol- icy. Basic questions such as, “How much does lacking rights in intellectual property cost the DoD each year in lost competition?” remain unanswered. The closest statistics are the annual measurements of competition levels in DoD acquisition (DoD, 2015; Hunter et al., 2016).

The 2015 version of the Performance of the Defense Acquisition System, released annually by the Office of the Under Secretary of Defense for Acquisition, Technology and Logistics (USD AT&L), argues that competi- tion is starting to increase. To substantiate this statement, the report uses a fractional measure of contracts competitively awarded by dollar amount. The most recent measures show that 58.3 percent of Fiscal Year2 14 (FY14) contracts, by dollar amount, were competitively awarded, which is up from 57 percent in FY13 (DoD, 2015). However, this methodology is sensitive to an outlier bias, where a few large contracts awarded competitively (i.e., contracts on the order of magnitude in the $100s of millions) overshadow the many smaller contracts awarded using other than full and open compe- tition. Independent analyses have also taken exception to the claim in the USD AT&L report by concluding, “the rate of effective competition (that is, competed contracts receiving at least two offers) has been largely unchanged in recent years” (Hunter et al., 2016); and “overall DoD competition rates ... have been largely steady in recent years” (Sanders, Ellman, & Cohen, 2015). Additionally, the USD AT&L analyses make no effort to break down the type of noncompetitive contracts, which makes it difficult to determine the root causes for trends in competition or to investigate trends in specific competition types.

436 However, independent analysis is possible. Contracts awarded noncom- petitively are documented using a contracting artifact referred to as a Justification and Approval, or J&A3 (Justification & Approval, 2016), which is a document released to the public when the DoD uses a procurement strategy other than full and open competition.4 The Federal Acquisition Regulation (FAR) does not enumerate all possible uses of J&As, but it does provide guidance on application of the regulation. In doing so, it provides situations in which the authority in FAR 6.302-1 may be appropriate. Of particular interest for this research is FAR 6.302-1(b)(2):

The existence of limited rights in data, patent rights, copy- rights, or secret processes; the control of basic raw material or similar circumstances make the supplies and services available from only one source (however, the mere existence of such rights or circumstances does not in and of itself justify the use of these authorities).

The research herein relies upon the example cited in FAR 6.302-1(b)(2) as an exception to full and open competition as the prime indicator of licensing issues with technical data, computer software, or intellectual property. This may include, but is not limited to: limited rights in technical data or other associated factors such as failure to have purchased a technical data pack- age, or failure to have taken delivery of a validated and verified technical data package. Although this FAR exception is an indicator of technical data licensing or intellectual property issues, it is a noisy indicator because the “similar circumstances” part of this FAR exception may apply even though the government has acquired unlimited rights to use the technical data or computer software. This is an acknowledged limitation of using FAR 6.302-1(b)(2) J&A as an indicator variable. However, to demonstrate that a FAR 6.302-1(b)(2) J&A is a reasonable indicator for technical data licensing rights or intellectual property issues, this article draws on Rogerson’s (1994) IRAD analysis technique to demonstrate a correlation between IRAD and FAR 6.302-1(b)(2) exceptions.

Independent Research and Development The rights to technical data or computer software are generally deter- mined by which party funded the development of a work or invention. These criteria are outlined in 10 U.S.C. §§ 2320 as well as 2321, and are promul- gated within the FAR and the Defense Federal Acquisition Regulation Supplement. As a general example, the contracting officer uses a funding test to determine licensing rights. If a contract is entirely publicly funded,

437 then the government should have unlimited rights to use technical data or computer software. Conversely, if a project is entirely privately funded, then the government should be entitled to limited rights or restricted rights. If the contract is funded using a mix of public and private funding, then the government should be entitled to government purpose rights. Table 1 is an example to demonstrate the general process, but does not cover all product types or exceptions to policy.

TABLE 1. DEFENSE RESEARCH & DEVELOPMENT EXPENDITURES BY PERFORMER AND FUNDING SOURCE (FY84–FY14: ALL FY15 CONSTANT DOLLARS)

Performer Funding Source Sum % of Total

DoD DoD $523.4B 23.6%

Universities DoD (Contract R&D) $102.4B 4.6%

Nonprofit Firms DoD (Contract R&D) $47.8B 2.2%

For-Profit Firms DoD (Contract R&D) $1,348.9B 60.9%

For-Profit Firms DoD (IRAD) $93.1B 4.2%

For-Profit Firms For-Profit Firms (IRAD) $100.6B 4.5%

Source: All values except for the two IRAD values are from National Science Foundation’s (2016) WebCASPAR database and are obligations. The two IRAD numbers are from the Defense Contract Audit Agency (2016) and are incurred costs.

Assuming program offices and contractors correctly follow the funding test for determination of rights, it is reasonable to expect that the level of pri- vate IRAD funding should correlate to the level of FAR 6.302-1(b)(2) J&As. That is, if FAR 6.302-1(b)(2) J&As are a reasonable indicator of intellec- tual property issues, an increase in private funding should correlate to an increase in FAR 6.302-1(b)(2) J&A. To explore this potential correlation, Table 1 contains the sum of research and development (R&D) funding within the DoD by funding source and performer from FY84–FY14. Most of the combinations are self-explanatory, with the exception of “DoD (IRAD).” This is the proportion of For-Profit IRAD that is an allowable cost to the DoD, typically bundled in a firm’s overhead rate; whereas, the For-Profit Firm IRAD is the proportion of IRAD that was incurred, but was not allow- able as an indirect cost to the DoD. Figures 1a and 1b offer a visualization of the changes in IRAD spending from FY84-FY15.

438 FIGURE 1A. PERCENT OF TOTAL DOD R&D SPENDING BY PERFORMER AND/OR FUNDING SOURCE

100% Public IR&D Nonprofit Private IR&D Federal 80% Universities For-Profit

60%

40%

20%

0%

1985 1990 1995 2000 2005 2010

FIGURE 1B. IR&D SPENDING TRENDS FROM FY84-FY14 (CONSTANT FY15 DOLLARS)

12% Public IR&D 10% Private IR&D

8%

FIGURE 1A. PERCENT6% OF TOTAL DOD R&D SPENDING BY PERFORMER AND/OR FUNDING SOURCE

4%

2%

0%

1985 1990 1995 2000 2005 2010

Note. Public IRAD refers to “DoD (IRAD)” in Table 1 and Private IRAD refers to “For Profit Firms (IRAD)” in Table 1. (Source: Defense Contract Audit Agency, 2016)

439

FIGURE 1B. IR&D SPENDING TRENDS FROM FY84-FY14 (CONSTANT FY15 DOLLARS

Note. Public IRAD refers to "DoD (IRAD)" in Table 1 and Private IRAD refers to "For Pro t Firms (IRAD)" in Table 1. (Source: Defense Contract Audit Agency, 2016) It is likely that IRAD funds invested in a fiscal year would not impact FAR 6.302-1(b)(2) exceptions until, at the very least, the following fiscal year. To explore this potential relationship, a lagging correlation analysis is used. These results are outlined in Figures 2a and 2b, which show lagging Pearson product-moment correlations between both sum FAR 6.302-1(b)(2) J&A obligations (Figure 2a), and mean FAR 6.302-1(b)(2) J&A obligations (Figure 2b), to Private IRAD. Figures 2a and 2b illustrate which 7-year period of Private IRAD is the highest correlated with the Patent or Data Rights (PDR) J&A obligations from FY08–FY14. Clearly, this analysis indi- cates little correlation between the sum FAR 6.302-1(b)(2) J&A obligations and Private IRAD; however, a relatively strong positive correlation exists between mean annual FAR 6.302-1(b)(2) J&A obligations and Private IRAD. Although, this analysis does substantiate a correlation between private funding and FAR 6.302-1(b)(2) J&As, it is critical to point out that this does not suggest causality between the two variables—merely that there is a connection between these two variables; and while FAR 6.302-1(b)(2) contains provisions other than intellectual property, it is still a reasonable, albeit noisy indicator of intellectual property issues.

FIGURE 2A. LAGGING PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENTS

1

0 Correlation Coecient Correlation

–1 0 2 4 6 8 Lag (years)

Note. Sum FAR 6.3021(b)(2) J&A obligations’ correlation with Private IRAD at lags 0 < n < 10.

440

FIGURE 2A. LAGGING PEARSON PRODUCT MOMENT CORRELATION COEFFICIENTS

Note. Sum FAR 6.3021(b)(2) J&A obligations' correlation with Private IRAD at lags 0 < n < 10. FIGURE 2B. LAGGING PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENTS 1

0 Correlation Coecient Correlation

–1 0 2 4 6 8 Lag (years)

Note. Mean FAR 6.302-1(b)(2) J&A obligations correlation with Private IRAD at lags 0 < n < 10.

Method This article uses the Cross-Industry Standard Model for Data Mining (CRISP-DM) model to investigate the impact of FAR 6.302-1(b)(2) excep- tions in DoD acquisition (Shearer, 2000). Although designed primarily for data mining and machine learning applications, this model offers a struc- tured approach for the analysis of large data sets. The CRISP-DM model is broken down into six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. However, to keep the discussion in this article parsimonious, the last two phases are omitted.

Phase One: Business Understanding FIGURE 2B.The LAGGING data available PEARSON in FPDS PRODUCT-MOMENT represent all prime CORRELATION unclassified federal COEFFICIENTS gov- ernment contract awards above the micropurchase threshold. Consequently, Note. Meanthe data FAR in 6.302-1(b)(2) FPDS come with J&A theobligations' following limitations:correlation with Private IRAD at lags 0 < n < 10. 1. FPDS includes only data on prime contract awards. A sep- arate database (i.e., Federal Subaward Reporting System) tracks subcontract awards, but has historically been incom- plete (Moore, Grammich, & Mele, 2015).

441 2. FPDS reporting regulations require the disclosure of only unclassified contracts. The magnitude of spending this regulation omits from the overall FPDS numbers is unknown; some estimate it at as high as 10 percent (Center for Strategic & International Studies, 2017).

3. FPDS reporting regulations require disclosure of contracts above the micropurchase threshold, which is defined in FAR subpart 2.1 as $3,500 (although at the time of writing, FPDS still uses the past threshold of $3,000). Contracts below this amount are not captured in FPDS.

As with any publicly available database, any analysis is wholly dependent on the quality of the underlying data. Previous Government Accountability Office (GAO) studies commented on the inaccuracies and problems within FPDS (GAO, 2010), but those reports are over 10 years old and reflect, in part, problems with the migration of government acquisition data into the FPDS system (Center for Strategic & International Studies [CSIS], 2017; General Accounting Office, 2003). However, despite the limitations noted above, FPDS is one of the few publicly available exhaustive sources of data on government acquisition and is suitable for analyses aimed at identifying trends or making order-of-magnitude comparisons.

Phase Two: Data Understanding This phase of the CRISP-DM process begins with the collection of data and then transitions to identifying data quality problems. The goal of this phase is to build an understanding of the limitations in the data as well as to build an understanding of quality issues that may need to be remedied before the application of analytic models. This process concludes with a verification of data quality.

Two primary avenues allow users to access data from FPDS. The first is to download XML archives directly from the FPDS website by fiscal year. The second method is USASpending.gov, which is designed to aggregate federal government spending data and display it to taxpayers. USASpending.gov was ultimately selected and used for all the subsequent analyses herein.5 The data in USASpending.gov are available from FY00 through the current fiscal year. However, an inspection of fiscal years prior to 2008 revealed many missing data points and subsequently a lack of the necessary fidelity for inclusion in these analyses. In addition, the most recent fiscal year (FY16 at the time of writing) is excluded from this analysis because the DoD data in USASpending.gov are intentionally delayed by 90 days for national secu- rity reasons; and an additional 30- to 60-day delay is incurred to update the

442 data in FPDS after the initial 90 days, amounting to a 150-day lag in current fiscal year data. Subsequently, the analyses that follow will evaluate only fiscal years 2008–2015.

Across 8 fiscal years, over 12 million contract actions are executed within the DoD. For purposes of this analysis, all zero-dollar contract obligations (i.e., a contract action that neither adds nor subtracts funds) were removed from the data set. Zero-dollar obligations are used oftentimes to make administrative changes or no-cost changes to the contract; consequently, leaving them in the data set would skew later analyses that look at ratios of contract awards. This removal has no material impact on the overall dollars obligated, but does adjust the count of contracts across all fiscal years under study. After removing these contract actions, the remaining data set contains over 11 million contract actions totaling approximately $2.7 trillion (Table 2).

TABLE 2. DESCRIPTION OF DATA

Service Sum Count

DEPT OF THE ARMY $908,943,882,349 2,128,614

DEPT OF THE NAVY $741,787,602,050 1,797,791

DEPT OF THE AIR FORCE $496,212,601,573 836,929

OTHER DoD $311,293,701,289 562,164

DEFENSE LOGISTICS AGENCY $284,921,110,343 6,036,863

TOTAL $2,743,158,897,605 11,362,361

Phase Three: Data Preparation Phase Three involves selecting, cleaning, constructing, integrating, and formatting data into the final data set before application of selected model- ing tools. All figures and tables displayed before this point are representative of the organic data with no data preparation applied.

Data selection. From the 225 available variables, only a small sub-sample is required to address the research questions.6 In total, 11 vari- ables were selected (Table 3), including data descriptions quoted directly from the USASpending.gov data dictionary (USASpending.gov, 2015). One could make an argument for including additional fields of data in the analyses, but computational memory limitations require a parsimonious set of variables for a data set in the tens of millions order of magnitude.

443 TABLE 3. DESCRIPTION OF DATA FIELDS

Field Name Description

descriptionofcontractreq A brief description of the goods or services bought (for an award) or that are available.

dollarsobligated The net dollar amount that is obligated or de- obligated by this transaction. If the net is a de- obligation, the amount will be negative.

fiscal year The federal government fiscal year, determined by the “Signed Date” provided by FPDS.

fundedbyforeignentity Indicates that a foreign government, international organization, or foreign military organization bears some of the cost.

maj_agency_cat The combination of two leftmost characters of the contracting agency code representing major federal organizations and departments, and their description.

modnumber An identifier that uniquely identifies a modification for a contract, agreement, or order.

principalnaicscode The principal North American Industry Classification System, or NAICS code that indicates the industry in which the contractor does business.

productorservicecode The code that best identifies the product or service procured. If more than one code applies, then the code that represents most of the ultimate contract value is reported.

psc cat The major category that the Federal Procurement Data System Product or Service Code for the record falls within.

reasonnotcompeted A code for the reason the contract was not competed – i.e., solicitation procedures other than full and open competition pursuant to FAR 6.3.

signeddate The date that a mutually binding agreement was reached. The date signed by the Contracting Officer or the Contractor, whichever is later.

Data cleaning. In 2015, Defense Logistics Agency (DLA) contract actions more than quadrupled over the previous fiscal year, while the total dollars obligated slightly decreased. This aberration is explained by a change in the way DLA accounts for contracts under the micropurchase threshold. To control for this change, the data for all DLA obligations in FY15 were cleaned to only those contract obligations that were over or under the micropurchase threshold (i.e., x < -$3,000 as well as x > $3,000). This cleaning step brings the number of DLA obligations in FY15 from 2,396,568 to 589,474, and is comparable to previous fiscal years.

444 Although this precipitous drop in obligations—a -75.4 percent change—may seem like a large impact, it ultimately reduced the total DLA obligations in 2015 by $967,630,808, or only -3.15 percent.

Finally, to enable comparisons across multiple fiscal years thedollarsob - ligated field was converted from nominal dollars to constant FY15 dollars using the annual average All-items Consumer Price Index for the respective fiscal years under study. All expressions of dollars obligated or total obliga- tions from this point forward are constant FY15 dollars unless otherwise noted in the analyses.

Data construction. From the selected 11 variables, an additional 2 were constructed. Psc_group and psc_simple data fields were constructed to mirror the CSIS taxonomy of Product Service Codes7 (PSC). Table 4 depicts 10 product categories and 5 service categories into which hundreds of PSC codes are aggregated, plus a category for R&D8 (Ellman & Bell, 2015).

The psc_group was created by mapping the productorservicecode field to the second level of the CSIS PSC taxonomy (i.e., Aircraft, Ships, Fuels, PAMS, etc.) and the psc_simple field was created by mapping theproductorservice - code field to the first level of the taxonomy (i.e., Product, Service, or R&D). These additional fields enable categorical analyses that would not be pos- sible using only the data available in productorservicecode.

445 TABLE 4. PRODUCT SERVICE CODE CATEGORIES

Product Categories

1. Aircraft 6. Electronics & Communications (E&C)

2. Clothing & Subsistence (C&S) 7. Engines & Power Plants (E&PP)

3. Fuels 8. Ground Vehicles

4. Launchers & Munitions (L&M) 9. Missiles & Space

5 Ships 10. Other Products

Service Categories

1. Equipment-related Services (ERS)

2. Facilities-related Services & Construction (FRS&C)

3. Information and Communications Technology (ICT) services

4. Medical (MED) services

5. Professional, Administrative, and Management Support (PAMS)

Research & Development (R&D)

Phase Four: Modeling As discussed in earlier phases, characteristics found in FPDS data and subsequently USASpending.gov data may limit analyses aimed at identifi- cation of trends and order-of-magnitude comparisons. Consequently, Phase Four consists primarily of complex Boolean logic queries, visual modeling techniques, and regression models to answer the research questions set out at the beginning of this analysis.

Analysis of research question 1. This inquiry is most easily addressed by analyzing the reasonnotcompeted data field. In the most recent version of the FPDS data dictionary, the field “Other Than Full and Open Competition” is used as the designator for solicitation procedures other than full and open competition pursuant to FAR 6.3 or FAR 13 (USASpending.gov reti- tled this field as “Reason Not Competed”). This field provides 17 different codes, which correspond to FAR 6.301 or FAR 13. FPDS and subsequently USASpending.gov use the code “PDR” or the short description “Patent or Data Rights” to report a contract action justified pursuant to FAR 6.302-1(b) (2). Unfortunately, no information is provided on the specific justifica- tion—only the reference to FAR 6.302-1(b)(2). This makes segregating out the different exceptions in FAR 6.302-1(b)(2), for other than full and open

446 competition, impossible. Although reasonnotcompeted denotes the FAR exception used in the J&A, it is important to note that the unit of analysis here is the contract action authorized by the J&A, not the J&A itself. To ana- lyze the data, each contract action was organized into a group based on its reasonnotcompeted category, with null values omitted. Within the resulting groups, the dollarsobligated field is used to quantify the magnitude of the effect. The results are captured in Table 5, in which the Patent/Data Rights group is highlighted. This category denotes FAR 6.302-1(b)(2) exceptions.

This analysis suggests that FAR 6.302-1(b)(2) exceptions occluded over $6 billion in DoD contracts9 from full and open competition between FY08-FY15. However, this number may be misleading because of the way FPDS collects data. For each contract, an administrator is only permitted to select one field for reasonnotcompeted from the list of values outlined in Table 5. In practice, however, a J&A may contain more than one exclusion category, forcing the adminis- trator to make a judgment call on which category is most accurate. This nuance may explain the large number of “ONLY ONE SOURCE— OTHER” contract actions. Typically, a J&A for a con- tract would be “ONLY ONE SOURCE—OTHER” because of “UTILITIES” or “PATENTS/ DATA RIGHTS.” This is because “PATENTS/DATA RIGHTS,” out- lined in FAR 6.302-1(b)(2), is a subsection of “ONLY ONE SOURCE—OTHER,” as outlined in FAR 6.302-1. Therefore, when admin- istrators are faced with selecting a single category that most accurately represents why an action was not sub- ject to full and open competition, they may select “ONLY ONE SOURCE—OTHER.” As a result, the aggregation of FAR 6.302-1(b) (2) exceptions in Table 5 are most likely only a lower bound of the true impact on DoD acquisition.

447 TABLE 5. COUNT, SUM, AND MEAN OF dollarsobligated GROUPED BY reasonnotcompeted, GROUPS SORTED BY SUM

Reason Not Competed Count Sum Mean

ONLY ONE SOURCE—OTHER 807,717 $600,738,669,601 $743,749

UNIQUE SOURCE 200,842 $209,486,641,145 $1,043,042

AUTHORIZED BY STATUTE 397,234 $99,108,891,481 $249,498

FOLLOW-ON CONTRACT 98,742 $74,653,183,136 $756,043

MOBILIZATION, ESSENTIAL R&D 34,166 $62,704,044,715 $1,835,276

INTERNATIONAL AGREEMENT 17,478 $61,851,393,409 $3,538,814

NATIONAL SECURITY 18,904 $31,207,522,647 $1,650,842

URGENCY 56,299 $30,980,330,505 $550,282

AUTHORIZED RESALE 39,693 $26,421,588,758 $665,649

SAP NON-COMPETITION 270,884 $8,763,887,633 $32,353

PATENT/DATA RIGHTS 13,150 $6,338,305,329 $482,000

UTILITIES 11,645 $4,830,634,587 $414,825

BRAND NAME DESCRIPTION 8,254 $1,824,756,424 $221,075

PUBLIC INTEREST 3,910 $1,821,008,661 $465,731

STANDARDIZATION 3691 $197,856,965 $53,605

UNSOLICITED RESEARCH 294 $144,568,046 $491,728

8AN 79 $49,346,244 $624,636

Note. SAP = Simplified Acquisition Threshold.

Analysis of research question 2a. The second question sought to identify trends, if any, in the use of FAR 6.302-1(b)(2) exceptions in DoD acquisition. To address this question, the data set was converted into a time

448 series using the signeddate as a date-time index. With the data set in this format, parallel lines of inquiry were used to identify two trends: ratios of dollars obligated and ratios of contracts awarded. For example, to determine ratio of dollars obligated, the data set was down-sampled into the 96 months that span fiscal year 2008 to 2015; Equation 1 sums the dollars obligated for each month from both the subset (α) and full set (β). n ∑ 1 α n α : β = month month n ∑ β n where: 1 (1) α ⊂ β, β≠0, n = days in a given month

The result is two distinct time series of data. The first is of dollars obligated by month. The second is a count of contract awards, using only initial con- tract awards and not subsequent modifications, by month. Leaving contract modification data in the dataset would overstate the number of sole source contract awards supported by a FAR 6.302-1(b)(2) exception, since a J&A is typically not issued for each modificaiton. Using this time series trans- formation process, six different ratios were calculated:

1. The ratio of all contracts using a J&A to total number of contracts. Measured in terms of dollars obligated by contract and count of contracts (referred to henceforth as J&A to Total Awards).

2. The ratio of only contracts using FAR 6.302-1(b)(2) J&A to total number of awards. Measured in terms of dollars obligated by contract and count of contracts (referred to using the FPDS designator of “PDR” or Patent or Data Rights, which is henceforth referred to as PDR to Total Awards).

3. The ratio of contracts using a FAR 6.302-1(b)(2) J&A to contracts using any J&A. Measured in terms of dollars obligated by contract and count of contracts (referred to using the FPDS designator of “PDR” or Patent or Data Rights, which is henceforth referred to as PDR to J&A).

These two lines of inquiry illustrate whether the frequency of contracts awarded using a FAR 6.302-1(b)(2) exception is increasing or decreasing over the fiscal years under study. The results of this analysis are found in

449 Figure 3a and Figure 3b, which organize each of the ratios into a row with a column for ratio of obligation amounts (left column, denoted with a $ in the legend) and ratio of contract actions (right column). The red line on each graph in Figure 3a shows a 12-month exponentially weighted moving average (EWMA) for each time series ratio, which removes some of the irregular variations. Additionally, to assist in quantifying the trend of data displayed in Figure 3a, Figure 3b illustrates an ordinary least squares (OLS) linear regression of the data. To compute this trend line, the dates on the x-axis were converted into numerical months ({Jan08,Feb08…Dec15}={1,2,…96}). Consequently, the x-axis in Figure 3b now corresponds to the number of months elapsed since the start of FY08. All other formatting and labeling factors remain constant between Figure 3a and Figure 3b.

450

2015

PDR to Awds PDR to EWMA (12-Month) JnA PDR to EWMA (12-Month) 2014 JnA to Awds JnA to EWMA (12-Month)

2013

2012

2011

2010

2009 2008 12% 18% 14% 16% 22% 24% 26% 20% 0.1% 1.2%

1.0% 0.2% 0.8% 0.2% 0.6% 0.4% 0.0% 0.0%

0.15% 0.25% 0.05%

2015

2014

JnA to Awds ($) Awds JnA to EWMA (12-Month) 2013

2012

2011

2010

PDR to Awds ($) Awds PDR to EWMA (12-Month) 2009 PDR to JnA ($) PDR to EWMA (12-Month) 2008

70% 30%

20% 50% 80% 40% 60% 90% 1.2% 1.0% 1.5% 0.8% 0.2% 0.6% 0.4% FIGURE 3A. OF RATIOS OBLIGATIONS (LEFT) AND OF RATIO CONTRACT (RIGHT) AWARDS 0.0%

2.5% 3.0% 0.5% 2.0% 0.0% JnA to Total Awds Total to JnA Awds Total to PDR JnA Total to PDR

451 84 96 +0.223 =0.332 =0.576 =0.662 2 2 2 r r r +0.00875 +0.00184 x 72 x –5 –5 1.5 e 60 0.000575 x 6.53 e – – – = = = y y y 48 36 Time (months) 24 0 12 12% 18% 14% 16% 22% 24% 26% 20% 0.1% 1.2% 1.0% 0.2% 0.8% 0.2% 0.6% 0.4% 0.0% 0.0% 0.15% 0.25% 0.05% 84 96 =0.031 =0.041 =0.001 +0.426 2 2 2 r x r r +0.00401 +0.00161 –5 x x 72 –5 –5 60 =8.17 e y =2.54 e =1.11 e y y 48 36 Time (months) 24 FIGURE 3B. LINEAR TREND LINES FOR EACH IN RATIO FIGURE 3A 0 12

70% 30%

20% 50% 80% 40% 60% 90% 1.2% 1.0% 1.5% 0.8% 0.2% 0.6% 0.4% 0.0% 2.5% 3.0% 0.5% 2.0%

0.0%

JnA to Total Awds Total to JnA Awds Total to PDR JnA Total to PDR

452 Beginning with an analysis of Figure 4a, nearly all the ratios calculated from obligations indicate almost no trend in the data, with a possible exception of PDR to J&A. This suggests that, although the data are noisy, there is no meaningful change in the number of dollars obligated in any of the ratios over the past 8 fiscal years. Conversely, the ratios calculated from contract awards show a relative decline through the start of FY15, with a slight uptick in FY15. In summary, although the ratio of contract actions using J&As and FAR 6.302-1(b)(2) J&As declined over the past 8 fiscal years, a meaningful increase or decrease in the ratio of dollars obligated using FAR 6.302-1(b) (2) J&A was not noted.

FIGURE 4A. BOXPLOT OF INTERYEAR FAR 6.302-1(B)(2) OBLIGATION TRENDS

109

108

107

106

Obligated Amount ($) Amount Obligated 105

104

103 2008 2009 2010 2011 2012 2013 2014 2015

Transitioning to Figure 4b, all the OLS regressions for the ratios of dollars obligated are slightly positive. However, PDR to J&A and J&A to Total Awards has a confidence interval that spans zero (95 percent confidence interval illustrated in Figure 4b by the shaded area on either side of the regression line). This analysis implies it is possible, although not necessarily probable, that theFIGURE trend 4A. BOXPLOT could OF be INTERYEAR positive FAR 6.302-1(B)(2) or negative. OBLIGATION Conversely, TRENDS the regres- sion results computed using the ratio of contract actions all have negative Note. Red square denotes mean. coefficients with confidence intervals that do not span zero. As a result, one can reasonably conclude that the frequency of both J&A relative to total awards, and PDR relative to both total J&As and total awards, declined from FY08 to FY15. This analysis suggests that some measures of trends in FAR 6.302-1(b)(2) use in DoD acquisition indicate a downward trend, or improvement; however, the results are confounded by other measures, which show no meaningful trend.

453 FIGURE 4B. LINE PLOT OF INTERYEAR FAR 6.302-1(B)(2) OBLIGATION TRENDS

$1.1B 1,800 Frequency (Secondary y) 1,600 $1.0B 1,400

$0.9B 1,200

$0.8B 1,000 800 $0.7B Total Obligations 600 Average Obligation $0.6B 400 2008 2009 2010 2011 2012 2013 2014 2015

To investigate further, Figure 5a and Figure 5b look at the interyear pat- terns in FAR 6.302-1(b)(2) J&A use. Specifically, Figure 5a shows the change in distribution of obligations by contracts using FAR 6.302-1(b) (2) J&A between FY08-FY15; with few exceptions, both the median and mean annual obligation amount increased year-over-year. Note the log scale in Figure 5a and the number of outliers in the boxplot with high order-of-magnitude obligations. Compare these results with those in Figure 5b, where the number of contracts is decreasing, but the annual obligation amounts trend erratically. These findings are important because they show that, although trends in FAR 6.302-1(b)(2) J&A use may be decreasing, FAR 6.302-1(b)(2) J&A contract obligations measured by either median or mean value are increasing.

454 FIGURE 5A. COUNT OF FAR 6.302-1(B)(2) J&A BY FISCAL YEAR AND PSC TAXONOMY LEVEL 1

1400 Products 1200 R&D Services 1000

800

600

400

200

0 2008 2009 2010 2011 2012 2013 2014 2015

FIGURE 5B. MEAN OBLIGATION AMOUNT OF FAR 6.302-1(B)(2) J&A BY FISCAL YEAR AND PSC TAXONOMY LEVEL 1

$2.0M Products R&D Services

$1.5M

$1.0M

$0.5M

$0.0M 2008 2009 2010 2011 2012 2013 2014 2015

455 Analysis of research question 2b. In addition to time series trends, it is important to look at “what” FAR 6.302-1(b)(2) J&As are buying to identify purchasing trends. To address this question, analyses focus on four of the data fields;psc_simple, psc_groups, principalnaicscode, and productorser- vicecode. Using these fields, it is possible to categorize each contract action as either a product or a service and attain a relative sense of the domain. This is done by employing a taxonomy that aggregates individual PSCs into broader baskets of goods. Table 6 outlines the sum obligations for the CSIS taxonomy at level 1 and level 2, as well as top 10 PSC and North American Industry Classification System (NAICS) codes for FAR 6.302-1(b)(2) J&As.

Analysis of the PSC codes reveals that FAR 6.302-1(b)(2) J&As are just as commonly used for services as they are for products. This is evidenced not only by the PSC taxonomy level 1 results, but also by all the PSC codes in Table 6 that begin with a letter. This same phenomenon is evident in the top 10 NAICS codes as well, where the 54-series category indicates a service. The two largest categories from level 2 of the PSC taxonomy are Professional, Administrative, and Management Support (PAMS), which is a service category; and Aircraft, which is a product category. Some of the service codes, “J016” for example, are logical given they denote the maintenance and repair of aircraft components. In this case, one could speculate the repair and manufacturing process of an aircraft component is likely protected by a manufacturer’s intel- lectual property. However, less clear

456 is what intellectual property is the limiting factor in the codes that make up PAMS, for example R425, which is a generic code for engineering and support services.

The data in USASpending.gov do not offer any insight into why a FAR 6.302- 1(b)(2) exception was used for any given contract. However, they do offer insight into product, service, and R&D trends (see Figure 5a and Figure 5b). In all except FY08 and FY09, more service than product contracts were authorized under a FAR 6.302-1(b)(2) J&A. Since FY09, the number of product FAR 6.302-1(b)(2) authorized contracts fell from a high of 1,204 per year to a low of 306 in FY15—a 75 percent drop. Although the drop seems precipitous, Figure 5b shows that the average product FAR 6.302-1(b)(2) J&A contract obligation amount increased since FY08. Consequently, one can reasonably conclude the DoD is using less product FAR 6.302-1(b)(2) J&As in contracts, but the average cost of each is rising. A similar conclusion can be reached for service FAR 6.302-1(b)(2) J&As, with the exception of the percentage decrease in frequency and increase in mean obligated dollars, which are of a lower magnitude. The exception is FAR 6.302-1(b)(2) J&A used for R&D, in which both the frequency and mean obligation amount increased over the past 8 fiscal years.

457 TABLE 6. PSC TAXONOMY LEVEL 1 AND TOP 10 PSC TAXONOMY LEVEL 2 PSC AND NAICS CODES SORTED BY SUM OBLIGATED DOLLAR AMOUNT FOR PDR J&A

PSC Taxonomy Level 1 Count Sum

Products 5,533 $3,654,163,121

Services 7,098 $2,269,110,084

R&D 519 $415,032,124 PSC Taxonomy Level 2 Count Sum PAMS 2,159 $1,593,098,494

Aircraft 1,449 $1,389,506,928

Missiles & Space 135 $693,343,385

Electronics & Communications 1,592 $632,919,208

ERS 4,359 $590,987,877

R&D 519 $415,032,124

Engines & Power Plants 452 $272,723,325

Launchers & Munitions 194 $269,714,770

Other 1,333 $167,492,024

Ground Vehicles 104 $125,907,354 PSC Code Count Sum

R414: Systems Engineering Services 672 $934,139,957

1410: Guided Missiles 62 $651,325,023

1510: Aircraft, Fixed Wing 220 $509,709,584

R425: Support -Professional Engineering 557 $339,143,920

J016: Maint/Repair/Rebuild of Equip 476 $215,031,480

1560 Airframe Structural Components 96 $199,453,727

1680: Misc. Aircraft Accessories 86 $161,457,768

1005: Guns Through 30 MM 47 $145,990,07

J016: Maint/Repair of Aircraft Components 356 $117,614,327

2510: Vehicle Cab Body Frame Structural 9 $117,345,238 NAICS Count Sum

336413: Other Aircraft Parts and Auxiliary 3,399 $1,311,566,685

336411: Aircraft Manufacturing 429 $1,084,203,683

541330: Engineering Services 1,044 $1,082,087,325

336414: Guided Missile and Space Vehi 62 $684,330,795

336412: Aircraft Engine and Engine Par 496 $226,071,643

334511: Search, Detection, Navigation 222 $208,272,040

541712: Research and Development in the 256 $203,694,628

332994: Small Arms, Ordnance, and Ord 57 $140,785,462

541512: Computer Systems Design Services 221 $139,646,028

541710: Research and Development in the 204 $133,528,895

Note. Above descriptions truncated to first 40 characters.

458 Future Research Beginning with Research Question 1, magnitude of impact, the anal- yses outlined in this article established a lower bound for the impact of FAR 6.302-1(b)(2) exceptions at approximately $6 billion for the fiscal years under study. This number does set an order of magnitude for the impact, but as discussed earlier, it falls short because of the way FPDS ingests data and because of the “similar circumstances” provision included in FAR 6.302-1(b) (2) exceptions. The only way to reach a potentially more accurate estimate of the magnitude would be to take a random sample of J&As—approximately 100–200—from all exceptions to full and open categories and retrieve the soft copy J&As from the Governmentwide Point of Entry (GPE).10 After retrieval of all soft copy J&As in the random sample, each would need to be hand-coded to identify reason(s) not competed. Once all of the random sam- ples were coded, the total number of J&As could be resampled according to the distribution of coded J&As. Although this method may generate a more accurate bound, there would be no way to validate the new bound, and the time required to hand-code each J&A in the random sample is prohibitive. Future research is needed to identify a more accurate way to categorize exceptions to full and open competition at the contract level in FPDS.

Future research is needed to identify a more accurate way to categorize exceptions to full and open competition at the contract level in FPDS.

This analysis was equally concerned with what was being purchased using FAR 6.302-1(b)(2) J&As. One of the key findings was the higher proportion of service contracts to product contracts in 6 of the 8 fiscal years examined. This finding is significant because FAR 6.302-1(b)(2) J&As are convention- ally considered a byproduct of intellectual property protections on physical design process artifacts (i.e., drawings, blueprints, technical data, etc.). Given the results, which counter this conventional understanding, it sug- gests more policy focus should be given to the numerous service contracts that are excluded from full and open competition due to FAR 6.302-1(b)(2) exceptions. Additional research into the justification for these artifacts of FAR 6.302-1(b)(2) is needed to inform future policy decisions.

459 Lastly, analyses identified an increasing trend, both in terms of frequency and mean obligations, in the use of FAR 6.302-1(b)(2) exceptions for R&D. This is also an unexpected finding since the very nature of R&D suggests investigation of a novel product. Given the nature of R&D, it is difficult to understand how a contract could be authorized by FAR 6.302-1(b)(2) exceptions. These R&D contracts could be evidence of the “similar circum- stances” section in FAR 6.302-1(b)(2), evidence of an improper use of FAR 6.302-1(b)(2), evidence of a novel use of FAR 6.302-1(b)(2), or evidence of an error in the PSC taxonomy. Regardless, more investigation and research are needed into this finding.

Limitations/Recommendations The limitations previously outlined are not only confined to magni- tude-of-impact estimates, but also to trend analyses. It is impossible to say that the trends discussed previously are, with any certainty, representative of the entire effect. These analyses must acknowledge the possibility that the trends previously discussed may not be a truly accurate representation of the actual trends in DoD acquisition. This difficulty in examining even the simplest questions in DoD acquisition is the most salient finding of this research. Trends are necessary to inform multibillion dollar DoD policy decisions, but gathering even the simplest trends results in a task mired by a complex web of policy and data interactions, confounded by limitations in publicly available databases and unnecessarily broad FAR language. Making a few changes in the way DoD contract data are collected should bring the community a step closer to much richer analyses and more gen- eralizable results:

• Allow for more than one FAR exception to full and open competition in FPDS per contract. This change to the data structure in FPDS should help more accurately define which exceptions are used in each J&A and permit more accurate analyses of trends in noncompetitive contracting types.

• Define or remove “similar circumstances” from FAR 6.302-1(b)(2). This phrasing in the FAR only adds ambi- guity to a section that predominantly concerns intellectual property and technical data. Furthermore, it is unnecessar- ily redundant given the qualifying language already in the FAR: “Use of this authority may be appropriate in situations

460 such as the following (these examples are not intended to be all-inclusive and do not constitute authority in and of themselves) (emphasis added).”

If FAR 6.301-1(b) is not meant to be all-inclusive, what is the necessity for a catchall statement in FAR 6.301-1(b)(2)? Removal of this language still maintains the purpose of FAR 6.302-1(b)(2) given the language in FAR 6.302-1(b). Conclusions This article set out to capture metrics that would yield a better under- standing of limitations in DoD acquisition due to intellectual property lock-in. In doing so, the analyses established a lower bound on the magnitude of impact at $6 billion and noted trends of decreasing FAR 6.301-1(b)(2) use, but no material trend in the amount of money obligated in contracts using a FAR 6.301-1(b)(2) exception. Furthermore, the prevalent use of FAR 6.301- 1(b)(2) exceptions for service contracts was a key finding. Service contracts using FAR 6.301-1(b)(2) exceptions, measured in terms of frequency, were more common than product contracts; however, product contracts using FAR 6.301-1(b)(2) exceptions, in terms of dollars obligated, outpaced service contracts. This finding is salient because it countermands the conventional understanding of what types of contracts are authorized using a FAR 6.301- 1(b)(2) exception to full and open competition.

461 References Berardi, C., Cameron, B., Sturtevant, D., Baldwin, C., & Crawley, E. (2016). Architecting out software intellectual property lock-in: A method to advance the efficacy of BBP. Proceedings of the 13th Annual Acquisition Research Symposium (pp. 184–201). Monterey, CA. Carter, A. B. (2010). Better buying power: Mandate for restoring affordability and productivity in defense spending. Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Center for Strategic & International Studies. (2017). Methodology: The techniques, assumptions, and limitations underlying the analysis of defense contracting trends. Retrieved from https://www.csis.org/programs/international-security- program/defense-industrial-initiatives-group/methodology Department of Defense. (2015). Performance of the Defense Acquisition System, 2015 annual report. Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Ellman, J., & Bell, J. (2015). Analysis of defense products contract trends, 1990-2014 (Vol. HQ0034-12-). Retrieved from http://csis.org/publication/analysis-defense- products-contract-trends-1990-2014 General Accounting Office. (2003). Reliability of federal procurement data (Report No. GAO-04-295R). Washington, DC: U.S. Government Printing Office. Government Accountability Office. (2010). Opportunities exist to increase competition and assess reasons when only one offer is received (Report No. GAO-10-833). Washington, DC: U.S. Government Printing Office. Hunter, A. P., McCormick, R., Ellman, J., Sanders, G., Johnson, K., & Coll, G. (2016). Defense acquisition trends, 2015. CSIS Series on Strategy, Budget, Forces, and Acquisition. Washington, DC: Center for Strategic & International Studies. Hunter, A., Sanders, G., McCormick, R., Ellman, J., & Riley, M. (2015). Measuring the success of acquisition reform by major DoD components. In Proceedings of the 12th Annual Acquisition Research Symposium (Vol. 1). Monterey, CA: Naval Postgraduate School. Justification & Approval. (2016). In Defense Acquisition University Acquipedia. Retrieved from https://dap.dau.mil/acquipedia/Pages/ArticleDetails.aspx?aid= 70a60a2f-c14b-4513-b32f-49afa3434999 Kendall, F. (2013). Implementation directive for Better Buying Power 2.0 - Achieving greater efficiency and productivity in defense spending. Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics. Kendall, F. (2015). Implementation directive for Better Buying Power 3.0 - Achieving dominant capabilities through technical excellence and innovation. Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Liebman, J. B., & Mahoney, N. (2013). Do expiring budgets lead to wasteful year-end spending? Evidence from federal procurement (NBER Working Paper No. 19481). Cambridge, MA: National Bureau of Economic Research. doi:10.3386/w19481 Liedke, E. J., & Simonis, J. D. (2014). Increasing competitive actions: A focus on technical data rights associated with non-commercial hardware items. Monterey, CA: Naval Postgraduate School. Moore, N. Y., Grammich, C. A., & Mele, J. D. (2015). Findings from existing data on the Department of Defense industrial base: Guided missile and space vehicle

462 manufacturing example. In Proceedings of the 12th Annual Acquisition Research Symposium (pp. 347–365). Monterey, CA: Naval Postgraduate School. Rogerson, W. P. (1994). Economic incentives and the defense procurement process. The Journal of Economic Perspectives, 8(4), 65–90. Retrieved from http://www. jstor.org.libproxy.mit.edu/stable/2138339 Sanders, G., Ellman, J., & Cohen, S. (2015). Competition and bidding data as an indicator of the health of competition and bidding data as an indicator of the health of the U. S. defense (Vol. HQ0034-12). Washington, DC: Center for Strategic and International Studies. Shearer, C. (2000). The CRISP-DM model: The new blueprint for data mining. Journal of Data Warehousing, 5(4), 13–22. USASpending.gov. (2015). In Data Downloads. Retrieved from https://www.usa spending.gov/DownloadCenter/Documents/USAspending.govDownloads DataDictionary.pdf

Endnotes 1 FY07 NDAA, Pub. L. No. 109-364 § 802, 120 Stat. 2083 (2007); FY10 NDAA, Pub. L. No. 111-84 § 821, 123 Stat. 2190 (2010); FY11 NDAA, Pub. L. No. 111-383 § 824, 124 Stat. 4137 (2011), FY12 NDAA, Pub. L. No. 112-81 § 815,125 Stat. 1298 (2012); and FY16 NDAA, Pub. L. No. 114-92 § 813, 129 Stat. 726 (2016).

2 The federal government operates on a fiscal year that begins on October 1 and ends September 30 of the following year.

3 Justification and Approval is a document required to justify and obtain appropriate level approvals to contract without providing for full and open competition as required by the Federal Acquisition Regulation (Justification and Approval, 2016).

4 Other than full and open competition is defined as any sole source or limited competition contract action that does not provide an opportunity for all responsible sources to submit proposals.

5 To cross-validate the quality of data, the FPDS totals in each fiscal year were compared to the totals reported on the USASpending.gov website for the DoD. The total obligations in each FPDS fiscal year are within a tolerance of 0.02 percent of the totals listed on the USASpending.gov website. The relatively trivial disparities between the data reported on USASpending.gov and those in FPDS are most likely explained by updates to FPDS between USASpending.gov versions (the data used herein is from the USASpending.gov version [July 15, 2016]).

6 For a full list of variables, see the USASpending.gov data dictionary (USASpending.gov, 2015).

7 Product Service Codes (PSC) that represent products are 4-digit codes (e.g., 7110: OFFICE FURNITURE), whereas PSCs that represent services are alpha numeric 4-digit codes: (e.g., R425: ENGINEERING AND TECHNICAL SERVICES) codes.

463 8 The full CSIS PSC classification tables (for products, services, and R&D) are available on the CSIS GitHub data repository: https://github.com/CSISdefense/ Lookup-Tables/blob/master/ProductOrServiceCodes.csv.

9 Of the $6 billion in contract actions, $126,630,720 are foreign military sales.

10 Governmentwide Point of Entry (GPE) means the single point where government business opportunities greater than $25,000, including synopses of proposed contract actions, solicitations, and associated information, can be accessed electronically by the public. The GPE is located at http://www.fedbizopps.gov. (48 CFR 2.101 Definitions)

464 Author Biographies

Maj Christopher Berardi, USAF, is a chief of staff of the Air Force Captains Prestigious PhD Fellows at the Massachusetts Institute of Technology (MIT). Prior to beginning his PhD, he served as both an acquisition and intelligence officer in the U.S. Air Force. Maj Berardi received a Master’s degree in Engineering and Management from MIT and a Bachelor of Science degree in Management from the United States Air Force Academy.

(E-mail address: [email protected])

Dr. Bruce Cameron is a lecturer in Engineering Systems at MIT and a consultant on platform strategies. His research interests include tech- nology strategy, system architecture, and the management of product platforms. Dr. Cameron received his undergraduate degree from the University of Toronto, and a Master’s degree in Technology Policy and PhD in Engineering Systems from MIT.

(E-mail address: [email protected])

465 Dr. Ed Crawley is a professor of Aeronautics and Astronautics and Engineering Systems at MIT. Professor Crawley received a Doctor of Science degree in Aerospace Structures from MIT. His early research interests centered on structural dynamics, aeroelasticity, and the development of actively controlled and intelligent structures. Recently, Dr. Crawley’s research has focused on the domain of the architecture and design of complex systems.

(E-mail address: [email protected])

466 467 The Impact of a BIG DATA Decision Support Tool on Military Logistics: MEDICAL ANALYTICS MEETS THE MISSION

Felix K. Chang, Christopher J. Dente, and CAPT Eric A. Elster, USN

Using big data and predictive analytics, more segments of the U.S. military will be able to create decision support tools that help them not only to carry out their missions more efficiently, but also to streamline their logistical requirements. Within the military’s medical community, the Surgical Crit- ical Care Initiative (SC2i) created one such tool that enables physicians to accurately assess the need for massive blood transfusions. To quantify the impact that tool could have on military logistics, SC2i developed a combat model that simulated a military campaign between NATO and Russian forces in eastern Poland and the Baltics. SC2i found that its tool would reduce NATO’s blood product consumption by 71,459 units, eliminating the need for 110 helicopter resupply missions and saving 25,740 gallons of fuel and 129,366 pounds of airlift capacity.

DOI: https://doi.org/10.22594/dau.16-769.24.03 Keywords: Predictive Analytics, Combat Modeling, Decision Science, SC2i

The Impact of a Big Data Decision Support Tool on Military Logistics http://www.dau.mil

Military logisticians have long recognized the efficiency of using deci- sion support tools to streamline logistical systems. In the 1970s, the U.S. military began using automated linear-programing software to better stock and distribute military materiel. Usage of such tools gradually expanded to include those involved in the acquisition, maintenance, and distribution of parts for large and complex military systems. Hence, logisticians and those who were responsible for such systems came to take a leading role in devel- oping many of the military’s most robust decision support tools.

Logistical Benefits from Nonlogistical Decision Support Tools If the U.S. military is to further streamline its logistics, logisticians should encourage other segments of the military to enhance their own capabilities through the greater use of big data-driven decision support tools. Such tools can increase efficiency not only in opera- tional units, but also in the logistics that support them. Any reduction in the volume and weight of sup- plies (particularly perishable ones) needed for operational units to achieve their missions imparts added resilience to existing logis- tical capacity.

Today, the cost and time needed to develop new decision support tools are steadily decreasing, especially when developed with machine-learning technology rather than human stat- isticians alone. As a result, many segments of the military, which have not historically used big data or pre- dictive analytics, can more easily do so now.

One of those segments has been the U.S. military’s medical community. Traditionally, military medical researchers exclusively focused their energies on improving the care of wounded warfighters. Now they have begun to use the growing amount of data from clinical records, laboratory tests, and electronic medical monitors

470 Defense ARJ, July 2017, Vol. 24 No. 3 : 468–487 July 2017

to create clinical decision support (CDS) tools that serve dual purposes. While the new tools still help physicians better treat their patients, what often goes unnoticed is that those tools also create substantial logistical benefits. The Surgical Critical Care Initiative (SC2i)—a leading U.S. military health research program in the development of CDS tools—conducted this study to describe the medical impetus behind one such tool and to quantify its clinical and logistical benefits (Buchman et al., 2016; Military Health System Communications Office, 2016).

Measuring the Benefits of a Clinical Decision Support Tool Over the last decade, medical studies have demonstrated how mas- sive infusions of blood products can improve the clinical outcomes of patients with traumatic injuries (Allcock et al., 2011; Maciel et al., 2015; McDaniel, Etchill, Raval, & Neal, 2014; O’Keeffe, Refaai, Tchorz, Forestner, & Sarode, 2008). Those infusions are called massive transfusion protocols (MTP). They typically involve the infusion of large quantities of red blood cells (RBC), fresh frozen plasma (FFP), platelets, and cryoprecipitate in a fixed ratio.

Recognizing the clinical value of MTPs, military physicians actively used them to treat wounded warfighters during the latter stages of Operations Enduring Freedom and Iraqi Freedom (Beekley, Bohman, & Schindler, 2012, p. 25). Fortunately, sufficient blood products were on hand at the Level III military hospitals supporting those operations to handle the increased demand. Yet, that was largely because those operations produced relatively low casualty rates, particularly after 2007. Higher casualty rates would have quickly drained the blood product inventories of those hospitals, given their limited blood storage capacities and the perishability of the blood products themselves.

Further complicating matters, the medical assessment as to whether a patient needs an MTP is complex to make, even for well-trained physicians. It is an assessment that forces them to weigh multiple disparate factors related to a patient’s acuity and mechanism of injury. Many physicians get it wrong (Wijaya, Cheng, & Chong, 2016). They needlessly activate MTPs and waste blood products in the process.

To reduce that wastage, SC2i developed a CDS tool that enables physicians to quickly and accurately identify which patients require an MTP based on their individual anatomic and biological data. Atlanta’s largest trauma

Defense ARJ, July 2017, Vol. 24 No. 3 : 468–487 471 center, which annually treats nearly a thousand patients with penetrating wounds, is now using SC2i’s MTP CDS tool in a prospective observation trial. A study at the trauma center concluded that such a tool could lower the MTP activations by as much as 17.9 percent while maintaining posi- tive clinical outcomes (Dente et al., 2010; Shaz, Dente, Harris, MacLeod, & Hillyer, 2009).

Such blood product savings clearly make blood banks more efficient (Haldiman, Zia, & Singh, 2014; O’Keeffe et al., 2008). They also provide the U.S. military with other benefits, particularly when it is involved in high-in- tensity combat. Keeping deployed Level III military hospitals, like the U.S. Army’s combat support hospitals (CSH), fully supplied with blood products requires a multitude of resources. Enabling physicians to more accurately decide whether an MTP is necessary means that fewer blood products would be wasted, which in turn means that military logistics would need to resupply CSHs less often, freeing up logistical resources that can support other medical or combat requirements.

Modeling Methods To fully understand the scale of the benefits from SC2i’s MTP CDS tool, it must be put into the context of a military campaign. The campaign scenario that SC2i chose was based on one created for an unclassified U.S. Department of Defense (DoD) contingency planning study in the 1990s (Tyler, 1992a; 1992b). SC2i’s scenario focused on Russia, the top security concern of the last two chairmen of the U.S. Joint Chiefs of Staff (Shinkman, 2015; Stewart & Alexander, 2015). It envisioned a broad-front NATO military campaign to liberate eastern Poland and the Baltics after a Russian invasion. That scenario turned out to be particularly relevant, as variations of it have recently been used as the foundation for NATO’s Sabre Strike exercise and DoD-sponsored wargaming (Sharkov, 2016; Shlapak & Johnson, 2016).

SC2i developed that scenario into a combat simulation to estimate the vol- ume and rate of casualties that a campaign would produce. Those results were then fed into a casualty relevance model to determine the number of daily casualties that would likely require an MTP. Finally, those casualty figures enabled SC2i to assess the difference in daily blood product usage with and without its MTP CDS tool, and ultimately the tool’s logistical and operational benefits.

Combat simulation. Like the original DoD study, SC2i employed Lanchester’s square law force-exchange framework to power its combat

472 simulation (Davis, 1990; Johnson, 1990; Kaufmann, 1992, pp. 57–59). That

framework is embodied in the equations below:

( 1/2 ( 1/2

r 1/2 r (rb) t -(rb)1/2t

Bt = B– R e + B– R e .5

[( b1/2 ( b1/2 ] ( 1/2 ( 1/2 b 1/2 b (rb) t -(rb)1/2t Rt = R– B e + R– B e .5 [( r1/2 ( r1/2 ] B = NATO combat power R = Russian combat power b = NATO combat effectiveness r = Russian combat effectiveness t = Combat days from the start of the campaign

SC2i did, however, update three of the DoD study’s assumptions to better reflect the modern combat conditions of its scenario (Table 1). Using those updates, SC2i’s combat simulation calculated the number of likely casualties to be generated each combat day (i.e., a day in which all forces are engaged) over the course of its campaign scenario. In total, the simulation estimated that NATO would suffer 33,806 casualties, of whom 6,938 would be killed in action (KIA) and 26,868 wounded in action (WIA), before it achieved victory over its Russian foe (Table 2).

TABLE 1. COMBAT SIMULATION ASSUMPTIONS Surgical Critical U.S. Department Assumption Care Initiative, Rationale of Defense, 1992 2017

NATO and Standing militaries in Russian division NATO and Russia are equivalents 24 15 smaller today than modeled in the they were during the combat simulation Cold War

Technological gap between NATO and Russian forces Combat power has narrowed, ratio between 1.4 1.1 because the pace NATO and Russian of Russian military forces modernization has oustripped NATO’s over the last decade

Combat formations Percentage of are rendered enemy forces 100% 50% ineffective well destroyed to before they are achieve victory completely destroyed

473 0 2,211 2,174 2,521 2,612 2,198 2,186 2,419 2,314 2,252 2,282 2,225 2,587 2,267 2,479 2,347 2,238 2,542 2,565 2,383 2,458 2,298 2,438 2,330 2,364 2,500 2,400 61,588 Russian Russian Wounded in Action 0 571 651 561 615 610 574 578 597 674 657 593 582 625 585 635 662 568 589 645 668 620 564 602 630 606 640 15,904 Russian Killed in Action 0 2,911 3,119 3,172 2,871 3,145 2,891 2,816 3,199 2,782 2,735 2,974 2,932 2,766 3,227 2,852 2,799 3,255 2,750 2,953 3,286 2,834 2,998 3,093 3,020 3,068 3,044 77,492 Russian Russian Casualties

0 931 851 772 745 797 877 824 720 958 986 694 904 1,126 1,183 1,156 1,214 1,041 1,014 1,272 1,242 1,333 1,362 1,302 1,070 1,394 1,098 26,868 NATO WoundedNATO in Action 0 213 321 179 241 313 291 193 186 199 227 276 352 262 329 283 336 248 269 234 298 254 220 206 360 344 306 6,938 NATO KilledNATO in Action 0 971 873 938 906 1,172 1,138 1,417 1,714 1,381 1,071 1,601 1,310 1,104 1,527 1,276 1,677 1,754 1,037 1,638 1,563 1,454 1,346 1,489 1,240 1,206 1,003 33,806 TABLE 2. COMBAT SIMULATION RESULTS NATO Casualties ) t 8.35 7.80 8.08 10.06 9.48 9.20 8.92 8.63 7.52 13.39 14.02 13.70 11.24 9.77 10.35 7.25 13.07 15.00 14.67 14.35 10.94 10.64 12.76 11.84 11.54 12.45 12.14 Russian Combat Power (R ) t 13.95 13.79 13.87 14.50 14.31 14.21 14.12 14.04 13.71 15.79 16.06 15.93 14.93 14.40 14.61 13.64 15.66 16.50 16.35 16.21 14.82 14.71 15.53 15.16 15.04 15.40 15.28 NATO CombatNATO Power (B 22 24 23 16 18 19 20 21 25 5 3 4 Total 12 17 15 26 6 0 1 2 13 14 7 10 11 8 9 Combat Day (t)

474 Casualty relevance model. SC2i’s model then iteratively narrowed down which among those estimated casualties would likely be candidates for a massive transfusion. First, the model distinguished between all those casualties who were either KIA or returned to duty (RTD) before they were treated at a hospital, from those WIA who were treated at a hospital, since only hospitalized patients could receive a massive transfusion. To ascertain how many casualties were likely to be KIA, RTD, and treated WIA, SC2i sought data from a historical campaign wherein its high-intensity combat mirrored what NATO could expect to encounter during its campaign scenario. Hence, SC2i selected the experience of the U.S. First and Third Armies in eastern France and Germany during World War II. Both armies fought over terrain and were exposed to the type and volume of munitions that NATO forces would likely face in eastern Poland and the Baltics.

From 1944 to 1945, the U.S. First and Third Armies suffered 752,396 casualties. Of those, 152,359 casualties were KIA (or 20.2 per- cent of the total) and approximately 150,000 (or 19.9 percent of the total) were RTD. The remaining 450,037 (or 59.8 percent of the total) were WIA who were treated at field hospitals (Holcomb, Stansbury, Champion, Wade, & Bellamy, 2006; Reister, 1975, p. 4). Using those historical per- centages, SC2i’s model determined the number of treated WIA each combat day of its scenario.

Second, the model sought to ascertain which among those treated WIA would have received wounds severe enough to prompt physicians to consider using a massive transfusion. To do so, SC2i examined the impact that casu- alty-causing agents could have in producing such wounds. Again, SC2i turned to the experience of the U.S. First and Third Armies. Military records from the two armies revealed that of their combined 217,070 wounded, 24.6 percent suffered wounds from small arms, 60.2 percent from shell fragments, 3.2 percent from blasts, 4.9 percent from bombs, 1.2 percent from burns, and 5.9 percent from all other casual- ty-causing agents (Beyer, Arima, & Johnson, 1962; Beyer et al., 1962, p. 77). Judging that small arms, shell fragments, blasts, and bombs would be the most relevant agents, SC2i used their historical incidence to better estimate the number of likely MTP candidates.

475 To improve on that estimate, SC2i studied the likely severity of wounds based on their physical locations. Once again, SC2i sought data from a historical campaign. This time, however, it had to choose one in which American troops were not only engaged in sustained high-intensity combat, but also equipped with body armor commensurate with that of their poten- tial modern contemporaries. SC2i chose the experience of the U.S. Eighth Army during the last year of the Korean Conflict. Of the U.S. Eighth Army’s casualties who wore body armor, 14.2 percent suffered wounds to the head, 2.7 percent to the neck, 4.7 percent to the chest, 4.0 percent to the upper back, 9.2 percent to the lower back, 1.6 percent to the abdomen, 34.6 percent to an upper extremity, 28.4 percent to a lower extremity, and 6.0 percent to the genitalia (Herget, Coe, & Beyer, 1962, p. 733). SC2i then asked physicians from Walter Reed National Medical Center and the Uniformed Services University of the Health Sciences to assess the likelihood that wounds in those areas of the body would be severe enough to warrant the consideration of a massive transfusion. The combination of historical data and physicians’ assessments enabled SC2i to further refine its model. Ultimately, it found that 11,556 wounded—about one-third of the total casualty population— would be likely MTP candidates.

Logistical benefits model. With an estimate of likely MTP candidates each combat day, SC2i’s model could determine the daily demand for blood products. It could also determine the difference in that demand with and without the use of SC2i’s MTP CDS tool, and thus calculate the blood prod- uct savings that the tool could generate. However, before the model could do so, it required a few programmed assumptions. It assumed that NATO, following doctrine, would assign one CSH to each of its five three-division corps (Lewis et al., 2010, p. xi). It also assumed that NATO would adopt SC2i’s MTP blood product ratio of 16.5 units of RBC, 9.8 units of FFP, 0.9 apheresis of platelets, and 7.2 units of cryoprecipitate as its standard massive transfusion protocol, rather than the U.S. Armed Services Blood Program (ASBP)’s suggested blood product ratio (Departments of the Army, Navy, & Air Force, 2011, p. 43). SC2i did so to ensure that its model could make a clear-cut comparison of blood product demand with and without the use of its tool.

Moreover, SC2i’s model had to contend with the fact that blood products are consumed by not only wounded who require MTPs, but also those who do not. To account for the latter’s use, SC2i assumed that they would consume the ASBP’s suggested ratio of 3.0 units of RBC, 1.6 units of FFP, and 0.15 apheresis of platelets per patient.

476 SC2i’s model also had to consider the inventory of blood products that each CSH would initially carry with it in-theater. The model assumed that each CSH would deploy with an inventory of 300 units of RBC, 100 units of FFP, 24 apheresis of platelets, and 100 units of cryoprecipitate, which is consistent with the recommendations found in Emergency War Surgery (Department of the Army, 2013). Of course, that does not always happen. When the 31st CSH deployed to Iraq in 2010, it stocked only 180 units of RBC, 160 units of FFP, and 90 units of cryoprecipitate (Luschinski, 2011).

As the technology needed to capture and analyze big data becomes more widely available and at ever-lower cost, even military elements not directly engaged in logistics can create new decision support tools—tools that not only enable them to more efficiently carry out their missions, but also reduce their logistical footprints.

Even so, given the high casualty rate of its combat simulation, SC2i’s model expected that all of the deployed CSHs would quickly exhaust their blood product inventories. As in the past, CSHs would call upon in-theater blood donors to replenish some of their blood supplies. SC2i estimated that such “walking blood banks” would provide 1,200 units of RBC every combat day across all five CSHs. NATO would have to source the remaining blood product shortfall from outside the theater, most likely from the ASBP’s blood reserve in the continental United States (Armed Services Blood Program, n.d.).

The task of resupplying the CSHs with fresh blood products would then fall on U.S. military logistics. Transporting such perishable supplies from the United States to CSHs near the frontline requires a considerable effort. It begins with the packaging of blood products into ASBP-standard ship- ping containers. The containers are then assembled into groups of 120 and placed on pallets—those carrying RBC weighing 5,400 pounds and those carrying other blood products weighing 4,680 pounds (Departments of the Army, Navy, & Air Force, 2007, p. 41; 2011, pp. 45–46). U.S. Air Mobility Command (AMC) would then transport those pallets into theater airheads where blood distribution detachments would divide the pallets into smaller blood product shipments and route them to corps-level airfields. From there, UH-60 MEDEVAC helicopters would fly the shipments on blood product

477 resupply missions to individual CSHs, as they have done in every U.S. mil- itary campaign since the Persian Gulf Conflict (Cholek & Anderson, 2007; Department of Defense, 1992, p. 463; Department of the Army, 2005, p. H-6).

Since UH-60 pilots are generally required to hold a 20- to 30-minute fuel reserve, a UH-60 operating at its maximum range would be expected to consume about 85 percent of its 360-gallon internal fuel load (IHS Jane’s, 2008; M. Crivello, personal communication, December 26, 2014). Since not all blood product resupply missions require UH-60s to operate at their maximum range, SC2i’s model estimated that the average blood product resupply mission would consume only 65 percent of their internal fuel load.

According to the ASBP, each UH-60 MEDEVAC helicopter can carry up to 50 standard shipping containers (Departments of the Army, Navy, & Air Force, 2011, p. 44). While that might be true for empty helicopters, MEDEVAC helicopters operating in combat would be loaded with armor, racks, and medical equipment. Given the large number of wounded that would require battlefield evacuation during a high-intensity conflict, UH-60 crews are unlikely to reconfigure their helicopters to fly an occasional blood product resupply mission. Moreover, a veteran pilot revealed that the most a combat-configured UH-60 MEDEVAC helicopter could carry is 30 standard shipping containers (Ginn, Ferencz, & Marble, 2008; M. Crivello, personal communication, May 7, 2015). Hence, that is the number SC2i’s model used.

By linking the blood products’ savings that SC2i’s MTP CDS tool could produce with the resources that would have been needed to transport the products to CSHs in the field, SC2i’s model quantified the logistical benefits that would accrue from the tool over the course of its campaign scenario.

Modeling Results Impact on blood product usage. Given that SC2i’s MTP CDS tool was principally designed to support medical decision making, it is no surprise that the tool improves the treatment of critically injured patients. It allows physicians to quickly and accurately determine whether such patients require an MTP, enabling them to reap the full clinical benefit from an early MTP activation. In so doing, the tool also reduces the number of unneeded MTP activations and the associated waste in blood products. SC2i’s model estimates that if NATO’s five CSHs use the tool, they would waste 71,459 fewer units of blood products over the course of SC2i’s campaign scenario. Those blood products would include 34,144 units of RBC, 20,399 units of FFP, 1,907 units of platelets, and 15,009 units of cryoprecipitate (Table 3).

478 6,135 5,166 4,617 5,541 4,107 4,914 3,765 3,726 6,735 5,373 3,597 3,978 6,522 3,423 4,785 5,295 5,037 5,967 5,838 4,359 4,236 6,438 6,354 4,488 3,300 Airlift Airlift Weight Capacity (pounds) 129,366 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 5,670 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 1,170 25,740 Jet-A Fuel (gallons) 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 0 0 0 0 0 0 0 0 110 UH-60 Blood Product Resupply Missions 81 78 97 85 88 89 94 113 131 141 119 116 127 122 152 125 138 159 145 134 154 103 150 109 106 100 Blood Product Containers 711 551 613 431 416 661 727 745 475 779 582 567 678 536 629 598 388 520 645 505 694 445 402 490 460 15,009 3,056 Cryo- precipitate (units) 51 72 74 57 97 761 78 76 82 55 53 62 92 58 95 88 70 68 86 99 66 49 84 64 80 60 90 1,907 Platelet Platelet (units) Clinical and Operational Savings from SC2i MTP CDS Tool 812 921 527 877 728 748 707 547 855 833 626 770 586 566 898 943 988 666 790 966 605 646 1,012 1,058 1,034 20,399 Fresh Frozen Plasma (units) 915 981 882 947 1,115 1,731 1,617 1,218 1,149 687 1,431 1,013 1,184 1,772 1,252 1,579 1,359 1,395 1,542 1,289 1,047 1,654 1,082 1,694 1,469 1,504 34,144 Red Blood Blood Red Cell (units) 11,241 15,511 9,401 9,767 11,619 17,163 13,149 10,122 14,313 8,680 16,331 11,996 17,595 12,373 16,752 15,932 12,750 13,925 13,526 1,323 10,876 15,830 10,498 18,005 With SC2i MTP CDS Tool (units) 15,371 18,196 15,123 11,820 21,140 11,384 13,140 17,233 19,160 17,708 14,713 10,961 9,046 13,574 12,242 21,628 14,473 15,847 19,635 16,770 20,136 19,538 14,922 18,658 16,295 10,525 14,025 12,690 20,626 Blood Product Usage Without Without SC2i MTP CDS Tool (units) TABLE 3. CASUALTY RELEVANCE AND LOGISTICAL BENEFITS MODELS RESULTS 321 412 310 401 573 377 522 472 332 547 497 424 298 534 343 586 389 354 366 436 560 509 448 484 460 600 11,556 MTP- Relevant Wounded in Action 931 851 772 745 797 877 824 720 958 986 694 904 1,126 1,183 1,156 1,214 1,041 1,014 1,272 1,242 1,333 1,362 1,302 1,070 1,394 1,098 26,868 NATO Wounded in Action 2 3 4 Combat Day 1 12 5 6 7 11 8 13 9 14 15 10 16 17 26 25 18 22 19 20 24 21 23 Total

479 Impact on military logistics. Beyond the blood product savings it could generate, SC2i’s MTP CDS tool also yields a number of operational benefits. SC2i’s model estimates that the tool’s use would enable UH-60 MEDEVAC helicopters to fly 110 fewer blood product resupply missions during SC2i’s campaign scenario. Instead, those UH-60s could fly medical evacuation missions. Given a helicopter’s maximum airlift capacity, that means existing corps-level UH-60 resources could evacuate up to 770 more wounded off the battlefield.

Should no additional wounded require evacuation, SC2i’s model estimates that the fewer num- ber of blood product resupply missions would save a total of 25,740 gallons of Jet-A fuel. Since NATO AH-64 attack helicopters also consume Jet-A fuel and have a similar fuel capac- ity as the UH-60, they could use the saved fuel to fly an additional 110 close- air support missions.

The broader logistical bene- fits are equally meaningful. The lower need for blood products would also reduce the need for AMC fixed-wing aircraft to ferry blood prod- ucts in-theater. That would free up space and weight aboard AMC flights that NATO could use to carry other military sup- plies. SC2i’s model estimates that the lower blood product demand would mean that the AMC would have to transport 3,056 fewer blood product shipping containers. Those con- tainers have a combined weight of 129,366 pounds. That is equivalent to the combined

480 weight of 308 AGM-114 missiles, 1,463 Hydra 70 rockets, and 92,400 rounds of 30mm ammunition—enough to arm 77 of the additional AH-64 close-air support missions that SC2i’s MTP CDS tool could enable U.S. Army aviation units to fly (Figure).

FIGURE. SUMMARY OF CLINICAL & LOGISTICAL BENEFITS FROM SC2I’S MTP CDS TOOLS Blood Product Savings Jet-A Fuel Savings MEDEVAC Wounded Close-Air Support Missions 25,740 gallons 770 110

129,366 pounds 308 AGM-114 1,463 Hydra 70 34,144 Red Blood Cell 92,400 30mm 20,399 Fresh Frozen Plasma 1,907 Platelets Airlift Weight Close-Air Support 15,009 Cryoprecipitate Capacity Savings Ordnance

Conclusions/Recommendations This study illustrated the scale of the benefits that could accrue from using SC2i’s MTP CDS tool during a military campaign. Although military medical researchers primarily designed the tool to help physicians make faster and more accurate decisions as to whether to activate an MTP and thus improve patient survival and recovery, it served a dual purpose. The tool’s developers also sought to lower the unneeded expenditure of scarce blood products, streamlining the logistical requirements of CSHs in the field.

Past efforts to streamline logistics have largely been led by logisticians and those who manage large and complex military systems. This study demon- strates the degree to which all military organizations can make meaningful contributions to streamlining logistics by using big data and predictive analytics to improve their own operations. More elements of the military can and should become involved. As the technology needed to capture and analyze big data becomes more widely available and at ever-lower cost,

481 military elements not directly engaged in logistics can create new decision support tools—tools that not only enable them to more efficiently carry out their missions, but also reduce their logistical footprints.

SC2i’s model quantified what one decision support tool could do to ease the burden on military logistics by reducing waste in a single supply category— blood products. The U.S. military can develop and deploy many more tools in the future. They and the predictive analytics that underlie them should be encouraged to flourish.

Acknowledgments We would like to acknowledge the important contributions of Michael J. Crivello, Dr. Frederick Lough, and Dr. Benjamin Kyle Potter in the Department of Surgery at the Uniformed Services University of the Health Sciences and Walter Reed National Military Medical Center.

482 References Allcock, E. C., Woolley, T., Doughty, H., Midwinter, M., Mahoney, P. F., & Mackenzie, I. (2011). The clinical outcome of UK military personnel who received a massive transfusion in Afghanistan during 2009. Journal of the Royal Army Medical Corps, 157(4), 365–369. Armed Services Blood Program. (n.d.). Armed Forces Blood Program Facts [Fact sheet]. Retrieved from http://www.med.navy.mil/sites/nmcp/Dept/Shared Documents/BloodBank/BloodProgramFactSheet.pdf Beekley, A. C., Bohman, H., & Schindler, D. (2012). Combat casualty care: Lessons learned from OEF and OIF. In E. Savitsky & B. Eastridge (Eds.), Modern warfare. Washington, DC: Department of the Army, Office of the Surgeon General. Beyer, J. C., Arima, J. K., & Johnson, D. W. (1962). Enemy ordnance materiel. In J. C. Beyer (Ed.), Wound ballistics. Washington, DC: Department of the Army, Office of the Surgeon General. Buchman, T. G., Billiar, T. R., Elster, E., Kirk, A. D., Rimawi, R. H., Vodovotz, Y., & Zehnbauer, B.A. (2016). Precision medicine for critical illness and injury. Critical Care Medicine, 44(9), 1635–1638. Cholek, C. B., & Anderson, M. A., Sr. (2007). Distribution-based logistics in Operation Iraqi Freedom. Army Logistician, 39(2), 60. Davis, P. K. (1990). Aggregation, disaggregation, and the 3:1 rule in ground combat. Santa Monica, CA: RAND. Dente, C. J., Shaz, B. H., Nicholas, J. M., Harris, R. S., Wyrzykowski, A. D., Ficke, B. W., Vercruysse, G. A., … & Ingram, W. L. (2010). Early predictors of massive transfusion in patients sustaining torso gunshot wounds in a civilian level I trauma center. Journal of Trauma, 68(2), 300–303. Department of the Army. (2005). Theater hospitalization (FM 4-02.10). Washington, DC: Author. Department of the Army. (2013). Battlefield transfusions. In Emergency war surgery (4th ed.). Washington, DC: Office of the Surgeon General & Borden Institute. Departments of the Army, Navy, & Air Force. (2007). Operational procedures for the Armed Services Blood Program elements (TM 8-227-11, NAVMED P-5123, AFI 44- 118). Washington, DC: Author. Departments of the Army, Navy, & Air Force. (2011). Armed Services Blood Program: Joint Blood Program handbook (TM 8-227-12, NAVMED P-6530, AFH 44-152_IP). Washington, DC: Author. Department of Defense. (1992). Medical support (Appendix G). In Conduct of the Persian Gulf War: Final report to Congress. Washington, DC: Author. Ginn, R.V.N., Ferencz, A., & Marble, S. (Eds.). (2008). In their own words: The 498th in Iraq, 2003. Falls Church, VA: Office of Medical History, Office of the Surgeon General, Headquarters, U.S. Army Medical Command. Haldiman, L., Zia, H., & Singh, G. (2014). Improving appropriateness of blood utilization through prospective review of requests for blood products: The role of pathology residents as consultants. Laboratory Medicine, 45(3), 264–271. Herget, C. M., Coe, G. B., & Beyer, J. C. (1962). Wound ballistics and body armor in Korea. In J. C. Beyer (Ed.), Wound ballistics. Washington, DC: Department of the Army, Office of the Surgeon General.

483 Holcomb, J. B., Stansbury, L. G., Champion, H. R., Wade, C., & Bellamy, R. F. (2006). Understanding combat casualty care statistics. Journal of Trauma, 60(2), 397–401. IHS Jane’s. (2008). Sikorsky S-70 (H-60) - US Army UH-60Q DUSTOFF MEDEVAC Conversion. Jane’s Aircraft Upgrades. Retrieved from https://my.ihs.com/ Janes?th=JANES&callingurl=https://janes.ihs.com Johnson, R. L. (1990). Lanchester’s square law in theory and practice. Fort Leavenworth, KS: U.S. Army Command and General Staff College, School of Advanced Military Studies. Kaufmann, W. W. (1992). Assessing the base force. Washington, DC: Brookings Institution. Lewis, M. W., Bower, A., Cuyler, M. T., Eden, R., Harper, R.E., Morganti, K. G., … Valdez, R. S. (2010). New equipping strategies for combat support hospitals. Santa Monica, CA: RAND. Luschinski, A. (2011). Laboratory services in austere environment: Camp Dwyer, Afghanistan. Society Scope, 14(2), 9. Maciel, J. D., Gifford, E., Plurad, D., de Virgilio, C., Bricker, S., Bongard, F., … & Kim, D. (2015). The impact of a massive transfusion protocol on outcomes among patients with abdominal aortic injuries. Annals of Vascular Surgery, 29(4), 764–769. McDaniel, L. M., Etchill, E. W., Raval, J. S., & Neal, M. D. (2014). State of the art: Massive transfusion. Transfusion Medicine, 24(3), 138–144. Military Health System Communications Office. (2016, March 7). Precision medicine research paving the way for smarter, more effective treatment. Health.mil. Retrieved from http://www.health.mil/News/Articles/2016/03/07/Precision- medicine-research-paving-the-way-for-smarter-more-effective-treatment O’Keeffe, T., Refaai, M., Tchorz, K., Forestner, J. E., & Sarode, R. (2008). A massive transfusion protocol to decrease blood component use and costs. Archives of Surgery, 143(7), 686–691. Reister, F. A. (1975). Medical statistics in World War II. Washington, DC: Department of the Army, Office of the Surgeon General. Sharkov, D. (2016, May 29). US kicks off 10,000-strong drill in Eastern Europe. Newsweek. Retrieved from http://www.newsweek.com/us-kicks-strong-drill- eastern-europe-464678 Shaz, B. H., Dente, C. J., Harris, R. S., MacLeod, J. B., & Hillyer, C. D. (2009). Transfusion management of trauma patients. Anesthesia and Analgesia, 108(6), 1760–1768. Shinkman, P. D. (2015, March 4). Top military minds list top threats facing the U.S. U.S. News and World Report. Retrieved from http://www.usnews.com/news/ articles/2015/03/04/ash-carter-martin-dempsey-list-top-3-threats-facing-the-us Shlapak, D. A., & Johnson, M. W. (2016). Reinforcing deterrence on NATO’s eastern flank: Wargaming the defense of the Baltics. Santa Monica, CA: RAND Corporation. Stewart, P., & Alexander, D. (2015, July 9). Russia is top U.S. national security threat: Gen. Dunford. Reuters. Retrieved from http://www.reuters.com/article/us-usa- defense-generaldunsmore-idUSKCN0PJ28S20150709 Tyler, P. E. (1992a, February 17). Pentagon imagines new enemies to fight in post- Cold-War era. New York Times, pp. A1, A8.

484 Tyler, P. E. (1992b, February 17). 7 Hypothetical conflicts foreseen by the Pentagon. New York Times, p. A8. Wijaya, R., Cheng, H. M., & Chong, C. K. (2016). The use of massive transfusion protocol for trauma and non-trauma patients in a civilian setting: What can be done better? Singapore Medical Journal, 57(5), 238–241.

485 Author Biographies

Mr. Felix K. Chang is the chief strategy officer of DecisionQ. He is also an assistant professor in the Department of Surgery at the Uniformed Services University of the Health Sciences and Walter Reed National Military Medical Center. He is also a senior fellow at the Foreign Policy Research Institute. He previously served as a senior planner and an intelligence officer in the U.S. Department of Defense. Mr. Chang earned his MBA from Duke University and an MA and BA from the University of Pennsylvania.

(E-mail address: [email protected])

Dr. Christopher J. Dente is a professor in the Department of Surgery at Emory University’s School of Medicine and deputy chief of Surgery for Emory at Grady Memorial Hospital. He earned his MD from the Medical College of Pennsylvania and completed his general surgical training at Wayne State University/Detroit Medical Center. Dr. Dente also completed a trauma/surgical crit- ical care fellowship at Grady Memorial Hospital.

(E-mail address: [email protected])

486 CAPT Eric A. Elster, USN, is currently chair- man/professor of the Department of Surgery, Uniformed Services University of the Health Sciences and Walter Reed National Military Medical Center. He was last deployed as director of Surgical Services at NATO’s military hospital in Kandahar, Afghanistan. He has published over 140 scientific manuscripts and received numerous research grants. CAPT Elster earned his MD from the University of South Florida, completed a surgery residency at National Naval Medical Center, and received an organ transplantation fellowship at National Institutes of Health.

(E-mail address: [email protected])

487 BEYOND INTEGRATION Readiness Level (IRL): A Multidimensional Framework to Facilitate the INTEGRATION OF SYSTEM OF SYSTEMS

Maj Clarence Eder, USAF (Ret.), Thomas A. Mazzuchi, and Shahram Sarkani Integration Readiness Level (IRL) can be an effective systems engineering tool to facilitate integration of systems. Further research in systems integration, analysis of integration data, and the development of systems architecture framework can help enhance IRL principles for systems integration use in Department of Defense (DoD). IRL was developed to support Technology Readiness Level, but DoD never implemented IRL. Expanding the use of IRL can address growing integration challenges of DoD acquisition programs. DoD Space Systems are prime examples of System of Systems that can help identify attributes for an integration framework that can enhance IRL assessment.

DOI: https://doi.org/10.22594/dau.16-766.24.03 Keywords: Technology Readiness Level (TRL), Systems Integration, DoD Space Systems

 Image designed by Michael Krukowski Beyond Integration Readiness Level http://www.dau.mil

Integration Readiness Level (IRL) was introduced to help understand the maturity of integrating one system into another (Sauser, Gove, Forbes, & Ramirez-Marquez, 2010). The need to expand the use of IRL is becoming increasingly more relevant in U.S. Department of Defense (DoD) acquisition as program managers aim to develop and acquire weapon systems with ever increasing multiple capabilities and interfaces. Likewise, understanding integration feasibility early in the program is beneficial in man- aging and planning for the success of overall System of Systems (SoS) integration.

Throughout the years, the DoD acquisition com- munity has implemented several systems engineering processes and tools to help meet budgetary requirements while still trying to produce the best weapon systems available. Also, DoD Space Systems acqui- sition has increased in actual systems costs from the initial estimates while the capability of the systems decreased from original intent. According to a Government Accountability Office (GAO, 2011) report, “the total estimated costs for major space programs increased by about $13.9 billion from initial estimates for fiscal years 2010 through 2015, which is an increase of about 286 percent” (p. 4). This needs to be managed better within DoD, and additional tools are needed to understand future impacts to system delivery.

DoD also imple- mented initiatives in place to expedite the deployment of capabilities into opera- tions as part of the Urgent Warfighter Needs efforts.

490 Defense ARJ, July 2017, Vol. 24 No. 3 : 488–533 July 2017

Based on a GAO (2012) report, various practices have been implemented by program offices in order to meet challenges to deliver capabilities within short timelines. Having additional integration tools available to help program/product teams understand the feasibility of weapon systems deployment on aggressive timelines could prevent schedule delays.

Understanding the feasibility of integration early in the process can help temper the expectations needed throughout the development and deployment of new systems.

The DoD acquisition issues of uncontrollable cost growth and the need to expedite the deployment of capabilities into operations trigger the drive to improve systems engineering processes and tools that program man- agers can depend on when making program decisions. In order to make decisions about systems and technology availability, the DoD acquisition community adopted the use of Technology Readiness Level (TRL) in 2002 (DoDI 5000.02, 2017). TRL provides measurement for explaining the matu- rity of a system based on the technology used for that system.

To further the use of TRL, IRL was introduced as an integration tool to complement TRL as illustrated in Table 1 (Sauser et al., 2010). IRL was developed to align with the TRL definitions, but it was never officially imple- mented by DoD to help with integration assessment. Other readiness levels such as System Readiness Level (SRL) and Test Readiness Level were also introduced, but not officially recognized by the DoD acquisition community. Although not implemented, the use of IRL could become a necessary tool to help reduce integration risks of complex systems. Integrating SoS is becoming more complex, and the current definitions of IRL do not allow it to be independent of the TRL process, which could be one reason why IRL is heavily scrutinized in current systems engineering literature. IRL was developed to support the limitations of TRL in evaluating complex inter- faces. According to London, Holzer, Eveleigh, & Sarkani (2014), while TRL is used for “considering discrete technology elements, IRL is only limited to connecting these technology links between components” (p. 380).

Defense ARJ, July 2017, Vol. 24 No. 3 : 488–533 491 TABLE 1. INTEGRATION READINESS LEVEL (IRL) AND TECHNOLOGY READINESS LEVEL (TRL) LEVELS Level TRL IRL 1 Basic principles observed An interface between technologies has been and reported identified with sufficient detail to allow characterization of the relationship 2 Technology concept and/ There is some level of specificity to characterize or application formulated the interaction between technologies through their interface 3 Analytical and There is compatibility between technologies to experimental critical orderly and efficiently integrate and interact function and/or characteristic proof of concept 4 Component and/or There is sufficient detail in the quality and breadboard validation in assurance of the integration between technologies laboratory environment 5 Component and/or There is sufficient control between technologies breadboard validation in necessary to establish, manage, and terminate the relevant environment integration 6 System/subsystem model The integrating technologies can accept, translate, demonstration in relevant and structure information for its intended environment application 7 System prototype The integration of technologies has been verified demonstration in relevant and validated with sufficient detail to be actionable environment 8 Actual system completed Actual integration completed and mission-qualified and qualified through test through test and demonstration in the system and demonstration environment 9 Integration is mission- Execute a support program that meets operational proven through successful support performance requirements and sustains mission operations the system in the most cost-effective manner over its total life cycle

Note. As defined by Sauser et al. (2010)

Based on Sauser et al. (2010), IRL was introduced as, “a metric for system- atic measurement of the interfacing of compatible interactions for various technologies and the consistent comparison of the maturity between inte- gration points” (p. 18). The concept itself could be more useful, but it is currently limited to support TRL concepts (Table 1). The main concept of IRL is to be used to evaluate integration of two technologies that were evaluated by the TRL process.

The principles of IRL can be used and enhanced to facilitate the integration of SoS. As indicated by Ge, Hipel, Kewei, & Chen (2013), “SoS is defined as large-scale integrated systems that are heterogeneous and independently operable on their own, but are networked together for a common goal” (p. 363).

492 Many commonly used SoS terms such as interoperable and synergistic are important factors that need to be considered when integrating SoS (Madni & Sievers, 2014). The challenges of SoS integration are constant in DoD acquisition as technology continues to improve and new capabilities emerge as better options for the warfighter. Understanding the feasibility of integration early in the process can help temper the expectations needed throughout the development and deployment of new systems.

Problem Focus The focus of the research is to understand the integration issues and challenges of major DoD Space Systems so that they can be used to further analyze facilitation of SoS integration. As challenges continue when inte- grating new capability into a family of systems, the need for understanding the overall effort is necessary for upfront planning. The components that make up SoS usually have different purposes, and the different organiza- tions associated with those purposes typically use different terminologies and concepts. Thus, SoS integration must be able to address the differences in semantic and syntactic between their organizations. Identifying these issues ahead of time helps highlight areas that need additional focus on IRL assessment. DoD Space Systems have to satisfy integration requirements that may include affordability, reliability, security, resilience, and adapt- ability (Madni & Sievers, 2014).

Understanding systems integration itself is a very challenging task due to several factors that impact the integration processes. Systems integration can be interpreted differently, and although integration is a commonly used word, it is not clearly defined by both industry and academia on its applica- tion (Djavanshir & Khorramshahgol, 2007). As explained by Tien (2008), “integration can occur through functional, physical, organizational, and temporal dimensions, which can also include management and processes of the actual system” (p. 393). Elements involved in integration tend to be multidimensional, which can be more complicated than evaluating tech- nologies. Also, most current systems integration efforts focus on products instead of the process, which can lead to inadequate analyses when trying to understand the impacts of integrating components (Magnaye, Sauser, & Ramirez-Marquez, 2010). Integration of SoS is complicated, which goes beyond merely assembling components.

The focus of the research is to develop a framework to help address SoS integration issues using IRL principles. The subjectivity of TRL and IRL processes has been widely criticized through several journals.

493 Arguably, the literature has oversimplified facets of system readiness and maturity. The risk and effort required for higher readiness levels does not factor into the current assessment of TRL and IRL (Kujawski, 2013). However, the process of conducting an initial assessment of the system prior to the development and integration is beneficial in understanding overall feasibility. Defining feasibility of integration early enough in the process can impact decisions that could avoid integration pitfalls later in the process.

This research provides a tool to facilitate the integration of SoS; however, it does not focus on the development of systems integration requirements. Integration requirements are critical, but this research assumes that the stakeholders have agreed to the requirements before planning the scope that allows integration activities to be executed. Requirements need to be clearly defined to the lowest level and the assumption is to have clear requirements before conducting integration assessment(s). The proposed IRL assess- ment process will be iterative based on the changes in scope (if any) by the stakeholders. Understanding enterprise-wide integration requirements is necessary before developing a specific architectural framework to help with the integration process (Bolloju, 2009). Understanding integration requirements processes can help with traceability of derived requirements during the integration planning and implementation stages.

Theory IRL can be an effective systems integration assessment tool, and given the right multidimensional framework, it can facilitate the integration of SoS. Utilizing other integration variables and expanding the current notional definitions of IRL can significantly impact the assessment of SoS integration. IRL was also proposed as an intermediate step by mak- ing it part of a matrix function with TRL in order to determine the SRL (McConkie, Mazzuchi, Sarkani, & Marchette, 2013). When IRL is used as a function of SRL, IRL could be overlooked from being a significant inde- pendent assessment value, and the IRL level may be influenced by what is needed as the SRL value. Current IRL definitions can be useful as an input to the framework as opposed to an intermediate assessment conducted merely to understand technology or system maturity.

Other acquisition practitioners may determine that integration readiness can be assessed as part of the DoD acquisition community’s Technology Readiness Assessment (TRA) process (the official process to determine TRL score), but this process does not capture the purpose of integration. TRL was never intended to measure integration maturity (Magnaye et al., 2010).

494 A system with mature technology does not automatically equate to having a high IRL when interfacing with another system with mature technology.

DoD Space Systems continue to provide examples of complex SoS. With very limited opportunities to do operational tests and analyses for satellite sys- tems and rocket launches, space systems provide a platform to incorporate the latest technologies and processes to attain successful operational sys- tems. As part of space system complexity, space systems also factor in the need for expediting the insertion of cutting-edge technology while adapting to evolution of new technology and reducing qualification timelines (Pittera & Derrico, 2011). IRL can be used to assess the integration of these types of complex SoS given a rigorous process that accounts for other variables.

Literature Review Use of the IRL process has been documented through abundant lit- erature with different strategies of application. The most common use is an interim tool to support TRL and/or SRL processes. TRL metrics were initially developed from seven levels promulgated by the National Aeronautics and Space Administration in the late 1990s, and then to the nine levels that were adopted by DoD in 2002 (Mankins, 2009). Assessment of TRL became more necessary as technology assessment moved into complex SoS, and DoD acquisition programs needed tools to help make decisions in complex environments (McConkie et al., 2013). The original intent of IRL was to provide a systematic analysis of integrating different

495 technologies and understanding the maturity between the points of integration (Pittera & Derrico, 2011). IRL was focused on being a tool to understand the maturity of technologies used to integrate the systems as opposed to merely a tool to integrate two or more systems.

When the goal is to incorporate the use of IRL into systems integration processes, systems integration strategies and challenges must be identified. Systems integration is a strategy advocated in order to achieve sustainable development—it is very diverse and can be interpreted in many different ways (Vernay & Boons, 2015). System integration strategies have been explained in several journals, which provided different perspectives and helped narrow the focus of this research from supporting technical/system maturity to readiness of integration.

According to Gagliardi, Wood, Klein, and Morley (2009), “severe integration problems can arise due to inconsistencies, ambiguities, and gaps in how quality attributes are addressed in the underlying systems” (p. 12). One of the main reasons for this is not identifying the right quality attribute that supports the integration activities early in the process. This is where a set of quality attributes derived from the DoD Space Systems integration issues can be identified and leveraged to help with systems integration.

Application of systems integration processes can be enhanced when additional systems, processes, and organizations are involved in order to integrate SoS. Generally, SoS can be defined as a collection of interoperating components and systems producing results that are not achievable by each of the individual systems. Adding capabilities to an already complex system epitomizes the definition of SoS (Zhang, Liu, Wang, & Chen, 2012). To fully realize the analysis of SoS, engineering and systems architecture must be used to address allocation of functionality to component interaction as opposed to merely dealing with individual components (Ender et al., 2010).

To understand the properties of SoS, DoD Space Systems will be used as examples of complex SoS. DoD satellites are complex SoS that cannot be accessed for upgrade or changes after they have been placed into orbit. Conducting technical demonstration also requires significant funding and resources for these types of complex SoS (Dubos & Saleh, 2010). Thus, it is imperative that integrating them successfully the first time becomes a high priority.

To understand the comprehensive processes involved in the integration of complex SoS, one must understand how complex SoS are managed. Literature is replete with architectural framework proposals to design, develop,

496 and manage these types of systems (Suh, Chiriac, & Hölttä-Otto, 2015). Many of these larger systems involved multiple organizations as stake- holders, and according to Rabelo, Fishwick, Ezzell, Lacy, and Yousef (2012), “large scale organizations, which have significant operations management needs, seek out and execute complex projects that involve a very large number of resource and schedule requirements” (p. 112). In addition to the wide range of factors involved in complex SoS, recent SoS development requires intensive information technology software, which also triggers the use of an architectural framework (Piaszcyk, 2011).

Architectural frameworks are developed to illustrate operations and functions to help clearly define roles of entities within complex SoS. Understanding and implementing the right architecture and component design will improve performance, resulting in faster integration processes and reduced complexity (Jain, Chandrasekaran, Elias, & Cloutier, 2008). The resulting data from this research will provide the components needed to develop an architectural framework. This framework will help explain how major attributes will contribute to the assessment of enhanced IRL and will help illustrate the different factors needed for consideration during evaluation of IRL at certain, specific timelines.

Major attributes to support the integration assessment framework will be finalized based on the results of the data collection. As explained by Jain et al. (2008), “attribute is a property or characteristics of an entity that can be distinguished quantitatively or qualitatively by human or automated means; the attributes of a system or architecture are used to describe the properties of the system or system architecture in a unique distinguishing manner.” Attributes are usually basic elements related to requirements, which can be interpreted that a requirement consists of multiple attributes (Lung, Xu, & Zaman, 2007).

497 For attributes to be an effective tool for integration, they need to be inte- grated into the framework and optimized to balance complexity and simplicity. Organizations can adopt these attributes when integrating capa- bilities into the family of systems. Complex systems can have unpredictable behaviors, and using multiple attributes can help measure the performance of these behaviors by quantitatively and qualitatively weighing the individ- ual attributes (Gifun & Karydas, 2010).

The initial list of attributes that will be considered for this research is derived from discussions with space systems integration experts, journal articles on integration variables (Jain, Chandrasekaran, and Erol, 2010), and key integration process areas (Djavanshir & Khorramshahgol, 2007). Table 2 shows the list of attributes considered for purposes of this research.

TABLE 2. LIST OF ATTRIBUTES Commercial-off-the-Shelf System Availability Architecture and Design (COTS) Complexity Configuration Management Concept of Operations Hardware/Software Verification Documentation Functionality and Validation Internal/External Organizations Interface Control Information Assurance and Coordination Processes and Planning and Scheduling Programmatic and Risks Procedures Quality Assurance Requirements Management Resources (Funding) Risk Analysis and Semantic Consistency Standards Management Schedule Control Strategic Planning Technology Insertion Testing Training

These potential attributes will be used to help define each integration issue data set that is aligned to a specific integration variable. Some of these attributes may overlap or become part of another attribute. Collecting DoD Space Systems data will help identify and provide the criteria for the major attributes associated with each data set.

Goals The goal of this research was to collect data on six major DoD Space Systems integration issues over the past 16 years, align the data into items that impact integration, group the items into major attributes, and use the

498 attributes to develop a systems integration architectural framework to assess integration feasibility. After the integration items were grouped into major attributes, a survey was solicited from experienced systems engineers, systems integrators, space integrators, and DoD space acquisition person- nel. The goal of the survey was to verify the derived integration attributes and help determine how the attributes are weighted when used in the sys- tems integration assessment framework. Understanding the feasibility of integration ahead of time would help identify risks and mitigation strategies.

The goal of integrating desired capabilities into a family of DoD space/ satellite systems is defined notionally in Figure 1. The desired capability of DoD Space Systems is usually integrated into a sensor, which becomes the payload of a satellite. The payload is integrated into a satellite bus and both systems become the satellite system. The satellite system is integrated into ground systems for communications and data exchange. The ground systems are the satellite operations centers, ground antennas, tracking stations, and the locations that process the mission data transmitting from the satellite system.

FIGURE 1. NOTIONAL DOD SPACE SYSTEMS ARCHITECTURE AND MAJOR INTEGRATION POINTS

Family of Satellite Systems End Products & Capabilities

Ground Systems Desired Capabilities Payload/Sensor Satellite/Bus

Launch Vehicle

499 Due to its uniqueness, the integration of desired capabilities into the pay- load is usually the most challenging effort in development and integration of space systems. After the payload has been developed and integrated into the satellite bus, it is rigorously tested on the ground. From there, the sat- ellite system is integrated onto a launch vehicle (its ride to get into orbit). The satellite system is also integrated into the family of systems to ensure interoperability with existing systems and produce the final product/data to its users. The launch vehicle is integrated into the ground systems and the satellite system. The ground systems enable communication with all systems throughout launch, deployment, and operations of the satellite system. The desired end state or successful SoS integration includes suc- cessful launch and deployment of a satellite system in orbit that meets (or exceeds) performance of all systems involved.

The goal is to develop a systems integration assessment framework tool based on the attributes identified through data analysis. It is imperative that assessment of complex systems must address common interfaces, interop- erability, reliability, and operations (Bhasin & Hayden, 2008). Methods and processes in systems engineering are used for defining requirements and provide the framework to identify a generalized architecture (Bhasin & Hayden, 2008). One of the major goals is to understand how integra- tion issues can be addressed early in the acquisition process or planning stages to reduce the number of integration chal- lenges and complexities later in the systems engineering process (Jain et al., 2010).

Cost for systems integration can increase substantially, and implementing processes to understand enterprise-wide systems integration require- ments must be considered before deciding on a solution and architecture (Bolloju, 2009). Based on findings by Kruchten, Capilla,

500 and Duenas (2009), architectural elements are the primary constituents of describing “components and connectors, while nonfunctional properties shape the final architecture” (p. 36).

To improve integration of SoS, the understanding of integration issues must first be identified. Table 3 defines the process.

TABLE 3. PROCESS FROM DATA COLLECTION TO INTEGRATION ASSESSMENT FRAMEWORK DEVELOPMENT 1. Collect high level integration issues from six major DoD major Space Systems from 1999–2014 2. Identify the integration variable that is impacted by each integration issue 3. Group the integration variables based on common focus areas 4. The integration variables will define the parameters for the final set of attributes 5. Analyze how each attribute is distributed throughout all six major space systems in the 16 years of data sets 6. Complete regression analysis of the attributes and determine weights of each attribute 7. Validate attributes and the type of integration variable associated with each attribute by completing a survey of systems integration experts in DoD 8. Analyze survey results using regression and correlations 9. Compare both data analysis and survey analysis; identify final weight of each attribute 10. Based on the analysis, develop a systems integration framework tool with weighted attributes to help facilitate SoS integration process

Once the integration assessment framework is developed, it can be used to help facilitate SoS integration. The output of the framework is the IRL score for the entire SoS, and as each major attribute matures, the IRL score improves. The framework can be used throughout the design, development, and deployment of the SoS. The stakeholders or designated Subject Matter Experts (SMEs) provide upfront analysis and agree on what constitutes the initial IRL level for each of the major attributes. The initial IRL levels for each attribute will be the inputs to the framework. The initial IRL definition for each attribute will be identified from 1 to 9 (using Table 1 as a reference) once the criteria are defined for each major attribute.

Data Collection and Exploration Major Space Systems The integration data will be collected from six major DoD Space Systems. As defined in DoD Instruction 5000.02 (2017), the six space systems are considered “Major Defense Acquisition Programs (MDAP) Acquisition

501 Category I (ACAT I), which are defined as overall program procurement cost of over $2.79 billion (using Fiscal Year 2014 constant dollars)” (p. 49). The major space systems are: (a) Advanced Extremely High Frequency (AEHF) Satellite Communications (SATCOM); (b) Evolved Expendable Launch Vehicle (EELV); (c) Global Positioning Satellite (GPS); (d) National Polar- Orbiting Operational Environment Satellite System (NPOESS); (e) Space Based Infrared Systems (SBIRS); and (f) Wideband Global SATCOM (WGS).

For 16 years from 1999–2014, major integration reporting data from the Director, Operational Test and Evaluation (DOT&E), with additional GAO supporting documents, were collected (DOT&E, n.d.), which explained major integration issues encountered by the six major space programs for each year. The issues identified in the reports are explained, but additional background information was gathered from the U.S. Air Force Space & Missile Systems Center (SMC) program offices. Additional interviews of SMC personnel to gather background information were also conducted for better understanding of some of the issues. These space acquisition programs have encountered challenges in planning, designing, developing, integrating, testing, and deploying different systems throughout the 16-year reporting period. Each major issue identified per year was counted as one issue (i.e., if one issue persisted through 4 years, it will be counted as four issues). Table 4 presents the description and data collection summary of each of the six DoD space systems.

TABLE 4. SIX MAJOR SPACE PROGRAMS: 1999–2014

Description Data Collection Summary

1. AEHF (Advanced Extremely High There were 79 major integration issues Frequency) Satellite Communications identified for AEHF SATCOM during (SATCOM) is a communications satellite the 16-year span. Some of AEHF major SoS that provides secure and survivable integration issues included: testing communications capability to U.S. military processes for digital processors, during any level of conflict. The AEHF integrating nuclear hardening and capability ensures continuous secure shielding, electric propulsion, phased communication during all levels of conflict array antenna, and nuller spot beam. for warfighters.

2. EELV (Evolved Expendable Launch There were 24 major integration issues Vehicle) is an SoS launch vehicle and identified for EELV during the 16-year service provided to DoD and commercial span. Some of the major issues were customers. It was developed to provide data availability to the stakeholders launch capability that reduces the cost of for test verification processes, alternative launch options while meeting documentation of flight analysis plan, mission assurance requirements. and integration of several design processes.

502 TABLE 4. SIX MAJOR SPACE PROGRAMS: 1999-2014 (continued)

3. GPS (Global Positioning System) is a There were 71 major integration issues major SoS, providing data that both DoD identified for GPS during the 16-year and civilian users are dependent upon to span. Some of the major issues were precisely determine their velocity, time, and communications between cross-link position. The GPS constellation calls for 24 systems, development and integration satellites that provide real-time and highly of GPS Military-Code and civil signal accurate positioning and location data. capabilities, integration of receivers It can function through all weather and into major platforms (i.e., ships and provides passive data to worldwide users. aircraft), and refinement of end-to-end test strategy.

4. NPOESS (National Polar-orbiting There were 18 major integration issues Operational Environmental Satellite identified for NPOESS during its System) was an enhanced weather SoS existence. Some of the major issues program that was administered by three included field terminal integration different government organizations: DoD’s and interoperability, integration of Air Force; Department of Commerce quality environmental data record, (DoC)’s National Aeronautics and Space and integration challenges of Administration (NASA); and DoC’s National program sensor design and algorithm Oceanic and Atmospheric Administration development. (NOAA). NPOESS provided the platform to acquire satellite SoS to collect and disseminate environmental data for the three organizations. Although development planning and integration started in 1999, the program was eventually cancelled in 2011. There are many valuable integration lessons learned and information to be derived from the program throughout the years. 5. SBIRS (Space Based Infrared System) There were 46 major integration is a remote sensing satellite SoS that issues identified for SBIRS during supports DoD by providing unprecedented the 16-year span. Some of the major timely and improved infrared data quality. integration issues included defining SBIRS’s top missions of improving missile multiple test strategies, development warning and missile defense have been of Modeling & Simulation systems to realized through complex development support operations of integration and and integration processes. It has also testing, ground software development greatly enhanced the other mission areas and integration, and flight software of technical intelligence and battlespace integration into satellite. characterization.

6. WGS (Wideband Global SATCOM) SoS There were 33 major integration enables communication to U.S. military issues identified for WGS during the and Coalition partners and allies during 16-year span. Some of the major issues war and all levels of conflicts. Its wideband included integration complexity of capability provides greater bandwidth to X-band and Ka-band cross-banding transfer data during conflicts. for the on-board satellite, integration technical issues on orbital placement, frequency reuse, and launch services integration, interoperability complexity of control software development.

503 Attributes Derived from Data Collection To understand the impact of each major integration issue, each issue was aligned with an integration variable to best describe the major area impacted by the issue. The integration variables were then grouped together based on similar focus areas and aligned with their contributing attri- bute. Interviews with integration experts and additional research were conducted to scope the right number of attributes from the initial list. The definition of each potential attribute continued to be refined based on the different groups of integration variables supporting them. After identifying the integration issues and integration variables, the list of con- tributing attributes was finalized. Each integration variable was assigned to a major attribute, which also helped define the criteria for that attribute. A discussion of the resultant list of seven final attributes follows.

Availability. To enable integration activities, Availability is critical to understanding the benefits to be derived from accessibility of supporting hardware, software, documentation, and expertise. Integration of new systems into legacy systems may be made more difficult by improper or insufficient support documentation or the unavailability of expert per- sonnel/training expertise. This attribute is critical in identifying these supporting entities. The items derived from the integration issues that sup- port this attribute are access to high TRL supporting systems, subsystems and components, and access to supporting documentation and expertise.

The availability of the right supporting systems is critical to the success of the overall system delivery. To help address Availability, the reuse of existing infrastructure and the use of COTS items are identified before integration activities. To meet time commitments, the reuse of systems, subsystems, and components from the current infrastructure are highly encouraged with the possibility of replacing them with COTS items on future updates (Tyree & Akerman, 2005).

Complexity. Complexity is the attribute that manages the technical risks of integration. The items derived from the integration issues that support this attribute are managing low technical maturity of system design, development, and integration, and determining complex interfaces. According to Jain et al. (2008), “complexity can be defined as the degree to which a system or component has a design or implementation that is difficult to understand and verify” (p. 211). The complexity of systems integration has significant influence on driving schedule delays (Dubos, Saleh, & Braun, 2008). Complexity affects the degree of complication of several factors such

504 as the number of interfaces and intricacies of different components (Jain et al., 2008). Although it has been widely studied, there is no defined mathe- matical model for systems related to Complexity (Haghnevis & Askin, 2012).

Interoperability. Interoperability is the attribute that will address compatibility, connectivity, and communication between systems, sub- systems, and components. The items derived from the integration issues that support this attribute are understanding compatibility and interface issues, addressing semantic/syntactic issues, and managing communica- tions issues between systems. Interoperability assessments can be very challenging in SoS because of the different testing involved to verify their functionality (Lin et al., 2009). According to Madni and Sievers (2014), “interoperability is the ability of distinct systems to share semantically compatible information and then process and manage the information in semantically compatible ways, enabling users to perform desired tasks” (p. 342). Interoperability can be improved, which means that the metrics to measure it can be defined. However, Interoperability is still a complex and broad topic, and its condition may not be easily quantified (Rezaei, Chiew, & Lee, 2013).

Management. Management is the attribute most critical to planning and developing integration strategies. The items derived from the inte- gration issues that support this attribute include leading coordination between stakeholders, providing guidance and directives, determining scope, managing requirements, and developing and implementing policies and agreements. Management ensures that the stakeholders’ derived in- tegration requirements are met throughout the integration process. Man- aging stakeholders’ decisions is critical in shaping the final design of the system (Booch, 2006). Management must be able to navigate through the challenges of diminishing budgets, changing politics, and evolving tech- nologies to support integration activities.

The Management methods for integration address issues of philosophy, operation, and collaboration (Tien, 2008). Management addresses re- quirements traceability. As indicated by Piaszcyk (2011), “traceability is used to evaluate the impact of top-level requirements changes” that can help eliminate requirements creep (p. 325).

Processes. Processes is the attribute that enables the development, documentation, and execution of all integration activities. The items derived from the integration issues that support this attribute are processes that develop, document, and execute the following: testing, verification and validation, configuration management, training,

505 (M&S), M&S activities, information assurance, mission assurance, com- munications and decision making, security, safety, quality assurance, and manufacturing and assembly. Processes implemented during integration must be clearly defined and documented. Many organizations’ efforts in key process areas for integration did not have much success due to ill-defined Processes (Djavanshir & Khorramshahgol, 2007). Processes for integration include standards, procedures, and algorithms associated with integration activities (Tien, 2008).

The goal is to identify necessary Processes needed for successful integration and the impacts to the integration requirements. Great examples from the space integration data also explain how the integration issues are impacted due to the different test and verification Processes that have been imple- mented. Using M&S as part of the integration process will help minimize risks. Computer-based M&S early in the process can help thrust systems engineering methods throughout the integration process (Ender et al., 2010).

Resources. Resources is the attribute that enables integration of SoS through funding and providing the right integration tools. The items derived from the integration issues that support this attribute are sufficient funding to support all integration activities, providing the right amount of trained personnel, and providing the right tools and facilities throughout the integration activities. Resources planning throughout SoS integration is critical to overall success. During the early phase of integration, informa- tion is scarce, but SMEs are potential sources of information with valuable inputs to address life-cycle cost of development and overall integration of a system (Tien, 2008).

Schedule. Schedule is the attribute that manages Schedule impacts throughout the integration activities. The items derived from the integra- tion issues that support this attribute are understanding timeline goals (i.e., need dates, expected deliveries, and milestones), understanding sched- uling parameters, and understanding critical path and impacts to Schedule delays. Schedule can be impacted by several of the attributes mentioned; however, an independent assessment is needed to understand how different changes impact scheduled milestones and activities. Need date and allowed time of integration must be established early on with analysis for fluctu- ations and critical path. Usually, the more time given for integration, the better the IRL score, which can be attributed to system maturity through time and processes.

506 Schedule delay usually increases the overall cost of the system and may impact the current requirements (i.e. launching a satellite at a certain month/year as required). Based on Dubos et al. (2008), “schedule slippage plagues the space industry and is autonomic with the recent emphasis on space responsiveness; the GAO has repeatedly noted the difficulties encoun- tered by the DoD in keeping its acquisition of space systems on schedule, and they identified the low TRL of the system/payload under development as the primary culprit driving schedule risk and slippage” (p. 386).

Space Integration Data Using Derived Attributes Some of the attributes derived from the data may slightly impact the other attributes based on how they are interpreted. Therefore, the stake- holders need to establish clear definitions (as summarized previously), understand the distinctions between attributes, and document assumptions being made for each major attribute. Each integration issue is aligned with an integration variable and then grouped with one of the seven major attri- butes based on their focus areas. Tables 5a to 5f capture the top issues for each attribute from each space system. They show how a specific integration issue is aligned to an integration variable and how it becomes a subset to one of the seven attributes. Tables 5a to 5f also provide the number of years that the issue impacted each of the space systems.

507 TABLE 5A. TOP ISSUES FOR EACH ATTRIBUTE IN AEHF SATCOM

AEHF Satellite Communications (SATCOM) Integration Contributing # of Yrs Integration Issue Variable Attribute Impacted Nuclear hardening and shielding System Design Complexity 6 integration into system Maturity Lack of terminal synchronization Communications Interoperability 4 within system between Systems Planning of space, mission control, Strategy Management 3 and user segment integration & Development synchronization strategy

Testing anti-jam capabilities; Test Process Processes 3 test processes to evaluate AEHF capabilities

Fidelity of available system test Supporting Availability 2 simulator System Availability Schedule issues by development Schedule Impacts Schedule 1 contractor to resolve first-time test article and test fixture problems

Insufficientbudget to fund Insufficient Resources 1 additional filters to meet High- Funding Altitude Electro Magnetic Pulse (HEMP) testing and certification

TABLE 5B. TOP ISSUES FOR EACH ATTRIBUTE IN EELV

Evolved Expendable Launch Vehicle (EELV) Integration Contributing # of Yrs Integration Issue Variable Attribute Impacted Design issues on reliability, logistics, System Design Complexity 2 supportability, multiple payload Maturity interfaces, and information assurance Lack of contractual requirement for Contracted Management 2 test reporting Agreement and Policies Insufficientprocesses to obtain Verification Processes 2 data of system qualification tests Processes for verification Lack of additional development Insufficient Resources 1 and integration funding to satisfy Funding government and industry launches Availability of modified ground Supporting Availability 1 system System Availability Integration of legacy upper stage System Interfaces Interoperability 1 to the new launch vehicle design and Compatibility No major integration issue identified that the Schedule attribute impacted

508 TABLE 5C. TOP ISSUES FOR EACH ATTRIBUTE IN GPS

Global Positioning System (GPS) Integration Contributing # of Yrs Integration Issue Variable Attribute Impacted Ongoing issues with the integration of Communication Interoperability 8 GPS receivers into different platforms between (ships, aircraft, etc.) Systems Availability of M-code and civil signal Supporting Availability 7 capabilities through development and System integration Availability Planning and refinement of integration Planning Test Management 7 and test strategy; insufficient rigorous Strategy end-to-end planning Configurationmanagement process Configuration Processes 3 issues on operations control system Management development and integration Process Schedule delays impacting integration, Schedule Schedule 3 tests, and supporting resources Impacts Design issues on Control Software System Design Complexity 1 Integration Maturity Funding availability to system Insufficient Resources 1 development and integration causing Funding event delays

TABLE 5D. TOP ISSUES FOR EACH ATTRIBUTE IN NPOESS

National Polar-Orbiting Operational Environment Satellite System (NPOESS) Integration Contributing # of Yrs Integration Issue Variable Attribute Impacted Design issues with the quality of System Design Complexity 2 environmental data record on algorithm, Complexity sensor performance, quality control of interface data processing segment Insufficientinteroperability of field System Interoperability 2 terminals and ground systems Interoperability Funding issues to support Insufficient Resources 1 interoperability of high rate data in X Funding Band and low rate data in L Band Inconsistent agreement and definitions Stakeholder Management 1 of different decision makers’ roles and Agreements responsibilities Lack of documentation process of Documentation Processes 1 adequate threshold definitions in data Processes terminals Lack of available space environment Supporting Availability 1 sensors that can be used for integration, Subsystem test, and upgrades Availability No major integration issue identified that the Schedule attribute impacted

509 TABLE 5E. TOP ISSUES FOR EACH ATTRIBUTE IN SBIRS

Space Based Infrared System (SBIRS) Integration Contributing # of Yrs Integration Issue Variable Attribute Impacted Insufficient definition of integrated Strategy Management 6 test strategy to meet current schedule, Development ground architecture, and overall systems operational requirement Lack of tools and resources to support Insufficient Tools Resources 3 M&S development of operational and Resources integration and testing Availability of ground software Supporting Availability 3 delivery for integration (delayed in Software development) Availability Degraded test data from Sensor Test Assembly Processes 3 Integration Lab due to Infrared Starer Process sensor and telescopic assembly process Communication issues of space control Communications Interoperability 3 and telemetry, and flight software between integration in satellite Systems Compressed schedules to accreditation Schedule Schedule 2 of operational test scenarios increased Impacts software integration issues Software instability in tracking, Software Design Complexity 1 telemetry, and control functions Maturity

510 TABLE 5F. TOP ISSUES FOR EACH ATTRIBUTE IN WGS

Wideband Global SATCOM (WGS) Integration Contributing # of Yrs Integration Issue Variable Attribute Impacted Complexity of designing and System Design Complexity 4 integrating effective cross-banding Maturity between X-band and Ka-band onboard the satellite Lack of software tools to support Insufficient Tools Resources 2 development of system configuration control element System interoperability issues of WGS System Interoperability 2 and transmission system Interoperability Insufficientdocumentation processes Document Processes 1 to support development and Process integration Ill-defined requirement definitions to Strategy Management 1 separate legacy and new systems on Development orbit and Requirements Management Insufficientschedule assessment Schedule Schedule 1 to account for development and Impacts integration Lack of upgraded automation Supporting Availability 1 software operations center networks System Availability

The result of the data collection based on the attributes gathered from the six major DoD Space Systems is summarized in Table 6. The total number of major integration issues collected from the six DoD space systems for the 16 years is 271 with a total of 35 for Availability, 51 for Complexity, 64 for Interoperability, 37 for Management, 52 for Processes, 16 for Resources, and 16 for Schedule.

511 Schedule 1 0 1 0 1 3 1 1 1 1 2 0 3 1 3 0 3 3 0 2 0 2 0 1 0 1 0 0 0 0 0 1 16 16 Resources 1 1 2 2 2 5 5 3 3 8 8 8 4 0 0 0 52 Processes 1 1 1 2 2 2 3 3 5 3 3 3 4 4 0 0 37 Management 1 1 2 2 3 8 8 9 6 6 6 4 4 4 0 0 64 Interoperability 1 1 7 2 7 2 7 5 9 0 0 0 0 0 0 51 10 Complexity 1 1 1 2 2 2 3 3 3 5 4 4 4 0 0 0 35 271 Availability TABLE 6. SUMMARY OF INTEGRATION DATA: SIX DOD MAJOR SPACE SYSTEMS (1999–2014) SYSTEMS SIX DOD MAJOR SPACE DATA: OF INTEGRATION 6. SUMMARY TABLE 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 TOTALS TOTAL (MajorTOTAL Integration Issues/Year) Total Major Integration Issues

512 Data Analysis Regression analysis was completed to assess the current data and how the attributes impacted the different years. The summary of analysis is highlighted in the Normal Probability Graph shown in Figure 2.

The resulting numbers are provided in Tables 7a-d, with Table 7d providing the regression equation of the relationship between the attributes and the 16 years defined.

FIGURE 2. NORMAL PROBABILITY GRAPH OF 7 ATTRIBUTES DERIVED FROM COLLECTED DATA

99

95 90 80 70 60 50

PERCENT 40 30 Variable Availability 20 Complexity Interoperability 10 Management Processes 5 Resources Schedule 1 -5.0 -2.5 0.0 2.5 5.0 7.5 10.0 12.5 DATA

513 TABLE 7A. SUMMARY STATISTICS

Standard Variable N Mean Minimum Minimum Deviation Availability 16 2.1875 1.6008 0 5 Complexity 16 3.1875 3.6555 0 10 Interoperability 16 4 2.9212 0 9 Management 16 2.3125 1.4477 0 0 Processes 16 3.25 2.8402 0 8 Resources 16 1 1.1547 0 3 Schedule 16 1 1.0328 0 3

TABLE 7B. ANALYSIS OF VARIANCE Source DF P-Value Regression 7 0.1177 Schedule 1 0.9536 Resources 1 0.3814 Processes 1 0.2031 Management 1 0.5817 Interoperability 1 0.7945 Complexity 1 0.1569 Availability 1 0.1393

TABLE 7C. MODEL SUMMARY

S R-sq R-sq(adj) 3.6838 68.07% 40.13%

TABLE 7D. REGRESSION EQUATION

Year = 15.140 − 0.070 Schedule − 2.183 Resources + 2.341 Processes − 0.5640 Management − 0.290 Interoperability − 1.5063 Complexity − 2.162 Availability

The highest mean is Complexity, which also had the highest standard deviation. The lowest average is 1.0 for both Resources and Schedule, but with a lower standard deviation on Schedule. The null hypothesis in this case is that all the attributes have a relationship with Year. Based on the

514 analysis, the P-Value for each attribute is greater than 0.05; therefore, all attributes have an impact on the Year. For statistical measure, R-squared helps explain how close the data are to the fitted regression line. The good- ness of fit (R-squared) shows that it is 68.0 percent, which is just below the acceptable 75 percent value. This shows that there is correlation between the variables and the Year. A modified version of the R-squared value is the Adjusted-R-squared, which accounts for the number of predictors in the model. In this case, the Adjusted-R-squared significantly drops the R-squared value. The Regression Equation helps with predicting attributes for the upcoming years.

Validation through Survey The integration data were very useful in identifying critical attributes to support a framework for facilitating SoS integration. SoS integration is complex, and the ability to develop a systems integration assessment framework tool to determine integration feasibility is necessary for system integration planning. Although the previously described seven attributes can help improve the integration feasibility assessment, they need to be validated by integration experts.

Integration of new systems into legacy systems may be made more difficult by improper or insufficient support documentation or the unavailability of expert personnel/training expertise.

Therefore, a survey was conducted to help validate the usefulness of the seven major integration attributes. The survey lasted 4 weeks and targeted 159 systems integration experts with varying experiences in systems engi- neering, DoD acquisition, space systems, and/or complex SoS. From the 159 integration experts surveyed, 88 responded, which helped validate the attributes derived from the data of the six major space systems (Table 4). The 88 participants were systems engineers, DoD acquisition professionals, and SMEs who primarily have backgrounds in space systems.

515 The survey asked the participants to list specific systems on which they worked (current and prior experience), their specific field (systems engi- neering, DoD acquisition, systems integration etc.), and years of experience in their chosen field of work. Once their credentials had been provided, they were asked to rank the attributes from one to seven—one being the most important; the respondents were then asked if they agreed with the seven attributes. They were also given the opportunity to provide additional attributes, and to explain if they agreed or did not agree with the list of attributes and the integration items supporting each attribute. They were asked to provide examples on how useful an SoS integration framework using IRL principles would be in helping with integration activities. Finally, a set of questions was asked to provide examples based on their experience on major integration risk areas, integration timelines, and potential use- fulness of IRL.

Summary of Survey Results There were a total of 18 questions in the survey; some questions, how- ever, were for clarification of responses to prior questions. The survey participants were a very experienced group, with an average of 19.4 years of experience in their respective field. Of the 88 participants, only seven said they were not in complete agreement with the attributes; however, their con- cerns were all addressed throughout the research process and through the updated definition for each attribute. The concerns were: exclusion of man- agement guidance (addressed under Management); exclusion of concept of operations, which is developed as part of the scope of integration (also a Management task); exclusion of M&S tools (addressed under Processes); exclusion of overall planning (addressed under Management); exclusion of requirements (which is outside the scope of this research, but addressed in Problem Focus); and exclusion of risk areas (technical risks were addressed in Complexity and overall program risks are the output of the systems inte- gration assessment framework). Most of the survey participants agreed that given the right supporting tools, an enhanced IRL assessment would be useful. Based on the results of the initial research (literature reviews and interviews with system integration experts), data collection, and expert survey, the final criteria for each attribute are summarized in Table 8.

516 TABLE 8. CRITERIA FOR EACH ATTRIBUTE USING DERIVED INTEGRATION VARIABLES

Major Integration Variables (Criteria to Major Attributes) Attributes Access high TRL supporting systems, subsystems, and components (i.e., COTS, operational ground systems, operational simulators) Access overall expertise and supporting documentation to execute Availability integration activities Access supporting hardware and software to execute integration activities (i.e., COTS, Hardware/Software reuse) Manage technical risks and low maturity of system design, development, and integration Complexity Determine complex interfaces between systems, subsystems, and components Manage low TRL supporting systems, subsystems, and components Understand compatibility and connectivity of systems, subsystems, and components; inputs into integrated systems produce desired outputs Manage communications between systems, subsystems, and components; Interoperability address semantic/syntactic issues Develop and document interface control documents Plan and develop integration strategies and priorities Lead coordination, execution, and communication between stakeholder organizations Provide guidance and directives to integration activities Management Determine scope of integration activities and concepts of operation Manage requirements changes (i.e., eliminate/minimize requirements creep) Manage support to integration activities and changes (i.e., supply chain) Develop and document policies and agreements with all stakeholders Document overall integration activity processes (i.e., overall communication and decision-making processes) Develop, document, and execute risks/mitigation processes Develop, document, and execute configuration management processes Develop, document, and execute information assurance and mission assurance processes Develop, document, and execute standards, procedures, and algorithms Processes Develop, document, and execute M&S activities Develop, document, and execute test, verification, and validation processes Develop, document, and execute security and safety considerations Develop, document, and execute Training processes Develop, document, and execute manufacturing and assembly processes Allocate sufficient funding to support all integration activities Provide trained personnel to support integration activities Resources Provide the right tools and facilities to support integration activities Meet timeline goals (i.e., need dates, expected deliveries, milestones etc.) Understand schedule impacts, parameters, and critical path Schedule Plan for schedule changes and impacts of delays

517 All 88 participants ranked the seven attributes in order. Table 9 captures the first and last five survey entries. The ranking of the attributes is distributed by having the No. 1 ranked attribute equal to 7/7 (1.0). Values for the subse- quent ranking: No. 2 is 6/7 (or 0.857), No. 3 is 5/7 or (0.714), No. 4 is 4/7 (or 0.571), No. 5 is 3/7 (or 0.429), No. 6 is 2/7 (or 0.286), and No. 7 is 1/7 (or 0.143)

TABLE 9. SURVEY RESULTS: YEARS OF EXPERIENCE AND RANKING OF ATTRIBUTES (PART I)

Years # Availability Complexity Interoperability Management Processes Resources Schedule of Exp. 1 39 1.000 0.857 0.714 0.571 0.429 0.286 0.143 2 35 0.143 0.714 0.571 0.286 1.000 0.857 0.429 3 40 0.571 0.714 0.857 1.000 0.429 0.286 0.143 4 34 0.714 0.857 1.000 0.429 0.571 0.286 0.143 5 16 0.857 0.143 0.286 0.714 0.571 1.000 0.429 • • • • • • • • • • • • • • • • • • • • • • • • • • • 84 26 1.000 0.286 0.857 0.143 0.571 0.714 0.429 85 27 0.286 0.429 0.571 1.000 0.857 0.714 0.143 86 14 0.571 0.857 0.714 1.000 0.143 0.286 0.429 87 35 0.286 0.429 0.143 1.000 0.571 0.857 0.714 88 8 1.000 0.571 0.143 0.857 0.429 0.286 0.714

Survey Analysis Weights of the attributes will be determined using the years of expe- rience and the ranking of each attribute. The Years of Experience is set up where the values are a lot higher than the value of ranking the attributes. To reduce the skewness of the dependent variable (Years of Experience), the log function will be used for the regression analysis. The log function will help normalize the Years of Experience relative to the scores given to attri- bute rankings. The results of the survey showed that almost all participants were in agreement with the derived attributes.

The summary of analysis is highlighted in the Normal Probability Graph in Figure 3. The resulting numbers are provided in Tables 10a to 10d, with 10d providing the Regression Equation for the Log Years and the attributes. The ranking of each attribute differed with each person, but there was a clear emphasis on the two highest attributes (Management and Resources) as indicated in Table 10a as well as two of the lowest ranked attributes in Schedule and Interoperability. Based on the analysis, the P-Values for all attributes are slightly greater than 0.05; therefore, the null hypothesis for

518 each attribute impacting the Log Years of Experience is true. The goodness of fit (R-squared) is a very low 18.84 percent, which is expected due to the ranking of each variable where no variable has the same value per sample.

FIGURE 3. NORMAL PROBABILITY PLOT FOR ALL ATTRIBUTES BASED ON SURVEY

99.9

99.0

95.0 90.0 80.0 70.0 60.0 50.0

PERCENT 40.0 30.0 20.0 Variable Availability 10.0 Complexity 5.0 Interoperability Management Processes 1.0 Resources Schedule 0.1 -5.0 0.0 0.5 1.0 1.5 DATA

TABLE 10A. SUMMARY STATISTICS Variable N Mean Standard Deviation Minimum Minimum Availability 88 0.59569 0.2618 0.143 1 Complexity 88 0.55038 0.27935 0.143 1 Interoperability 88 0.44165 0.27522 0.143 1 Management 88 0.69642 0.28298 0.143 1 Processes 88 0.58276 0.26971 0.143 1 Resources 88 0.69963 0.25802 0.143 1 Schedule 88 0.43672 0.25645 0.143 1

519 TABLE 10B. ANALYSIS OF VARIANCE Source DF P-Value Regression 7 0.0161 Availability 1 0.0883 Complexity 1 0.0716 Interoperability 1 0.0503 Management 1 0.1185 Processes 1 0.0697 Resources 1 0.109 Schedule 1 0.0876

TABLE 10C. MODEL SUMMARY S R-sq R-sq(adj) 0.283784 18.84% 11.74%

TABLE 10D. REGRESSION EQUATION Log Years = −5.925 + 1.776 Availability + 1.857 Complexity + 2.051 Interoperability + 1.601 Management + 1.888 Processes + 1.654 Resources + 1.758 Schedule

Comparing Results Evaluating the mean and standard deviation for each attribute for both the integration data analysis and the survey analysis will determine the weight for each variable to be used in the integration feasibility framework. Tables 11a to 11c illustrate the derived weight of each attribute.

Based on the percentage for each attribute, there are a lot of disconnects between the integration data analysis and the expert survey analysis on how the attributes should be weighted. The closest numbers for the Data and Survey are Availability at about 12–14 percent and Management at about 13–17 percent. Using the standard deviation, the High and Low for the Means have also been calculated. The percentages for the High and Low values show significant variances. In this case, using the average of two Means will provide a better estimate to use as the final weight of each attribute (Table 11c).

520 Systems Integration Assessment Framework The integration requirements are addressed through an architectural framework, which according to Jain et al. (2010), “includes both the physi- cal and functional architectures of the system” (p. 279). Based on the data analysis, subsequent attributes, and the survey analysis to validate the weights of the attributes, a systems integration assessment framework to determine an enhanced IRL value has been developed. When the models of a system are related through transformation, the collective frames can be referred to as a framework (Lin et al., 2009). In this case, the items that were derived from the integration data (and identified by the stakeholders) were used to support the major attribute.

TABLE 11A. MEAN AND STANDARD DEVIATION FOR INTEGRATION DATA AND SURVEY RESULTS

Integration Comparing Integration Survey, Survey, Standard Data, Standard Analyses Data, Mean Mean Deviation Deviation Availability 2.188 1.601 0.596 0.262 Complexity 3.188 3.656 0.550 0.279 Interoperability 4.000 2.921 0.442 0.275 Management 2.313 1.448 0.696 0.283 Processes 3.250 2.840 0.583 0.270 Resources 1.000 1.155 0.700 0.258 Schedule 1.000 1.033 0.437 0.256

TABLE 11B. PERCENTAGE OF EACH MEAN AND STANDARD DEVIATION FOR INTEGRATION DATA AND SURVEY RESULTS Integration Integration Integration Survey, Survey, Survey, Weight Data, Mean Data, High Data, Low Mean High Low Availability 12.9% 12.0% 25.7% 14.9% 14.6% 15.7% Complexity 18.8% 21.7% -20.5% 13.7% 14.1% 12.8% Interoperability 23.6% 21.9% 47.2% 11.0% 12.2% 7.9% Management 13.7% 11.9% 37.8% 17.4% 16.6% 19.5% Processes 19.2% 19.3% 17.9% 14.6% 14.5% 14.8% Resources 5.9% 6.8% -6.8% 17.5% 16.3% 20.8% Schedule 5.9% 6.4% -1.4% 10.9% 11.8% 8.5%

521 TABLE 11C. DERIVED WEIGHT OF EACH ATTRIBUTE

Average Weight Mean Total Final Weight Availability 27.8% 13.9% Complexity 32.6% 16.3% Interoperability 34.6% 17.3% Management 31.1% 15.5% Processes 33.7% 16.9% Resources 23.4% 11.7% Schedule 16.8% 8.4%

In architectural framework, understanding how the problem is framed is the most important step because it helps define the subsequent steps. Frameworks and standards define what should be modeled as opposed to which models should be used and how these models are related to one another (Zalewski & Kijas, 2013). According to Martin (2008), “problem framing techniques have been found to be useful in avoiding rework and maximizing results” (p. 306). Architectural approaches include conveying options, change, and implications, as well as identifying easier traceability by providing agile documentation (Tyree & Akerman, 2005). The change, based on initial IRL scores to enhanced IRL scores, supports this approach. Architecture analysis includes a key foundation that provides consistency, data completeness and transformation, lack of ambiguity, and flexibility to support iterative processes (Ge, Hipel, & Chen, 2014). Static Analysis is used in which, according to Ge et al. (2013), helps leverage “static architec- tural models for analyzing and comparing the associations captured from the data elements” (p. 370).

The resulting systems integration assessment framework to facilitate SoS integration is illustrated in Figure 4.

In the framework, one of the initial steps is to determine the IRL value of the integration items supporting each attribute by using the current IRL definitions as guidelines. Tables 12a and 12b provide guidance on how initial IRL for each attribute is defined.

522 Validate IRL Score; necessary options to identify risk identify adjust scope if scope adjust IRL SCORE ENHANCED areas; determine areas; (8.4%) (11.7%) (17.3%) (15.5%) (13.9%) (16.3%) (16.9%) Schedule Processes Resources Complexity Availability Management Interoperability Funding; Personnel; Funding; Right tools and facilities tools Right Timeline goals; Schedule Access supporting HW/SW Access Connectivity; Compatibility; Connectivity; Planning; Determine scope, Planning; Determine Low TRL supporting systems Low integration activity processes activity integration parameters; Impacts of Delays Impacts parameters; Develop, Document, and Execute and Execute Document, Develop, Communication between systems between Communication to expertise and documentation; and documentation; expertise to strategies and priorities; Guidance strategies & Directives; Manage Requirements & Directives; Technical Risks, Complex interfaces, interfaces, Risks, Complex Technical High TRL Supporting systems; Access Access High TRL Supporting systems; Initial IRL Assessment Tables 12a and 12b Tables FIGURE 4. SYSTEMS INTEGRATION ASSESSMENT FRAMEWORK FRAMEWORK ASSESSMENT INTEGRATION FIGURE 4. SYSTEMS each Attribute using IRL each Attribute principles is summarized in principles is summarized IRL Assessment Guideline to Guideline to IRL Assessment Activities Integration Scope of Scope Integration DECISIONS STAKEHOLDER New Systems Family of Family Capabilities Integration Requirements

523 TABLE 12A. GUIDLINES TO INITIAL IRL FOR EACH ATTRIBUTE USING IRL PRINCIPLES (PART I) IRL IRL IRL IRL IRL Availability Level Principles Complexity Interoperability 1 Interface All currently available Complexity Interfaces and has been and unavailable and technical connectivity identified supporting risk areas areas of systems, integration items of design, subsystems, and (systems, subsystems, development, component items components, expertise, and integration are identified and documentation) are are identified identified 2 Interaction Available and Technical Interaction has been unavailable items are risk areas are between items is specifically characterized characterized characterized characterized 3 There is Initial plans to obtain Initial Interoperability compatibility unavailable items are mitigation strategies are compatible with overall strategies are compatible strategy compatible to with integration technical risk activities areas 4 Sufficient Plans to obtain Sufficient Interoperability detail in unavailable items detail for strategies have quality and provide sufficient mitigation sufficient details assurance details strategies is defined 5 Sufficient Plans are delivered and Proposed Proposed control have sufficient control mitigation strategies are strategies are delivered and delivered and have sufficient have sufficient control control 6 Accept, Proposed plans are Mitigation Proposed translate, and accepted and executed strategies are strategies are structure accepted accepted and information executed 7 Verified and Plans are verified and Mitigation Proposed Validated validated strategies are strategies are verified and verified and validated validated 8 Mission- All items are delivered Mitigation Proposed Qualified and mission-qualified strategies strategies are are mission- mission-qualified qualified 9 Mission- All items are mission- Mitigation Proposed Proven proven strategies are strategies are mission-proven mission-proven

524 TABLE 12B. GUIDLINES TO INITIAL IRL FOR EACH ATTRIBUTE USING IRL PRINCIPLES (PART II) IRL IRL IRL IRL IRL IRL Schedule Level Definition Management Processes Resources 1 Interface Management All integration All resources Detailed has been strategies, activity and funding integration identified scope, processes are risks to schedule requirements, identified support and schedule and priorities integration risks are are identified are identified identified 2 Interaction Key Processes are All resources All schedule has been management characterized and funding items and specifically decisions are risks are risks are characterized characterized characterized characterized 3 There is Current Processes are Initial Initial compatibility strategies are documented mitigation mitigation compatible and are strategies strategies with compatible to funding to schedule management with risks are risks are decisions management compatible compatible strategies with with integration integration strategy strategy 4 Sufficient Management Processes Mitigation Mitigation detail in strategies have are sufficient strategies strategies quality and sufficient detail detail in have have assurance in quality and quality and sufficient sufficient assurance assurance detail in detail in quality and quality and assurance assurance 5 Sufficient Management Processes are Mitigation Mitigation control strategies are documented strategies strategies implemented with sufficient are delivered are delivered with sufficient control with with control sufficient sufficient control control 6 Accept, Management Processes are Mitigation Mitigation translate, and strategies, accepted and strategies for strategies structure goals, and executed funding risks for schedule information scope are are accepted risks are accepted and and executed accepted and executed executed 7 Verified and Management Processes are Mitigation Mitigation Validated strategies are verified and strategies are strategies are verified and validated verified and verified and validated validated validated 8 Mission- Management Processes Proposed Mitigation Qualified strategies are mission- strategies strategies are mission- qualified are mission- are mission qualified qualified qualified 9 Mission- Management Processes Mitigation Proposed Proven strategies are mission- strategies strategies are mission- proven are mission- are mission- proven proven proven

525 Approaches to determine IRLs include: (a) individual estimation by an SME; (b) group discussion estimation through meeting or conference; and (c) indi- vidual-group estimation where SMEs complete independent estimations and then discuss them with a group for a consensus decision (Tan, Ramirez- Marquez, & Sauser, 2011). Some of the current techniques that can be used to formulate the estimate of IRL assessment also include educated guess (not enough knowledge and not enough time), analogy (comparing prior work), and standards (developed within different organizations) (Tan et al., 2011). Using the framework for systems integration will be an iterative process.

The process of using the Systems Integration Assessment Framework, consisting of 11 steps, is as follows:

1. SoS integration begins with the concept of integrating new capabilities to either an existing Family of Systems or a new SoS.

2. Integration requirements are written clearly and approved by the stakeholders.

3. The Scope of Integration (i.e., cost, schedule, performance, integration planning, Concept of Operations etc.) are deter- mined by the stakeholders.

4. Based on the Scope, the integration activities can either pro- ceed (if integration has been done before) or assess the overall feasibility of the Scope of Integration.

5. To begin understanding the overall feasibility of integration, the stakeholders and designated SMEs determine IRL score for all possible integration items supporting each major attribute.

6. Using IRL principles (as described in Tables 12a and 12b), determine the initial IRL value for each attribute.

526 7. Use the derived weighted percentage to determine the

enhanced IRL score for the SoS (IRLSoS). The resulting equa- tion becomes:

IRLSoS = (IRLAvailability x 0.139) + (IRLComplexity x 0.163) +

(IRLInteroperability x 0.173) +(IRLManagement x 0.155) + (IRLProcesses x 0.169) +

(IRLResources x 0.117) + (IRLSchedule x 0.084)

8. The Enhanced IRL score will determine integration feasibility.

9. Stakeholders evaluate and validate the enhanced IRL score, identify risk areas, and determine options to adjust the scope of integration, if necessary.

10. Determining integration feasibility is an iterative process, and this framework allows the stakeholders to look at options to improve the scope when new information is obtained.

11. Assessment of IRL can also be tied to major events and reviews throughout the development and deployment processes. It allows program managers to make adjustments to their programs based on the results of the IRL assessment.

Conclusions Based on the research, integration data collected from the six major DoD Space Systems, and expert survey, seven attributes were derived in order to develop a framework that can facilitate SoS integration. A systems integration assessment framework tool provides an enhanced IRL score that will help determine integration feasibility in planning and facilitating integration activities. Based on a survey of experienced systems engineers, systems integrators, and DoD space acquisition personnel, the attributes were necessary to help plan and implement integration processes. The attri- butes were validated through the expert survey, and the results proved that other attributes were more important than those initially defined.

The attributes with their derived weights are the primary entities for the integration assessment framework tool. The tool can be used as many times as new information is provided, and adjusting the expectations and scope can help with understanding the feasibility of the IRL score. Calculating the enhanced IRL score is an iterative process with the thought that the IRL

527 score could improve through time due to additional information and adjust- ments to each attribute. Although the current DoD process of deploying space system capabilities for operational use does not require assessment of integration maturity, the result of this research should help quantify the feasibility of integrating SoS using IRL principles.

Limitations and Future Work Although several processes have been provided to assess TRL and IRL, there is still some subjectivity to these analyses. Every program has some uniqueness on how to integrate its systems, but the past data and past expe- rience can help put some rigor in the analysis. IRL has effective principles if it complements other analyses and helps with program manager decisions. Results can offer areas of emphasis as opposed to detailed solutions. For IRL to be used effectively, high-level assumptions need to be clear, upfront, and agreed to by stakeholders.

When grouped with other readiness metrics, IRL may have some applica- tions in DoD research and development (Ross, 2016), but IRL itself does not have the capacity to critically assess research and development efforts. It also does not evaluate cost and schedule (Sauser et al., 2010). A more quan- titative algorithm may be required for the assessment of IRL for complex systems. Without rigorous criteria, assessment for the determining IRL could lead to inaccurate analysis. Assessment of TRL and IRL sometimes can lead to oversimplification (Ramirez-Marquez & Sauser, 2009).

Although some of IRL may be subjective, there are several analyses using IRL principles that can be done to develop processes and help with program managers’ decisions on SoS integration. Throughout DoD acquisition and large complex programs, integration challenges will continue to persist, and the need for additional analysis and tools to overcome those challenges is real and necessary.

528 References Bhasin, K., & Hayden, J. (2008, June). Architecting network of networks for space system of systems. Paper presented at 2008 IEEE International Conference on System of Systems Engineering, Monterey, CA. Bolloju, N. (2009). Conceptual modeling of systems integration requirements. IEEE Software, 26(5), 66–74. Booch, G. (2006). On architecture. IEEE Software, 23(2), 16–18. Department of Defense. (2017). Operation of the Defense Acquisition System (DoDI 5000.02). Washington, DC: Author. Director, Operational Test & Evaluation. (n.d.). Annual reports. Retrieved from http:// www.dote.osd.mil/annual-report/index.html Djavanshir, G., & Khorramshahgol, R. (2007). Key process areas in systems integration. IT Professional, 9(4), 24–27. Dubos, G., & Saleh, J. (2010). Risk of spacecraft on-orbit obsolescence: Novel framework, stochastic modeling, and implications. Asta Astronautica, 67(1–2), 155–172. Dubos, G., Saleh, J., & Braun, R. (2008). Technology readiness level, schedule risk, and slippage in spacecraft design. Journal of Spacecraft and Rockets, 45(4), 836–842. Ender, T., Leurck, R., Weaver, B., Miceli, P., Blair, W., West, P., & Mavris, D. (2010). Systems-of-systems analysis of ballistic missile defense architecture effectiveness through surrogate modeling and simulation.IEEE Systems Journal, 4(2), 156–166. Gagliardi, M., Wood, W. G., Klein, J., & Morley, J. (2009). A uniform approach for system of systems architecture evaluation. Crosstalk, 22(3–4), 12–15. Ge, B., Hipel, K., & Chen, Y. (2014). A novel executable modeling approach for system- of-systems architecture. IEEE Systems Journal, 8(1), 4–13. Ge, B., Hipel, K., Kewei, Y., & Chen, Y. (2013). A data-centric capability-focused approach for system-of-systems architecture modeling and analysis. Systems Engineering, 16(3), 363–377. Gifun, J. F., & Karydas, D. M. (2010). Organizational attributes of highly reliable complex systems. Quality and Reliability Engineering International, 26(1), 53–62. Government Accountability Office. (2011). Space acquisitions: DoD delivering new generations of satellites, but space acquisition challenges remain (Report No. GAO-11-590T). Washington, DC: U.S. Government Printing Office. Government Accountability Office. (2012). Urgent warfighter needs: Opportunities exist to expedite development and fielding of joint capabilities (Report No. GAO- 12-385). Washington, DC: U.S. Government Printing Office. Haghnevis, M. & Askin, R. (2012). A modeling framework for engineered complex adaptive systems. IEEE Systems Journal, 6(3), 520-530. Jain, R., Chandrasekaran, A., Elias, G., & Cloutier, R. (2008). Exploring the impact of systems architecture and systems requirements on systems integration complexity. IEEE Systems Journal, 2(2), 209–223. Jain, R., Chandrasekaran, A., & Erol, O. (2010). A systems integration framework for process analysis and improvement. Systems Engineering, 13(3), 274–289. Kruchten, P., Capilla, R., & Duenas, J. (2009). The decision view’s role in software architecture practice. IEEE Software, 26(2), 36–42.

529 Kujawski, E. (2013). Analysis and critique of the system readiness level. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 43(4), 979–987. Lin, C., Lu, S., Fei, X., Chebotko, A., Pai, D., Lai, Z., … Hua, J. (2009). A reference architecture for scientific workflow management systems and the view SOA solution. IEEE Transactions on Services Computing, 2(1), 79–92. London, M. A., Holzer, T. H., Eveleigh, T. J., & Sarkani, S. (2014). Incidence matrix approach for calculating readiness levels. Journal of Systems Science and Systems Engineering, 23(4), 377–403. Lung, C., Xu, X., & Zaman, M. (2007). Software architecture decomposition using attributes. International Journal of and Knowledge Engineering, 17(5), 599–613. Madni, A., & Sievers, M. (2014). System of systems integration: Key considerations and challenges. Systems Engineering, 17(3), 330–347. Magnaye, R., Sauser, B., & Ramirez-Marquez, J. E. (2010). System development planning using readiness levels in a cost of development minimization model. Systems Engineering, 13(4), 311–323. Mankins, J. (2009). Technology readiness and risk assessments: A new approach. Acta Astronautica, 65(9–10), 1208–1215. Martin, J. (2008). Using architecture modeling to assess the societal benefits of the global earth observation system-of-systems. IEEE Systems Journal, 2(3), 304–311. McConkie, E., Mazzuchi, T., Sarkani, S., & Marchette, D. (2013). Mathematical properties of system readiness levels. Systems Engineering, 16(4), 391–400. Piaszcyk, C. (2011). Model Based Systems Engineering with Department of Defense architectural framework. Systems Engineering, 14(3), 305–326. Pittera, T., & Derrico, M. (2011). Multi-purpose modular plug and play architecture for space systems: Design, integration, and testing. Acta Astronautica, 69(7–8), 629–643. Rabelo, L., Fishwick, P., Ezzell, Z., Lacy, L., & Yousef, N. (2012). Ontology-centered integration for space operations. Journal of Simulation, 6(2), 112–124. Ramirez-Marquez, J., & Sauser, B. (2009). System development planning via system maturity optimization. IEEE Transactions of Engineering Management, 56(3), 533–548. Rezaei, R., Chiew, T., & Lee, S. (2013). A review of interoperability assessment models. Journal of Zhejiang University: Science, 14(9), 663–681. Ross, S. (2016). Application of system and integration readiness levels to Department of Defense research and development. Defense Acquisition Research Journal, 23(3), 248–273. Sauser, B., Gove, R., Forbes, E., & Ramirez-Marquez, J. E. (2010). Integration maturity metrics: Development of an integration readiness level. Information Knowledge Systems Management, 9(1), 17–46. Suh, E. S., Chiriac, N., & Hölttä-Otto, K. (2015). Seeing complex system through different lenses: Impact of decomposition perspective on system architecture analysis. Systems Engineering, 18(3), 229–240. Tan, W., Ramirez-Marquez, J., & Sauser, B. (2011). A probabilistic approach to system maturity assessment. Systems Engineering, 14(3), 279–293. Tien, J. (2008). On integration and adaptation in complex service systems. Journal of Systems Science and Systems Engineering, 17(4), 385–415. Tyree, J., & Akerman, A. (2005). Architecture decisions: Demystifying architecture. IEEE Software, 22(2), 19–27.

530 Vernay, A., & Boons, F. (2015). Assessing systems integration: A conceptual framework and a method. Systems Research and Behavioral Science, 32(1), 106–123. Zalewski, A., & Kijas, S. (2013). From principles to details: Integrated framework for architecture modeling of large scale software systems. E-Informatica Software Engineering Journal, 7(1), 45–52. Zhang, Y., Liu, X., Wang, Z., & Chen, L. (2012). A service-oriented method for system- of-systems requirements analysis and architecture design. Journal of Software, 7(2), 358–365.

531 Author Biographies

Maj Clarence Eder, USAF (Ret.), currently works as a senior principal systems engineer at the Air Force Space and Missile Systems Center (SMC), Military Satellite Communication Systems Directorate. Maj Eder holds a PhD in Systems Engineering from The George Washington University, and has over 21 years of experience in DoD space, intelligence, missile defense, and aircraft programs as a program manager, systems engineer, and space operator. In addition to his PhD, he holds a BS in Mechanical Engineering from the University of Hawaii, and an MBA from Wright State University.

(E-mail address: [email protected])

Dr. Thomas A. Mazzuchi is a professor of Engineering Management and Systems Engineering in the School of Engineering and Applied Science at the The George Washington University. He received a BA in Mathematics from Gettysburg College; and an MS and DSc, both in Operations Research, from The George Washington University. He has also served as chair of the Department of Operations Research, chair of the Department of Engineering Management and Systems Engineering, and as interim dean of the School of Engineering and Applied Science.

(E-mail address: [email protected])

532 Dr. Shahram Sarkani is a professor of Engineering Management and Systems Engineering (EMSE), and faculty advisor and academic director of EMSE Off-Campus Programs at The George Washington University (GWU). Dr. Sarkani joined the GWU faculty in 1986, where his previous administrative appoint- ments included chair of the Civil, Mechanical, and Environmental Engineering Department and interim associate dean for Research, School of Engineering and Applied Science. Sarkani holds a PhD in Civil Engineering from Rice University, and BS and MS degrees in Civil Engineering from Louisiana State University.

(E-mail address: [email protected])

533 EFFECTIVENESS TEST and EVALUATION of Non-lethal Weapons in Crowd Scenarios: METRICS, MEASURES, AND DESIGN OF EXPERIMENTS

Elizabeth Mezzacappa, Gordon Cooke, Robert M. DeMarco, Gladstone V. Reid, Kevin Tevis, Charles Sheridan, Kenneth R. Short, Nasir Jaffery, and John B. Riedener

This article discusses test and evaluation methods for benchmarking and comparison of non-lethal weapons (NLW, including munitions and devices) intended for use in crowd management situations. Several types of weapons of different modalities were tested, including a fielded acoustic non-lethal device, simulated dismounted infantry directed energy weapon, a simulated long distance directed energy weapon, and a projectile weapon. The work demonstrates that NLW effectiveness of any modality can be quantified into standard metrics and statistically analyzed. Relative effectiveness can then be compared among weapons of different technologies, platforms, and energies. The resulting information can be used for Analysis of Alternatives and data-driven trade-space studies. This testing can be applied to a variety of operational scenarios and should be conducted on current NLW inventory to establish benchmark performance and effectiveness. Once benchmark performance is established, new NLWs can be evaluated to determine if improvements are significant enough to warrant investment.

DOI: https://doi.org/10.22594/dau.16-768.24.03 Keywords: Non-lethal Weapons (NLW), Effectiveness, Test and Evaluation, Human Subjects  Image designed by Michael Krukowski Effectiveness Test and Evaluation of Non-lethal Weapons in Crowd Scenarios http://www.dau.mil

A soldier relates this story about an experience with crowds:

It was about 2006/2007 time frame. The people of northern Iraq were heading south in the pilgrimage. I don’t remember the exact date. The bad guys were planting IEDs [Improvised Explosive Devices] all over the place, killing the pilgrims by the hundreds, but they kept coming. We found an IED and started to cordon it off. We had to block the pilgrims’ path long enough for EOD [Explosive Ordnance Disposal] to blow the bomb … but the people wanted to take their chances with the bomb. We had to push them back and that made them mad. It was no less than 3,000 people at that time. It was hard, and I was worried they would attack …. They just hated the idea that we blocked the path. We asked the crowd to bring up the Imam or oldest in the crowd and they brought someone forward. We told him through the “terp” [interpreter] what was going on and he didn’t even care—he wanted to keep marching. When the EOD blew the IED, the crowd got very angry and they started chanting and approaching the cones and concertina wire. I thought for sure we were going to be run over. It was only two teams of us at that time … It was a feeling of the will to die for their beliefs. I saw all kinds of faces—young, old, men, women. All had the same look. Let us go on—we want to make it south … They were capa- ble of overrunning us easy. We could have fought for a few seconds, but it was way too many of them.

This soldier’s report of an interaction with crowds (Mezzacappa et al., 2011) illustrates a few of the factors and doc- trinal guidance for this type of mission (Headquarters, Department of the Army [HQDA], 2003; HQDA, 2005, 2006). In the passage, readers can see differences in cultural behaviors, religious motiva- tions, the importance of conversations between the interpreter (“terp”) and the Imam, and an appreciation of the group dynamics and emotions involved with interaction with crowds. Complex crowd interactions are not a novel situation for the military. However, novel weaponry,

536 Defense ARJ, July 2017, Vol. 24 No. 1 : 534-573 July 2017

that is, non-lethal weapons (NLW), have been proposed to increase a sol- dier’s options in these crowd missions, and other current and near future operations (Burgei, Foley, & McKim, 2015, pp. 30–34; National Research Council, 2003; Tafolla, Trachtenberg, & Aho, 2012). Recent Department of Defense (DoD) directives and instructions attest to the importance of this class of weapons (DoD, 2012, 2013).

NLW have been used successfully in-theater, for example, as detailed in Gudmundsson’s (n.d.) “Kosovo Incident Case Study: Use of Non-lethal Weapons.” In April of 2000, members of the NATO-led multinational task force providing security to inhabitants of Kosovo were trapped by a crowd of 50–75 Serbs. The crowd was protesting the arrest of a Serbian man. Militants within the crowd had incited the crowd to violence, and attempts by the former leaders to de-escalate the crowd had failed. Those in the back of the crowd who were on higher ground threw rocks and bottles, while those in the front pushed forward brandishing sticks. In addition, rioters came from behind and attacked from the rear. Sponge grenades were then used against the crowd. The NLW had the intended effect of causing some to fall down, some to run away, and to break up the organized behavior of the crowd. Ultimately, U.S. forces were able to escape the crowd without any soldiers killed and without killing any civilians. This incident clearly demonstrated the operational utility of NLW.

For many reasons, though, acquisition processes for all NLW have been considered to be problematic (Government Accountability Office [GAO], 2009; LeVine & Montgomery, 2002). Principally, the core issue can be identified as a lack of methods to test and evaluate the effectiveness of NLW against intended human targets (GAO, 2009; Mezzacappa, 2014; North Atlantic Treaty Organisation, 2009). A search of articles in the Defense Technical Information Center, ProQuest, and Google Scholar did not reveal any reports of test and evaluation of NLW against crowds (aside from those authored by the Target Behavioral Response Laboratory), and very little empirical work on NLW effectiveness testing in general.

The present experiment was funded by the Joint Non-lethal Weapons Program (JNLWP), the U.S. DoD authority that has oversight of NLW activities (DoD, 2013). The objective of the study was to identify behav- ioral response metrics to be used in determining NLW effectiveness. The results identify metrics, measures, and a design of experiments and analytic methods for assessing NLW effectiveness within a crowd-control force paradigm. Given the current push (DoD, 2015) for an “institutionaliza- tion of scientific test design and statistical rigor” (Warner, 2013, p. iii) in the

Defense ARJ, July 2017, Vol. 24 No. 1 : 534–573 537 test and evaluation community, we propose that the experimental results present a sound quantitative method of evaluating effectiveness of any NLW in crowd-control force interactions. First, a brief introduction to crowd research is presented. Next, the special processes for NLW acquisition will be detailed. After these two overviews, a general framework and method for effectiveness testing for NLW against crowds are presented. The method can be described as an infusion of academic social psychological methods into the developmental and operational test and evaluation procedures.

Background Crowd Research One of the foremost current crowd researchers today has described crowds as the “elephant man” of social science—“something strange, some- thing pathological, something monstrous. At the same time, they are viewed with awe and fascination” (Reicher, 2001, p. 182). Perhaps because of the difficulty of conducting research on crowds and perhaps the entrenched and fear-inducing notions of “mob mentality,” there is little direct research, especially experiments, done on crowds. What little there exists is summa- rized in Reicher’s chapter and in other sources (Kenny, et al., 2001; Miller, 2000; Reicher, 2001; Silver, 2002).

Early information-gathering attempts by the NLW community included convening of a special panel of crowd subject matter experts (SMEs). The Human Effects Advisory Panel (Kenny et al., 2001) assembled leading researchers and practitioners in crowd control science to identify what is known and not known about crowd behavior. The panel presented evidence that many of the common notions about crowd protests and riots have been found to be inaccurate. Following an extensive section on a history of riots, the report lists stereotypes that are commonly held about crowds, but have been debunked by scientists. In summary, the findings are that the mem- bers of crowds come with different motivations and seldom act in unison, that they come in groups, and that they are not anonymous to one another. Moreover, crowds are not particularly distinguished by violence and disor- der, they do not have emotional reactions that are out of the ordinary, and the members do not lose their minds, nor is there concrete evidence that violence is precipitated by a “spark” or “flashpoint.” These ideas have been well researched in the literature, particularly in the work of Clark McPhail (McPhail, 1991; McPhail & Tucker, 1990; Schweingruber & McPhail, 1999).

538 Crowds and Non-lethal Weapons More recently, Mezzacappa and colleagues conducted in-depth reviews of the literature relevant to crowd-control force interaction with non-le- thal weapons (Mezzacappa, 2009; Mezzacappa et al., 2011). As part of the review of the literature, they examined results from an online question- naire on soldiers’ accounts of crowd encounters and any NLW use in those encounters. The review and responses reveal the complex and frightening dynamics involved in crowd-control force encounters. The following points were highlighted for the field operations: (a) crowd control is more up close and personal; conventional training does not prepare one for verbal and physical abuse; (b) it is important to understand crowd motivations—they help you to control the situation; and (c) it is critical to communicate with the crowds, leaders, and members to get the information you need and to influence the behavior of the crowd. Moreover, reports from Somalia, Haiti, Brčko, Kosovo, and Iraq show that military operations with crowds vary greatly in terms of crowd motivation and U.S. mission (Mezzacappa, 2009).

Crowd control is considered to be one of the core capabilities for NLW (Bedard, 2002) and has been a driving factor in NLW initiatives (Albertson, Murphy, Jackson, & Jones, 2000; Kenny et al., 2001; Shuttleworth, 1998). Therefore, it is incumbent upon the acquisition community to develop methods of evaluating the effectiveness of NLW, specifically against crowds. Given the human complexity of crowds, proposed methods must be able to incorporate behavioral science research designs to obtain valid measures of NLW performance. All NLW though, must undergo atypical acquisition processes as discussed in the next section.

The Acquisition Process of Non-lethal Weapons The Joint Non-lethal Weapons Directorate (JNLWD) has created an acquisition process unique to NLW. The DoD realized early on that acqui- sition of this class of weapons required inclusion of expertise not normally involved in the acquisition process. Medical personnel, in particular, are needed to assist in evaluations of risk of significant injury in order to sup- port the characterization of non-lethality (LeVine & Montgomery, 2002). To address the requirements of non-lethality, DoDI 3200.19 (DoD, 2012) requires that non-lethal acquisition programs undergo review by the Human Effects Review Board (HERB) to assess the adequacy of the information on human effects, the identified potential risks, and mitigation strategies to mitigate the risks. In addition to this review, which focuses on the “non-le- thal” safety aspect, an additional scientific assessment (Burgei, Foley, & McKim, 2015) reviews the human effects characterization, modeling, and engineering efforts during the development phase of NLW.

539 Another critical component of the NLW acquisition is the DoD Non-lethal Weapons Human Effects Team, which consists of the Human Effects Center of Excellence (HECOE) and the JNLWD Human Effects Office (Simonds, 2014). The DoD NLW Human Effects Team’s primary mission is to oversee NLW Human Effects Characterization—a formal process mandated by DoDI 3200.19 (DoD, 2012) for fully describing the compendium of physiological and behavioral effects knowledge associated with a given NLW. It estab- lishes the baseline human effects understanding of NLWs, identifies risk and data gaps in human effects knowledge, and facilitates presentation and communication of its human effects (JNLWD, 2016). The team provides sup- port to science and technology, research and development (R&D), assists in transition and integration, and provides combat and material development support (Simonds, 2014).

It is important to note that the JNLWD, HERB, and HECOE serve in a con- sultative role for the Services and are a clearinghouse for information about NLW, but they do no test and evaluation. The testing that HECOE conducts is primarily in human effects in support of modeling and simulation, or M&S (JNLWD, 2016; Simonds, 2014). As such, the efforts by HECOE are

540 in support of development of NLW in that their efforts are not configured to be able to answer the particular quantitative performance questions demanded by the current acquisition process (Director, Operational Test and Evaluation [DOT&E], 2015; Warner, 2013). The JNLWD Human Effects Office is tasked with ensuring the Services have the human effects support that they need for the NLW acquisition process, but the Services are respon- sible themselves for conducting proper acquisition process procedures and subsequent decisions.

In response to this situation, this article was written to provide guidance for the acquisition workforce involved with NLW. Since 2004 and with JNLWD support, the Tactical Behavior Research Laboratory (TBRL) has conducted developmental research, and test and evaluation on a range of NLW energies, including blunt impact (Short, 2006; Short, Reid, Cooke, & Minor, 2010), light and laser (Cooke, 2007; Cooke, Mezzacappa, Yagrich, & Riedener, 2010), and acoustic stimuli (Riedener, 2007; Riedener et al., 2007a, 2007b ). The approach taken by the TBRL is based on DoDD 3000.03E guid- ance (DoD, 2013) that NLW “Deter, discourage, delay, or prevent hostile and threatening actions” (p. 2). Because the word “behaviors” can be substituted for the word “actions,” one can readily surmise that the primary science involved in NLW effectiveness research and testing is behavioral science.

Test and Evaluation of NLW The importance of behavioral science. Over the centuries behav- ioral science has invented methods for the metrics, measures, and analyses of scores of human behaviors, such as deterrence, discouragement, delay, and prevention.

The challenge for crowd NLW testers is to infuse experimental social psy- chological methods into developmental and operational test and evaluation methods. In essence, the methods of the crowd-control force paradigm for NLW test and evaluation stretch behavioral science methodologies to configure developmental and operational test and evaluation methods to accommodate the human factors related to crowd testing.

The diminished importance of replication. One point of tension between the two approaches (behavioral science versus test and evalua- tion) is the idea of replication; that is, that test conditions will ideally match conditions in-theater (DOT&E, 2015). In the case of test and evaluation for crowds, clinging to this notion may lead to dismissal of test and eval- uation of NLW and crowds as impossibly difficult. The test matrix of the bewildering array of human factors and varieties of crowd-control force scenarios would be endless. Not only would the costs render this type of

541 testing prohibitive, especially in the age of DoD funding austerity, but this insistence on replication limits the ability of the results to validly generalize to another possible control force—crowd conditions.

From the perspective of behavioral science, the key for testers of crowd NLW is to induce essential components of the crowd-control force sce- nario. Well-designed NLW experiments are studies that recreate critical social psychological forces that cause a person to engage or disengage with a control force (Cooke et al., 2010; Mezzacappa, Cooke, & Yagrich, 2008; Mezzacappa, 2014). Essentially, the crowd control force scenario consists of a plurality of people in conflict with a control force, where the control force wishes to control in some manner the movements or actions of crowd members—to deny, deter, suppress, etc. If a study can evoke this, essential dynamic results can be generalized to many other different variants of the crowd-control force encounters.

Similarities between testing of lethal and non-lethal weapons. Along these lines, it may be instructive to compare the testing of non-lethal and lethal weapons. Many variables can affect the performance of a typical lethal weapon, from operator skill and state, temperature, humidity, wind to name a few. Nevertheless, the primary tests of lethal weapon effective- ness occur in the controlled laboratory, with little concern for replication of operational realism. Limited lethal weapon testing against a real target is only used as a final demonstration to confirm the earlier controlled test- ing (except for anti-personnel weapons, which are never tested against an actual human target). In these testing conditions, lethal effectiveness can be assessed by measuring the effect that the weapon has on a standard target. For example, tank round effectiveness at penetrating armor is assessed by measuring depth of penetration into rolled steel. Small arms effectiveness at wounding soft tissue is assessed by measuring cavity volume and depth of penetration into a gel block calibrated to replicate soft tissue. In a similar manner, an NLW’s performance is assessed in the laboratory by its impact on its target, that is, how well it denies, deters, and suppresses target actions.

For lethal weapons, these laboratory results are combined with other con- trolled tests (such as round dispersion) to derive higher level metrics such as Probability of Kill (PK). PK is based on past scientific findings about how the direct measures interact with other measures of weapon characteristics. PK also is based on how the standard targets used for testing correlate to the variety of targets found on the battlefield.

542 It should be noted that lethal weapons have an advantage of over a century worth of scientific study compared to NLW. Scientific formulae, such as Gávre and deMarre’s, for penetration of ammunition into armor were already being discussed at the end of the 19th century (Maw & Dredge, 1891). The U.S. Navy was studying penetration at the turn of the century through the work of then Navy Lieutenant Cleland Davis. These findings were being taught in ordnance textbooks at the beginning of the last century (Alger, 1915; Hayes, 1938; U.S. Naval Institute, 1910). Advances in the study of lethal weapons effectiveness were incorporated into design manuals throughout the middle of the 20th century and were well established in textbooks at the end of that century (Headquarters [HQ], U.S. Army Materiel Command, 1962; HQ, U.S. Army Materiel Command, 1963; U.S. Army Ordnance Corps, 1957; U.S. Military Academy, 1999). The science of NLW effectiveness is nascent by comparison. A search of the literature reveals that there were a few years of interest around the 1950s and again in the 1970s. This interest was revived at the start of the 21st century, but overall, our understanding of NLW is over a century behind the lethal counterparts.

Examination of the well-accepted methods of lethal weapons effectiveness shows that fidelity to operational conditions is not a priority. Comparative test and evaluation of NLW may also adopt a similar approach using a similar rationale. While a myr- iad of variables are involved, we suggest that the critical metrics of NLW performance are best assessed within the confines of a test- ing facility, with experimental controls over as many variables as is pos- sible. This is not to say that in crowd scenarios the cultural, hierarchical, religious, group dynamic charac- teristics do not matter in determining crowd behavior. The point is that milestone decisions for acquisition can be based on results from laboratory test and evaluation without the emphasis on high-fi- delity crowd-control force replications. Fidelity

543 to operational conditions is not a priority in comparative test and eval- uation of lethal weapons; the same may be said for comparative test and evaluation of NLW.

Solely focused on performance. It is also critical to recognize that the behavior of the crowd is of interest only as far as it relates to the performance of the weapon. Researchers must, however, understand enough about human behavioral science to conduct test and evaluation so that results lead to valid conclusions about weapon performance (e.g., controlling extraneous variables, randomization, blinding, counterbalanc- ing, etc.) (Leedy & Ormrod, 2016). Within the acquisition process, these performance metrics are compared among candidate devices, against a benchmark, or against requirements. The task at hand is to generate a procedure that can be standardized and replicated. The goal is to derive a laboratory procedure so that, all things being equal, the device that per- forms best in the laboratory, under controlled experimental conditions, can be predicted to perform best in-theater.

Collaborative programs of test and evaluation and research. Behavioral science approaches make the NLW crowd-testing effort more tractable, but not perfect. As will be seen later in this article, the method presented can provide quantitative information upon which to make sound milestone decisions. In addition, this information can be aggregated with other information from demonstrations, subject matter expertise, or first- hand accounts of experiences to support milestone decisions on NLW. A particularly productive collaboration may be one between the laboratory scientists and M&S researchers of crowd behavior and NLW (Mezzacappa, 2014; Reid et al., 2014). Because of the great interest DoD has in developing NLW, the acquisition community needs to start to address feasible yet fruitful approaches to test and evaluation of NLW in crowd-control force operations.

The Present Study: A Methodology for Crowd NLW Effectiveness Evaluation A design of experiments. The current study, which was funded by the JNLWP, examines the approach methods by which effectiveness can be evaluated. Like effectiveness test and evaluation of lethal weapons, an understanding of the key metrics of effectiveness and metrics of perfor- mance as well as statistical means of analysis must be developed to provide rational support for acquisition decisions (DOT&E, 2015; Warner, 2013).

544 The current study demonstrates the construction of a research design that yields quantitative metrics and statistical design of experiment analyses (Montgomery, 2012). These are the required characteristics for research and testing for NLW (JNLWD, 2016; Office of Naval Research, 2014).

Cross modality test and evaluation. Earlier NLW effective- ness testing efforts typically investigated one modality or type of NLW (e.g., light, sound, blunt impact). The metrics developed in these single modality studies were then used to develop approaches that could be used to evaluate and compare effectiveness among different types of weapons (Mezzacappa, 2014; Mezzacappa et al., 2012). That is, these methods can now be used to benchmark and compare NLW of different energies (e.g., light, sound, blunt impact, etc.). In essence, these methods can be used to generate standardized laboratory metrics that are necessary for creation of a framework for evaluating effectiveness of NLW (North Atlantic Treaty Organisation, 2004, 2009).

The present study was undertaken to explore testing methodologies that can be used in comparing effectiveness of NLW of different modalities. As such, the endeavor must be understood as a methodological study, that is, the work seeks to establish an approach—a methodological framework—in which to evaluate NLW. The results speak to the ways in which NLW can be evaluated. Readers should not construe the results of this experiment as representative of the performance of any simulated NLW. Instead, it is our hope that readers are able to see that the methods presented here are appli- cable to evaluation of NLW throughout the developmental and operational test phases of the acquisition process.

Crowd-control force paradigm. The present testing occurred within a given NLW scenario—a control force facing a crowd. The acquisition-rel- evant questions addressed were:

• What are appropriate NLW effectiveness metrics?

• Can these metrics be analyzed statistically?

• Can these metrics and analyses be used to benchmark and compare effectiveness among weapons of different modalities?

545 Method Scenario Configuration All procedures were approved by the local Human Research Ethics Board (ARDEC IRB #10-0002, “Effectiveness Testing for Crowd Management with Non-Lethal Weapons”). The general paradigm was established by earlier work (Cooke et al., 2010; Mezzacappa, Cooke, & Yagrich, 2008). The testing situation was configured based on NLW count- er-personnel tasks taken from the Joint Non-lethal Effects Capabilities Based Assessment. Seven small crowds participated in a scenario facing non-lethal and simulated NLW. The tasks that were chosen for examination from those capabilities identified by the JNLWD (2016) were: “Deny access into/out of an area to individuals (open, single/few/many)” and “Deny access into/out of an area to individuals (confined, single/few/many).”

Research participants (“subjects”) were incentivized with monetary rewards to approach a target; however, approach was punished by NLW fire (i.e., they were hit by NLW). Each directed-energy and projectile-weapon condition was tested four times (four trials); the acoustic hailer and the no-weapon conditions were tested twice (two trials). Exposure to the acous- tic hailer was limited to protect the subjects against potential permanent auditory damage.

Overview of Scenario Subjects targeted an area protected by a single control-force defender at a fixed location at the center of the protected area (Figures 1 and 2). Subjects earned money for scoring points during the test by successfully approaching the protected area and throwing “rocks” into a target. The control force pro- tected the area using either foam projectiles, directed energy, or an acoustic hailer (only one weapon type per trial). Type of weapon was not revealed to subjects before the trial. Subjects could lose money by being hit by the control force during the test. Subjects were paid $20/hour for participa- tion (Cooke et al., 2010; Mezzacappa, Cooke, & Yagrich, 2008). During the experiment, a computer recorded each subject’s location, orientation, and locomotion through the test bed through motion-capture and video cameras.

546 FIGURE 1. CROWD BEHAVIOR TEST BED CONFIGURATION

Target

6

4

2

Origin 0 Y Axis (meters) Y Axis -2 Notional start line used

-4 Actual start/quit Line

-4 -2 0 2 4 X Axis (meters)

Note. Subjects started in the area marked “Actual Start/Quit Line” and traveled to the target area during the trial. On projectile and dismounted energy weapon trials, the control force stood midline at the target. The long-range energy and acoustic weapons were located on the test bed.

547 FIGURE 2. CONTROL FORCE SCENARIO

Note. Top: Subjects moving toward the target from the starting safe zone area. Bottom: Subjects moving back toward the safe zone after challenging the control force (middle of target) at the end of the trial.

548 Participants Participants were recruited from the general population to participate in an investigation on “Crowd Movement.” Fifty-two men and women par- ticipated in one of 7 experiment days. Subjects were healthy local residents or Picatinny Arsenal employees over the age of 18. To safeguard partici- pants, subjects had to pass a hearing test, be free of reactive airway disease or sensitivity to loud sounds or hearing damage, and have no history of adverse effects to brief, loud noises.

Weapon Conditions During counterbalanced experimental trials, the crowd faced one of four possible types of weapons. Subjects were instructed that they would receive weapons fire if they crossed over the line between the safe zone and the targeting area. Each weapon condition was run four consecutive times.

Acoustic hailer. An acoustic hailer is a loud sound played over a cur- rently fielded loudspeaker; an American Technology Corporation Medium Range Acoustic Device (MRAD), was used against the crowd on acoustic trials. The “warble” tone that is preprogrammed into the MRAD loud- speaker was used. Following on from previous experimentation (Riedener, 2007; Short et al., 2009), sound levels were set to deliver decibel levels of 115dB(A) in the field of exposure. For protection, hard barriers of the testbed prevented participants from approaching the MRAD any closer than 50 feet, thus ensuring safety in the exposure to 115dB(A).

The U.S. Army Public Health Center has established time limits on expo- sures for several levels of noise (U.S. Army Center for Health Promotion and Preventive Medicine, 1999). These standards limited exposure to the 115dB(A) level (about as loud as heavy construction equipment) in this experiment to a duration of no more than 28.14 seconds. Therefore, the MRAD was under computer control for the onset and offset of sounds so that the trial total exposure time was 28.14 seconds. In order to have at least two full trials of an acoustic hailer, all trials were set to be 14 seconds (28.14 sec/2 trial = ~14 per trial). The MRAD was located so that the greatest acoustic stimulation occurs at the target that the subjects were approaching. The computer control turned the sound on when subjects crossed over the line into the zone for targeting during encounters, and turned off when all subjects had left the targeting zone.

Projectile weapon. A commercially available foam projectile gun was used by the control force. The foam projectiles are tipped with Velcro mate- rial that sticks to fabric so that impacts on subjects can be recorded. The sponge or foam ends were coated with chalk powder that marked contact

549 with subjects. To facilitate visualization of the chalk powder, subjects were given white tee shirts to wear. For every hit, the subjects lost $2.50, and they could be hit multiple times per trial.

Long-range directed energy weapon. No control force person- nel were visible to the crowd under this weapon condition—simulating a directed energy weapon that is located about 1 kilometer away. A com- puter-controlled feedback system was developed to simulate the primary perceived characteristics of a long-range directed energy weapon, that is, an impact with severe consequences with no detectable source. During these trials, subjects were targeted by computer when they entered into the desig- nated fire zone. Firing was computer-controlled and automatically initiated when a subject’s location indicated he or she was in the designated area for fire. Subjects wore a commercially available alert system that emitted blinking lights, sound, and vibrations when triggered by radio signal. When subjects were hit by the weapon, they did not experience physical damage; however, $10 was taken away for every hit. Subjects could be hit only one time during each encounter.

Dismounted infantry/soldier-carried directed energy weapon. In this weapon condition, the characteristics of the weapon fire were iden- tical to those in the long-range version of the weapon, except that subjects faced a single, gun-wielding control-force person. The crowd could therefore observe the control-force person’s target selections. Simulation of a directed energy fire impact event was subsequently executed by computer com- mand. A computer-controlled feedback system was developed to simulate the primary perceived characteristics of a soldier-carried directed energy weapon. That is, in this weapon condition, crowd members could experience a weapon impact with severe consequences without a visible projectile.

No-weapon baseline condition. Subjects first targeted the protected area without a control-force defender present, and no weapon was fired. The conditions were otherwise identical to those with weapons. Because previ- ous work showed that there was little variation in response to no-weapon baseline conditions, this condition was run two consecutive times to con- serve time and monetary resources.

Dependent Variables Crowd measures: centroid and leading edge. Effectiveness mea- sures were assessed using the location and movement of the crowd as a whole, not the specific individuals that make up the crowd. Following on from previously published work (Cooke et al., 2010), the movement and

550 location of the “Leading Edge” (front of the crowd) and the “Centroid” (the geometric middle of the crowd) were derived from location and time data (Figure 1).

Effectiveness metrics. The concept of effectiveness in the crowd-con- trol force scenario was explored with several simple questions (Table 1). These effectiveness questions were then translated into mathematical terms using the individual and aggregate crowd measures.

TABLE 1. EFFECTIVENESS QUESTIONS: CROWD-CONTROL FORCE SCENARIO 1. Does the non-lethal weapon Ultimately, in a crowd-deny access task, the best make the crowd stay away in outcome is to have the crowd stay away, that is, to a safe area? (% Suppressed) suppress approach. This direct metric is based on the percentage of the crowd members that did not cross over the line separating the safe zone and the area where they would be targeted by non-lethal weapons. Motion capture data were used to identify subjects who did not leave the safe zone during the trial. For each crowd and for each trial, the count of these subjects was divided by the total number of participants in the crowd to arrive at the percentage of subjects whose approach behavior was suppressed. This metric is perhaps the most straightforward of all the effectiveness metrics proposed. 2. Does the non-lethal weapon If subjects did cross out of the safety zone, the time make a person think long at which each subject crossed the line could be and hard about facing the determined. For each crowd and for each trial, the weapon? (Hesitancy) time until crossing was averaged over all the subjects who crossed the line. Delayed approach (hesitancy) may be reasonably associated with cognitive decision making regarding cost/benefit ratio. No delay may be reasonably associated with high motivation to confront the control force. Hesitancy or latency was calculated for each individual who chose to approach. This delay was calculated as the time from start of trial until the time when the individual crossed the line to leave the safety zone to approach the protected area. 3. When a crowd is on its way For each crowd and for each trial, the movement to a protected area, by how of the centroid of the crowd was derived. For these much does the non-lethal analyses, the approach speed of the centroid toward weapon slow down the the target was used. crowd? (Approach Speed) 4. For how long does a crowd For each crowd and for each trial, the movement stay within the non-lethal of the centroid of the crowd was derived. For these weapon’s range? (Time analyses, the amount of time the centroid is located in Under Fire) the targeted area was used. 5. How close does the crowd For each crowd and each trial, the movement of the get to the non-lethal leading edge was derived. For these analyses, the weapon? (Closest Distance) closest distance from the protected area was used. Because the leading edge is the part of the crowd most likely to interact with the control force, it is the most appropriate aggregate crowd metric for these analyses.

551 All NLW performance was compared to a no-weapon baseline where approach was not opposed. That is, if the crowd approaches just as quickly and as closely to a protected area when a weapon is there as opposed to when there is no weapon, then the weapons can be judged as ineffective.

Data Analysis Analyses were carried out to address the questions:

• Is the NLW better than nothing?

• Is any single NLW more effective than the other weapons tested?

Data were analyzed using multivariate repeated measures linear regression methods (Montgomery, 2012). First, omnibus regressions compared the no-weapon baseline condition with each of the NLW on all five effectiveness measures (% Suppression, Hesitancy, Approach Speed, Closest Approach, Time Under Fire) with the experimental design Condition (No-Weapon vs. Weapon) x Trial (1, 2, 3, 4). These analyses were run separately for the acoustic, dismounted directed energy, long-range directed energy, and projectile weapon. The no-weapon condition was run twice. Therefore, to enable comparisons against the weapons with four trials, data from Trials 1 and 2 were entered again for Trials 3 and 4. The baseline data were miss- ing from the first crowd run; the means from the other six crowd runs were used as substitutions.

Second, those NLW that were found to be significantly different from base- line were run again in another similar omnibus regression comparing performance among the weapons. Regression analyses were run on change scores derived from the difference in effectiveness metrics from baseline and under the weapon conditions, with the design Weapon (Dismounted Directed Energy, Long-Range Directed Energy, Projectile) x Trial (1, 2, 3, 4).

Results Graphics/Images – Overview Figures 3–7 show each NLW’s performance on each of the effectiveness metrics. The figures are discussed with respect to the questions that eval- uate an NLW’s effectiveness (Table 2).

552 FIGURE 3. MEAN PERCENTAGE OF CROWD SUPPRESSED

0.9 d

e 0.8 s s e r

p 0.7 p u S

0.6 d w

o 0.5 r C

f

o 0.4

e g

a 0.3 t n e c

r 0.2 e P 0.1

0 1 2 3 4 Trial

None Acoustic Long-Range Directed Energy Dismounted Directed Energy Projectile

Note. Mean and standard error of percentage of crowd members who choose not to venture into the line of fire, that is, approach behavior is suppressed. The higher the percentage, the more suppression the weapon exerts. Note that the data for the acoustic hailer completely overlap the data for the no-weapon condition.

553 FIGURE 4. MEAN DELAY TO APPROACH/HESITANCY 12

10

8

6

4

2 Latency/Hestiancy, seconds 0 1 2 3 4 -2 Trial

None Acoustic Long-Range Directed Energy Dismounted Directed Energy Projectile

Note. Mean and standard error of time until people decide to start approaching protected area over trials for each weapon. The higher the point, the longer the person takes until starting approach. Note that the data for the acoustic hailer completely overlap the data for the no-weapon condition.

FIGURE 5. MEAN SPEED OF APPROACH

2.5

2

1.5

1 Speed, meters per sec 0.5

0 1 2 3 4 Trial

None Acoustic Long-Range Directed Energy Dismounted Directed Energy Projectile

Note. Mean and standard error of speed of crowd approach on each of the trials for each of the weapons. The higher the point is, the faster the approach toward the protected area.

554 FIGURE 6. MEAN CLOSEST DISTANCE ON APPROACH

10

9 s r e

t 8 e m

, 7 a e r A

6 d e t

c 5 e t o

r 4 P

o t

3 e c n

a 2 t s i

D 1

0 1 2 3 4 Trial None Acoustic Long-Range Directed Energy Dismounted Directed Energy Projectile

Note. Mean and standard error for closest distance of crowd to protected area on each trial for each weapon. The higher the point, the farther away the crowd stayed from the protected area. Note that the data for the acoustic hailer almost completely overlap the data for the no-weapon condition.

FIGURE 7. MEAN TIME IN LINE OF FIRE

12 s

d 10 n o c e s 8 , e r fi

f

o 6

e n i l

n 4 i

e m i

T 2 0 1 2 3 4 Trial

None Acoustic Long-Range Directed Energy Dismounted Directed Energy Projectile

Note. Mean and standard error for time crowd spends in the line of fire for each weapon on each trial. The higher the point, the longer the crowd stays within range of the weapon.

555 TABLE 2. EFFECTIVENESS QUESTIONS: NON-LETHAL WEAPONS (NLW)

1. Does the NLW convince the Figure 3 shows the average percentage of subjects crowd to stay away in a safe in a crowd who chose not to approach the protected area? area on each trial for each weapon. The acoustic hailer condition is identical to the no-weapon condition; no crowds were suppressed to any degree in either of these conditions. The most suppression was recorded under the long-range directed energy condition (60%) while the dismounted directed energy and projectile conditions recorded less suppression. 2. Does the NLW make a Figure 4 shows the average amount of time that person think long and hard passes for each NLW before a subject starts toward before facing the weapon? the protected area for each of the trials. Again, the delay associated with the acoustic hailer is again identical to that of the no-weapon condition; that is, there is no delay in starting toward the protected area. Subjects showed the greatest hesitation in starting toward the target during the long-range directed energy condition, with the dismounted directed energy and projectile weapons showing less of a hesitation by the subjects in the crowd. 3. When a crowd is on its way Figure 5 shows the average speed at which the crowd to a protected area, by how moves toward the protected area for each of the trials much does an NLW slow and each of the weapons, calculated as the speed down the crowd? of centroid of mass of crowd. The graph shows that the crowd moves fastest when there is no weapon protecting the area. The crowd moves slowest when faced with the projectile weapon and the control force, at least when first challenged by the weapon. 4. How close does the crowd Figure 6 shows on average how close crowds got get to the NLW? to the protected area for each trial for each NLW, calculated for the leading edge of the crowd. Crowds reached closest to the protected area when there was no weapon protecting it and when the area was protected with an acoustic hailer. Crowds stayed farthest away when faced with the long-range directed energy weapon. 5. For how long does a crowd Figure 7 shows the average amount of time the crowd stay within an NLW’s range? stays within range of each of the NLW on each trial, calculated as time the centroid of mass of the crowd stayed within the range of the weapon. Crowds spent the longest time within firing range during the no-weapon and acoustic hailer condition. The least amount of time spent in firing range was when the crowd was faced with the long-range directed energy weapon.

Statistical Analyses: Is the NLW Better Than Nothing? Acoustic hailer vs. no weapon. Univariate tests showed no statisti- cally significant differences between effectiveness measures under acoustic vs. no-weapon conditions (Table 3).

556 Dismounted directed energy weapon vs. no weapon. Univariate tests showed statistically significant differences between the effectiveness measures of Approach Speed and Closest Approach under dismounted directed energy weapon vs. no-weapon conditions. When faced with the control force wielding this directed energy weapon, the crowd approached more slowly and stayed farther away compared to when there was no weapon present (Table 3).

Long-range directed energy weapon vs. no weapon. Univariate tests showed statistically significant differences on all effectiveness -mea sures. When faced with this directed energy weapon, more subjects were suppressed, subjects hesitated more, approached more slowly, stayed farther away, and spent less time within firing range compared to when there was no weapon present (Table 3).

Projectile weapon vs. no weapon. Univariate tests showed statis- tically significant differences on all effectiveness measures except Time Under Fire where there was a statistical trend toward significance. When faced with this projectile weapon, more subjects were suppressed, subjects hesitated more, approached more slowly, stayed farther away, and spent slightly less time within firing range compared with when there was no weapon present (Table 3).

Statistical Analyses: Is any single NLW more effective than the other weapons tested? The acoustic hailer failed to show a difference from the no-weapon con- dition; therefore, this weapon was left out of these analyses. Change scores from no-weapon baseline were calculated for each weapon and each trial for each crowd. These change scores were entered into an omnibus multivariate repeated measures regression analysis. Effectiveness measures were com- pared simultaneously among the dismounted directed energy, long-range directed energy, and projectile weapons.

Multivariate-level analyses indicated a significant overall difference in the effectiveness measures among the weapons. Univariate analyses indi- cated that the NLW differed significantly in % Suppression and Hesitancy (Table 4). Within-subject contrasts indicated that compared with the projec- tile weapon condition, the long-range directed energy weapon was associated with greater % Suppression and greater Hesitancy (both F(1,6)=6.14, p<.05). Within-subject contrasts also indicated that compared with the dismounted directed energy weapon condition, the long-range directed energy weapon was associated with slightly greater % Suppression and slightly greater Hesitancy (both F(1,6)=4.58, p=.08).

557 TABLE 3. DIFFERENCES IN EFFECTIVENESS METRICS COMPARED TO NO WEAPON Dismounted Long-Range Metric of Acoustic Directed Energy Directed Energy Projectile Effectiveness Hailer Weapon Weapon Weapon Approach F(1,6)=19.31, p<.01 F(1,6)=7.52, p<.05 F(1,6)=47.79, Speed * p<.0001 Closest F(1,6)=7.26, p<.05 F(1,6)=26.98, F(1,6)=27.04, Approach * p<.005 p<.005

% Suppression F(1,6)=15.45, p<.01 F(1,6)=7.02, * * p<.05 Hesitancy F(1,6)=11.85, p<.05 F(1,6)=6.59, * * p<.05 Time Under F(1,6)=48.91, F(1,6)=5.65, Fire * * p<.0001 p=.055) * Denotes that there was no statistically significant difference in this metric of effectiveness between this NLW device and no weapon in the test bed.

TABLE 4. MULTIVARIATE-LEVEL ANALYSES OF METRICS OF EFFECTIVENESS Omnibus Multivariate-Level Analysis F(10,18)=3.08, p<.05 Univariate-Level Analyses % Suppression F(2,12)=3.97, p<.05 Hesitancy F(2,12)=3.97, p<.05

Learning Effects Across all the statistical analyses, there were significant effects of Trial. These findings indicate that as the crowds learned about the vulnerabili- ties of the control force and the control-force weapons, they adjusted their countermeasures to overcome penalties. In some cases, the countermeasure was complete avoidance of the control force, that is, behavioral suppres- sion. As a result, in these cases effectiveness of NLW increased with trials. From Figures 3–7, one can see that the learning effect is especially pro- nounced for the simulated long-range directed energy weapon. As the trials progressed, this weapon showed increasing effectiveness in deterring approach toward the protected area. This finding has implications for tactics, techniques, and procedures for using similar existing and future directed energy systems, such as the Active Denial System (LeVine, 2009).

Discussion Summary of Statistical Results This test demonstrated that common human performance metrics could be used to compare across weapon systems of varying modalities. The results indicate that the acoustic hailer was ineffective, and overall

558 the long-range dismounted directed energy weapon was the most effective weapon. When faced with the long-range directed energy weapon, the sub- jects were less likely to attempt to approach the protected area and took longer to decide to approach than when faced with the dismounted directed energy weapon or the projectile weapon. Crowds modified their tactics in dealing with the control force as they learned what countermeasures were effective against receiving penalties; therefore, in some cases the effectiveness of the weapons increased as the crowd better understood the capabilities of the weapons.

A Test and Evaluation Methodology To reiterate, the purpose of this experiment was to demonstrate a methodology for NLW effectiveness testing in a crowd scenario. The results from analyses of the simulated weapons can in no way be generalized to these classes of weapons, or any other weapons. The sole exception is the results for the acoustic hailer. Indeed, the results presented here replicate the results from previous reports, which examined a number of different acoustic stimuli at different wavelengths (Riedener, 2007; Riedener et al., 2007a, 2007b; Short et al., 2009; VanMeenen, 2006). This design and anal- ysis of experiments offers several benefits for the acquisition community for the development, and test and evaluation of NLW.

Benefits of this Method Quantitative analysis of alternatives. As the Results section showed, statistical analyses can provide the basis for quantitative evalu- ations of whether or not an NLW performs better than no weapon at all, and can identify superior performance among a set of devices. In addition, as illustrated in Figures 3–7, a design of experiments approach allows for quantitative Analyses of Alternatives (AoA) that includes effectiveness parameters. For example, Figure 3 indicates that by Trial 4, an average of 60 percent of the crowd was suppressed by the simulated long-range directed energy weapon, compared to an average 5 percent of the crowd that was suppressed by the projectile weapon. Analysts can then use these metrics to examine trade spaces involving cost and effectiveness, and other considerations. Simply put, the decision becomes whether or not the average 55 percent greater suppression of the crowd is worth the costs and effort of acquiring the directed energy weapon. Similarly, as can be seen in Figure 7, in the first trial the acoustic weapon manages to keep the crowd to an average of 0.5 meters from the protected area, whereas the crowd is held to an average of 7 meters beyond that. AoAs can include assessments of the cost of the additional average of 7 meters of stand-off performance improvements afforded by the projectile.

559 Quantitative benchmarking and requirements generation. The ability to measure quantitative measures of NLW performance cre- ates the ability of the acquisition community to benchmark performance against which candidate devices can be evaluated. For example, the 7.5 meter stand-off afforded by the projectile may set the target stand-off distance for the next generation of NLW to be double or triple this bench- mark. Numerical evaluation allows the measurement of improvement. Moreover, quantitative methodologies can put into perspective the cur- rent requirements as identified and set by the JNLWD (Office of Naval Research, 2014). One can use the present methodology to both assess the current capabilities and better understand the challenges of meeting the requirements set by decision makers.

Developmental engineering and specifications relevant to NLW effectiveness. One of the purposes of the present experiment was to demonstrate a method of test and evaluation that allows for comparisons of weapons of different modalities. Beyond comparison of the effectiveness of entire systems, comparisons of the effectiveness of specific components can be undertaken with the methods discussed herein. Finer resolution weapon characteristics (e.g., platforms, fire control, soldier interfaces) can also be tested for their impact on NLW effectiveness. Requirements gen- eration and benchmarking then can also be set for critical factors such as accuracy, reload time, slew rate, or impact area—all of which are known to affect the effectiveness of weapons in general.

Research on the underlying psychological mechanisms of NLW effectiveness.Without question, crowd behavior, targeted or not by NLW, necessarily involves more variables that can be manipulated, controlled, eliminated, recorded, or observed in any study, no matter how large or comprehensive. The proposed methodology provides a robust paradigm in which psychosocial and cultural variables can be explored for their effects on NLW effectiveness. Within these research designs, psychosocial variables can be varied with subsequent effectiveness measures compared among the different conditions. Thus, this is a paradigm that can be used to explore how different NLW of different modalities may fare against crowds of differing characteristics.

Perceived Limitations on NLW Effectiveness Testing To our knowledge, this is the first effort to produce a procedure of gen- erating empirical performance metrics capable of meeting the DoD test and evaluation definition of quantitative performance of NLW against crowds. Its shortcomings are recognized. The prime shortcoming is that there may

560 be limitations on generalization of results, the issue that recurs in every controlled experiment done in a laboratory. Restrictions arising from safety, logistical, financial, and ethical concerns necessarily constrain the ideal scientific testing procedures. However, we propose that some commonly perceived limitations are found to be, under closer scrutiny, far less debil- itating for the acquisition process. These perceived limitations are briefly discussed here because they may be perceived to apply to all laboratory-con- trolled NLW effectiveness testing.

Motivation and NLW effectiveness. A more particular criticism can be directed toward motivation levels. Logically and ethically, it is impossible to create in the laboratory the varied and extremely high levels of motivation that are assumed to be found in natural crowds encountered in-theater. In the laboratory, we control degree of motivation through monetary rewards and penalties. In real life, motivation is controlled by unknown forces, be they political, religious, or mundane. Crowds may be annoyed at, hostile to, or indifferent toward control forces. The criticism is then how can findings from the relatively placid New Jersey population extrapolate to other populations in other quite different cultures?

In examining this criticism, we again must restrict the discussion to issues relevant to DoD acquisition test and evaluation. The core question: “How does a target’s motivation relate to a weapon’s performance—the ability to deny, disable, move, or suppress his, her, or their behavior?” Very simply, it is expected that the more motivated a target is, the lesser is the ability of the NLW to deny, disable, move, or suppress that target.

561 Note that in this relationship, the kind of motivation (e.g., religious, finan- cial, appetitive) appears to matter less than degree of motivation (e.g., high, moderate, low).

If then, degree of motivation is the construct of interest, one may be able to see how extrapolation from the laboratory to theater can be valid. From the NLW operator’s point of view, denying, disabling, moving, or suppressing a highly motivated religious pilgrim may not be very different than denying, disabling, moving, or suppressing a highly motivated greedy capitalist. Manipulating degree of motivation is certainly more feasible than manipulating type of motivation. Within the bounds of research ethics and fundings, it may be possible to induce high levels of motivation within the laboratory.

It is recognized, however, that it is unlikely the laboratory can induce the highest levels of motivation that compel in-theater civilians. Even so, the findings of laboratory testing are valuable for NLW as a screening method. That is, if an NLW fails to show effectiveness against neutral crowds from New Jersey suburbs, it will most certainly fail against an in-theater hostile crowd. This proposition has been dubbed “The NJ Screening Test for Non- lethal Weapons” (Mezzacappa, 2014).

Institutional review boards forbid live-fire NLW testing. Readers might argue that NLW cannot be tested and evaluated through live-fire. They might protest that because of the ethical issues of targeting people with real NLW, Institutional Review Boards (IRB) and Human Research Protection Offices will refuse to review and approve NLW protocols. While it is beyond the scope of this article to fully discuss the IRB issues relevant to NLW, a short response to that statement is that the statement is not true. Experiments using NLW fires against intended human targets can be proposed to, and approved by, IRBs if the risks of the experiment to the subjects are outweighed by the scientific benefits from the study. This approval can be possible only if: (a) a complete set of safety data is available for the NLW, (b) risks to targets are fully described, (c) risks are properly mitigated within the research procedures, (d) all risks are fully and clearly described to potential subjects, and (e) subjects’ consent to participate are freely given. Resistance on the part of IRBs has not been the real obstacle to NLW effectiveness research. Rather, we have found in our decade-long experience, the difficulty in running live-fire NLW testing is the inability to receive from vendors comprehensive safety data. Without complete risk data, investigators cannot: (a) present to potential subjects a complete description of the risks they will be accepting with exposure to

562 the NLW, and (b) properly mitigate the risks. It is this lack of safety data, not a lack of safety, that renders NLW experimentation unacceptable to IRBs. When IRBs can ensure that all relevant information can be presented to a subject for consideration, and that risks are adequately mitigated, then IRBs can approve live-fire testing with NLW.

Conclusions Again, the results here should not be construed as evaluation of the specific weapons tested. The results should be used as guidelines for test and evaluation of all NLW. This experiment demonstrates derivation of quantitative metrics and statistical effectiveness analyses of NLW per- formance. Using similar methods, relative effectiveness can be compared among weapons of different technologies, platforms, and energies. Similar testing paradigms, adapted to the particular weapon and concept of opera- tion, will support the acquisition of NLW from concept through deployment. The resulting information can be used for AoA and data-driven trade studies.

As operational scenarios and desired results become part of the NLW’s testable requirement, test beds can be designed to capture the effectiveness metrics that represent effectiveness for that weapon within that specific sce- nario or task. Once a non-lethal system is proven effective in the laboratory setting, then correlations to actual operational performance and effective- ness can be made from field reports. As the science of NLW effectiveness marches forward, there will be an expected evolution of improved test bed designs and metrics used to quantify the effectiveness of the NLW tested.

Therefore, the primary initial recommendation is to gather benchmark data on NLW effectiveness in the manner presented in this article. These methods enable the DoD acquisition community to gather benchmark data on NLW effectiveness, then to build upon and improve an item. Using test data will not only ensure that improvements are being made, but that the improvements are significant enough to warrant acquisition of a new item, thus giving the warfighter enhanced capability.

Acknowledgements This experiment was funded by the Joint Non-lethal Weapons Program (JNLWP).

563 References Albertson, M., Murphy, R., Jackson, A., & Jones, M. (2000, April). Civil disturbances: Incorporating non-lethal technology, tactics, techniques and procedures (CALL Newsletter No. 00-7). Fort Leavenworth, KS: Center for Army Lessons Learned. Alger, P. R. (1915). The groundwork of practical naval gunnery. Annapolis, MD: U.S. Naval Institute. Bedard, E. (2002). Nonlethal capabilities: Realizing the opportunities. Defense Horizons, 9, 1–6. Burgei, W. A., Foley, S. E., & McKim, S. M. (2015). Developing non-lethal weapons. Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Cooke, G. M. (2007). Effectiveness of light on targeting (Report No. TBRL-MCSE- TR-07001/ARQES-TR-08001/AD-E403 156). Picatinny Arsenal, NJ: Target Behavioral Response Laboratory. Cooke, G., Mezzacappa, E., Sheridan, C., DeMarco, R., Tevis, K., Reid, G., … Riedener, J. (2010, December). Topology and individual location of crowds as a measure of effectiveness for non-lethal weapons. In Proceedings of the 27th Army Science Conference, Orlando, FL. Cooke, G., Mezzacappa, E., Yagrich, K., & Riedener, J. (2010, November). Effects of lasers on driving. In Directed Energies Professional Society, Proceedings of Thirteenth Annual Directed Energy Symposium, Bethesda, MD. Department of Defense. (2012). Non-Lethal Weapons (NLW) human effects characterization (DoDI 3200.19). Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Department of Defense. (2013). DoD executive agent for non-lethal weapons (NLW), and NLW policy (DoDD 3000.03E). Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics. Department of Defense. (2015). DOT&E Test and Evaluation Master Plan (TEMP) guidebook. Washington, DC: Office of the Director, Operational Test and Evaluation. Government Accountability Office. (2009). Defense management: DoD needs to improve , policy, and testing to enhance ability to field operationally useful non-lethal weapons (Report No. GAO-09-344). Washington, DC: U.S. Government Printing Office. Gudmundsson, B. (2002). Kosovo incident case study: Use of non-lethal weapons [White paper]. Quantico, VA: U.S. Marine Corps, Joint Non-lethal Weapons Directorate. Hayes, T. J. (1938). Elements of Ordnance: A textbook for use of cadets of the United States Military Academy. New York: J. Wiley & Sons. Headquarters, Department of the Army. (2003). Peace Ops: Multi-service tactics, techniques, and procedures for conducting peace operations (FM 3-07.31). Langley AFB, VA: Air Land Sea Application Center. Headquarters, Department of the Army. (2005). Civil disturbance operations (FM 3-19.15). Washington, DC: Author. Headquarters, Department of the Army. (2006). Counterinsurgency (FM 3-24). Washington, DC: Author. Headquarters, U.S. Army Materiel Command. (1962). Research and development of materiel, engineering design handbook, elements of terminal ballistics (part one),

564 introduction, kill mechanisms and vulnerability (AMC Pamphlet [AMCP] 702-160). Washington, DC: Author. Headquarters, U.S. Army Materiel Command. (1963). Research and development of materiel, engineering design handbook, elements of armament engineering (part two), ballistics (AMCP 702-107). Washington, DC: Author. Joint Non-lethal Weapons Directorate. (2016). Joint non-lethal weapons program science and technology strategic plan. Quantico, VA: Joint Non-lethal Weapons Directorate. Kenny, J. M., McPhail, C., Waddington, P., Heal, S., Ijames, S., Farrer, N. D., … Odenthal, D. (2001). Crowd behavior, crowd control, and the use of non-lethal weapons. University Park, PA: Institute for Non-lethal Defense Technologies. Leedy, P., & Ormrod, J. (2016). Practical research: Planning and design (11th ed.). Hoboken, NJ: Pearson. LeVine, S. (2009). The Active Denial System: A revolutionary, non-lethal weapon for today’s battlefield. Retrieved from National Defense University website: http:// jnlwp.defense.gov/ LeVine, S., & Montgomery, N. (2002, July-August). Non-lethal weapon human effects: Establishing a process for DoD program managers. Program Manager, 31(4), 50–54. Maw, W. H., & Dredge, J. (1891). Modern French artillery, ballistic efficiency of quick firing Hotchkiss guns (No. LXII). Engineering, 51, 393–395. McPhail, C. (1991). The myth of the madding crowd. New York: A. de Gruyter. McPhail, C., & Tucker, C. (1990). Purposive collective action. American Behavioral Scientist, 34, 81–94. Mezzacappa, E. (2009). Crowd dynamics and military interaction with non-lethal weapons and systems (JNLWD11-006). Picatinny Arsenal, NJ: Target Behavioral Response Laboratory. Mezzacappa, E. (2014). Effectiveness testing of non-lethal weapons. Journal of Defense Modeling and Simulation, 11, 91–101. Mezzacappa, E., Cooke, G., Merenda, B., Jaffery, N., Galonski, L., Hedderich, E., … Riedener, J. (2011). Crowd dynamics and military interactions review updated (Report No. JNLWD12-004). Quantico, VA: Joint Non-lethal Weapons Directorate. Mezzacappa, E., Cooke, G., Reid, G., DeMarco, R., Sheridan, C., & Riedener, J. (2011, March). Mathematical capture of human crowd behavioral data for computational model building, verification and validation. In Proceedings of the 20th Annual Behavior Representation in Modeling & Simulation (BRiMS) Conference, Sundance, Utah. Mezzacappa, E., Cooke, G., Reid, G., DeMarco, R., Sheridan, C., & Riedener, J. (2012, March). Data-driven modeling of target human behavior in military operations. In Proceedings of the 21st Annual Behavior Representation in Modeling & Simulation (BRiMS) Conference, Amelia Island, FL. Mezzacappa, E., Cooke, G., & Yagrich, K. (2008, December). Network science and crowd behavior metrics. In Proceedings of the 26th Army Science Conference, Orlando, FL. Mezzacappa, E. S., Sheridan, C., DeMarco, R., Tevis, K., Reid, G., Short, K., … Riedener, J. (2012). Tactical checkpoint – hail/warn and suppress/stop. Journal of Directed Energy, 4, 255–274.

565 Miller, D. (2000). Introduction to collective behavior and collective actions. Prospect Heights, IL: Waveland Press. Montgomery, D. (2012). 15.4 repeated measures. In D. Montgomery, Design and analysis of experiments (8th ed.). Hoboken, NJ: Wiley. National Research Council. (2003). An assessment of non-lethal weapons science and technology. Washington, DC: The National Academies Press. North Atlantic Treaty Organisation/Science & Technology Organisation. (2004). Non-lethal weapons effectiveness assessment (RTO TR-085). Brussels, Belgium: Author. North Atlantic Treaty Organisation/Research & Technology Organisation. (2009). Non-lethal weapons effectiveness assessment development and verification study (RTO TR-SAS-060). Neuilly-sur-Seine Cedex, France: Author. Office of Naval Research. (2014). Joint non-lethal weapons program fiscal year 2015 non-lethal weapons technologies (Broad Area Announcement ONRBAA14-008). Arlington, VA: Author. Reicher, S. (2001). The psychology of crowd dynamics. In M. A. Hogg & R. S. Tindale, Group processes. Hoboken, NJ: Wiley Blackwell Publishers. Reid, G., Cooke, G., DeMarco, R., Weaver, C., Riedener, J., & Mezzacappa, E. (2014, April). Mathematical capture of human data for computer model building and validation. In Proceedings of 23rd Annual Conference on Behavior Representation in Modeling and Simulation (BRiMS 2014), Washington, DC. Riedener, J. (2007). Acoustic weaponization: Deterrence on target (Report No. TBRL- AW-TR-07003). Picatinny Arsenal, NJ: Target Behavioral Response Laboratory. Riedener, J., & Mezzacappa, E. (2012, March). Data-driven modeling of human behavior in military operations. In Proceedings of the 21st Annual Behavior Representation in Modeling & Simulation (BRIMS) Conference, Amelia Island, FL. Riedener, J., Short, K., Mezzacappa, E., Cooke, G., Sheridan, C., Jaffery, N., … Yagrich, K. (2007a). Acoustic weaponization: Deterrence on target low frequency (Report No. TBRL-AW-TR-07004). Picatinny Arsenal, NJ: Target Behavioral Response Laboratory. Riedener, J., Short, K., Mezzacappa, E., Sheridan, C., Jaffery, N., Jones, R., & Yagrich, K. (2007b). Acoustic weaponization: Deterrence on target, contingency (Report No. TBRL-AW-TR-07005). Picatinny Arsenal, NJ: Target Behavioral Response Laboratory. Schweingruber, D., & McPhail, C. (1999). A method for systematically observing and recording collective behavior. Sociological Methods and Research, 27(4), 451– 498. Short, K. B. (2006). Blunt impact as deterrent: Human approach-avoidance behaviors and other stress responses studied within a paintball gaming context. In G. T. Shwaery, J. G. Blitch, & C. Land (Eds.), Enabling technologies and design of nonlethal weapons (Vol. 6219), Proceedings of SPIE Optics + Photonics 2016 Conference & Exhibition (pp. 62190H1–12), San Diego, CA. Short, K., Reid, G., Cooke, G., & Minor, T. R. (2010, November). Can repeated painful blunt impact deter approach toward a goal? Proceedings of the 27th Army Science Conference, Orlando, FL. Short, K., Riedener, J., Mezzacappa, E., Sheridan, C., DeMarco, R., Tevis, K., …Yagrich, K. (2009). Convoy protection against aggressive acts. Picatinny Arsenal, NJ: Target Behavioral Response Laboratory.

566 Shuttleworth, A. (1998, December). Crowd control and the human effects of non- lethal weapons. Paper presented at Jane’s Non-Lethal Weapons Conference, London, UK. Silver, M. (2002). Tactics, training, and procedures for the warfighter reacting to crowd dynamics (Report No. AFRL/HE/BR/TR-2002-0149). Brooks AFB, TX: Air Force Research Laboratory/Human Effectiveness Directorate. Simonds, J. (2014). Non-lethal weapons human effects. Brooks AFB, TX: Human Effects Center of Excellence. Tafolla, T. J., Trachtenberg, D. J., & Aho, J. A. (2012). From niche to necessity: Integrating nonlethal weapons into essential enabling capabilities. Joint Forces Quarterly, 66, 3rd Qtr, 71–79. U.S. Army Center for Health Promotion and Preventive Medicine. (1999). Dosimetry and risk assessment (TG 181). Aberdeen Proving Ground, MD: Author. U.S. Army Ordnance Corps. (1957). Ordnance engineering design handbook, artillery ammunition series (section 2), design for terminal effects (ORDP 20-245). Washington, DC: Author. U.S. Military Academy. (1999). Mechanical design [ME402 class text]. West Point, NY: Department of Mechanical Engineering. U.S. Naval Institute. (1910). Ordnance and gunnery: A text-book prepared for the use of the midshipmen of the United States Naval Academy. Annapolis, MD: Author. VanMeenen, K. S. (2006). Suppression: Sound and light interference with targeting In G. T. Shwaery (Ed.), Enabling technologies and design of nonlethal weapons (Vol. 6219), Proceedings of SPIE Optics + Photonics 2016 Conference & Exhibition (pp. 62190J1-11), San Diego, CA. Warner, C. (2013). Report on the test science roadmap. Washington, DC: Office of the Director, Operational Test and Evaluation.

567 Author Biographies

Dr. Elizabeth Mezzacappa is currently a sci- entist with the U.S. Army Armament Research, Development and Engineering Center Tactical Behavior Research Laboratory (ARDEC TBRL); she also serves as Adjunct faculty with the Army’s Armament Graduate School. Her research inter- ests include human dimensions of military operations. Dr. Mezzacappa holds a BA in Biology and Psychology from the University of Pennsylvania; and a PhD in Medical Psychology from the Uniformed Services University of the Health Sciences.

(E-mail address: [email protected])

Dr. Gordon Cooke is currently the laboratory chief of the ARDEC TBRL where he has served as a research engineer and principal investigator. His research interests include the performance of humans in military environment scenarios. He is a graduate of the U.S. Military Academy at West Point and former combat engineer officer. Dr. Cooke holds a PhD in Biomedical Engineering from Stevens Institute of Technology.

(E-mail address: [email protected])

568 Mr. Robert M. DeMarco is currently a measure- ment and automation specialist who serves as a lead software developer for the ARDEC TBRL. He is a Certified LabVIEW Developer, with over 15 years of experience programming. He also serves as an adjunct professor in the New Jersey Institute of Technology, Biomedical Engineering Department, where he teaches virtual instru- mentation. He holds a BS in Computer Engineering and an MS in Biomedical Engineering from the New Jersey Institute of Technology.

(E-mail address: [email protected])

Mr. Gladstone V. Reid is currently a biomedical instrumentation engineer supporting human behavioral research at the ARDEC TBRL. He recently served as laboratory chief of the ARDEC TBRL and has developed over 30 instrumented Test Beds over his career at the New Jersey Institute of Technology, the Department of Veterans Affairs, Rutgers University, and ARDEC TBRL. He holds a BS in Electrical Engineering and an MS in Biomedical Engineering from the New Jersey Institute of Technology.

(E-mail address: [email protected])

569 Mr. Kevin Tevis is currently the laboratory chief of the Radiographic Laboratory, but has served in many diverse roles/positions over his 10-year career at ARDEC. From 2007 to 2014, Mr. Tevis performed both lead engineering efforts and principal investigator duties at the ARDEC TBRL. Mr. Tevis holds a master’s degree in Mechanical Engineering from Stevens Institute of Technology and a Master’s degree in Business Administration from Florida Institute of Technology.

(E-mail address: [email protected])

Mr. Charles Sheridan is currently a research assistant at the ARDEC TBRL, serving as the link between the research staff and the human research participants. He has been responsible for the recruitment, scheduling, and administration of the informed consent process, subject protection, and general interaction with over a thousand subjects for the laboratory. He holds a BA in English from Fairleigh Dickinson University.

(E-mail address: [email protected])

570 Dr. Kenneth R. Short is currently the Provost, U.S. Army Armament Graduate School and a scientist at the ARDEC TBRL. His research interests include behavioral and neurochemical consequences of stress, especially in traumatic or military contexts, and relation to subsequent anxiety. Dr. Short holds a BA in Psychology from Swarthmore College; and an MA in Experimental Psychology and PhD in Neuroscience from the University of Colorado.

(E-mail address: [email protected])

Mr. Nasir Jaffery is currently a quality engineer for the ARDEC TBRL, serving as experiment test lead, assisting in behavioral data collection and analysis, and participating in institutional review board approval processes. He has contributed to several armament programs and International Organization for Standardization 9000 audits. He holds a BS in electrical engineering from the State University of New York at Buffalo and an MBA from the Florida Institute of Technology.

(E-mail address: [email protected])

571 Mr. John B. Riedener is currently the FAST- 7ATC Science & Technology Advisor to the ARDEC Atlantic 7th Army Training Command in Grafenwoehr, Germany. In 2016, he was labo- ratory chief for the ARDEC TBRL, with oversight of dozens of human subjects research projects. He holds a BA in Computer and Information Science from the New Jersey Institute of Technology and an MS in Systems Engineering from the Stevens Institute of Technology.

(E-mail address: [email protected])

572 573 A Publication of the Defense Acquisition University http://www.dau.mil

The Defense Acquisition We encourage our readers to submit Professional Reading List is intended book reviews they believe should to enrich the knowledge and under- be required reading for the defense standing of the civilian, military, acquisition professional. The books contractor, and industrial workforce themselves should be in print or gen- who participate in the entire defense erally available to a wide audience; acquisition enterprise. These book address subjects and themes that recommendations are designed have broad applicability to defense to complement the education and acquisition professionals; and pro- training vital to developing essen- vide context for the reader, not tial competencies and skills of the prescriptive practices. Book reviews acquisition workforce. Each issue of should be 450 words or fewer, the Defense Acquisition Research describe the book and its major Journal will include one or more ideas, and explain its relevancy to reviews of suggested books, with defense acquisition. Please send more available on our Website http:// your reviews to the managing edi- dau.mil/library. tor, Defense Acquisition Research Journal at [email protected]. Featured Book

Destructive Creation: American Business and the Winning of World War II Author: Mark R. Wilson Publisher: University of Pennsylvania Press Copyright Date: 2016 Hardcover: 392 pages ISBN: 9780812248333 Reviewed by: Dr. Benjamin Franklin Cooling, Professor of National Security Studies, The Eisenhower School, National Defense University July 2017

Review: In some circles, a popularized dogma is that the real winners of World War II were private business and industry. Just ask popularizer Art Her- man, Freedom’s Forge (2012), and now academic Mark Wilson. Wilson best invites criticism of all parties and concludes that self-serving corpo- rate memory won. The touch of anti-statism that permeates all America particularly enabled public relations to build a compelling alternate his- tory or dogma, approximating more the South’s Lost Cause interpreta- tion after the Civil War or Weimar Germany’s after World War I. Wilson has contributed previously to The Business of Civil War and knows how to ferret out details and synthesize analysis from published and unpub- lished sources. This present work tells a tale stretching from two world wars to the Cold War, with World War II as the high table for interpreting the private sector as savior of the Free World. He also suggests how the struggle between public and private sectors for heroes, villains, and “who won” has continued to shape economic and political development even today. Arguably, his most useful chapter is his last, styled “reconversion.” Here, Wilson builds into a tight discussion of privatizing resourcing in the early Cold War, thereby fulfilling and perpetuating America’s capital- ist dream. Yet it was World War II that “offers sobering lessons about the power of economic elites to shape American politics.”

Wilson’s narrative invites reflection on how “business and government were reluctant, contentious, and even bitter partners” in the war effort. In fact, Wilson shows World War II public-private partnering as a con- tinuation of battles between corporate and government, capital and la- bor, regulation and free enterprise as well as in-fighting between politi- cians, military, and private elite personalities and philosophies dating back to the Progressive Era. Wilson recounts the story from shadows of the Great War (World War I) that conditioned how the United States planned for the next one, to building the arsenal (even before Pearl Har- bor’s casus belli), through what Wilson styles “one tough customer” or the government’s exacting price constraints and tight regulation in the second conflagration while inducing product competition yet also pro- viding massive public investment to get the job done. Wilson also dips into unsavory wartime labor unrest—strikes and seizures that contradict the notion that all America put shoulders to the common weal in patri- otic unity. Wilson’s book is hard hitting, but balanced, detailed without being pedantic, and eminently stimulating.

Aside from what Wilson suggests World War II teaches us (maybe less

Defense ARJ, July 2017, Vol. 24 No. 3 : 574–577 575 A Publication of the Defense Acquisition University http://www.dau.mil

than he thinks), does his book yield anything useful for acquisition pro- fessionals today? No doubt the system of public-private partnering in World War II was made more robust by political statist philosophy, mar- ginal private weapons-production capability until underwritten by cen- tral government financing, and innovative infrastructure provisions. “Ar- senic and red tape” as one long-lost postwar account called it, or Bruce Catton’s classic War Lords of Washington (1948), reveals in more colorful prose the machinations of top officials (and those have not gone away). Moreover, the times dictated private capacity to augment, not replace, government in-house capabilities and basic innovative fiscal creation of warfighting, not merely deterrent capability. The choice of quantity over quality superimposed on the exigency of fighting a hot war was an industrial/mobilization age emergency measure (or “expedient corpo- ratism”) and reflected best, perhaps, by the penultimate public/private endeavor—the Manhattan Project. Wilson really wants us to focus on the continuum of liberal progressive-protectionist conservative friction; the battle of free enterprise versus state socialism is still with us. Only at the end does the reader realize the basic deference and dependency that befell the government’s national security/common defense respon- sibilities; by the time of Eisenhower, McNamara has reached its epitome today. Wilson’s story is not an explanation resting upon the evolving na- ture of war, technological change, or even the fundamental tautological difference between industrialized world wars and the atomic Cold War. The reader must do the work of extrapolation. Wilson’s book does stand at the threshold of better explaining the migration of the Arsenal of De- mocracy to the Military-Industrial Complex and National Security State.

576 Defense ARJ, July 2017, Vol. 24 No. 3 : 574–577

New Research in DEFENSE ACQUISITION

Academics and practitioners from around the globe have long con- sidered defense acquisition as a subject for serious scholarly research, and have published their findings not only in books, but also as Doctoral dissertations, Master’s theses, and in peer-reviewed journals. Each issue of the Defense Acquisition Research Journal brings to the attention of the defense acquisition community a selection of current research that may prove of further interest.

These selections are curated by the Defense Acquisition University (DAU) Research Center and the Knowledge Repository. We present here only the author/title, abstract (where available), and a link to the resource. Both civil- ian government and military Defense Acquisition Workforce (DAW) readers will be able to access these resources on the DAU DAW Website: https:// identity.dau.mil/EmpowerIDWebIdPForms/Login/KRsite. Nongovernment DAW readers should be able to use their local knowledge management cen- ters and libraries to download, borrow, or obtain copies. We regret that DAU cannot furnish downloads or copies.

We encourage our readers to submit suggestions for current research to be included in these notices. Please send the author/title, abstract (where avail- able), a link to the resource, and a short write-up explaining its relevance to defense acquisition to: Managing Editor, Defense Acquisition Research Journal, [email protected].

578 Defense ARJ, July 2017, Vol. 24 No. 3 : 578–584 THE ECONOMICS OF ENTERPRISE TRANSFORMATION: AN ANALYSIS OF THE DEFENSE ACQUISITION SYSTEM Michael J. Pennock

Abstract: Despite nearly 50 years of attempts at reform, the U.S. defense acqui- sition system continues to deliver weapon systems over budget, behind schedule, and with performance shortfalls. A parade of commissions, pan- els, and oversight organizations has studied and restudied the problems of government acquisition with the objective of transforming the defense acquisition enterprise, yet the resulting legislative and procedural changes have yielded little, if any, benefit. Thus, the obvious question is, “Why has acquisition reform failed?” Three potential contributors were identified in the literature: misalignment of incentives, a lack of a systems view, and a lack of objective evaluation criteria. This dissertation attempts to address each of these problem areas.

First, the author considers the issue of incentivization in the context of defense technology policy. A frequent criticism of defense acquisition pro- grams is that they tend to employ risky, immature technology that increases the cost and duration of acquisition efforts. To combat this problem, the

Defense ARJ, July 2017, Vol. 24 No. 3 : 578–584 579 A Publication of the Defense Acquisition University http://www.dau.mil

Department of Defense rewrote its acquisition regulations to encourage a more evolutionary approach to systems development. Nominally, this requires the use of mature technologies, but studies have revealed that acquisition programs continue to use immature technologies in spite of the new policies. To analyze this issue, the defense acquisition cycle was modeled as a stochastic process. Then, assuming that each acquisition program serves a diverse set of stakeholders, game theory was applied to show that the stable solution is to employ immature technology. It turns out that there is a tragedy of the commons at work in which the acquisition program serves as the common resource for each of the stakeholder groups to achieve its objectives. Since there is no cost to using the resource, there is a tendency to overexploit it. The result is an outcome that is worse than if there had been a coordinated solution. Thus, the rational actions of stake- holders will lead to a contradiction of acquisition policy. Consequently, if the Department of Defense expects adherence to its evolutionary acquisition policy, it must either strictly enforce technology maturity requirements or else realign incentives with desired outcomes. Second, the author evaluates cost and performance implications of the most recent defense acquisition transformation initiative—evolutionary acquisition. Proponents suggest that evolutionary acquisition will lower acquisition program costs, shorten delivery times, and improve the performance of fielded systems through the use of shorter and more incremental acquisition cycles. Supporting arguments focus on the impact of evolutionary acquisition on individual programs, but fail to consider the defense acquisition enterprise as a system.

To address this shortcoming, the author analyzes the impact of evolutionary policies through the use of a discrete event simulation of the entire defense acquisition system. It was found that while there should be an increase in the performance of fielded systems under evolutionary acquisition policies, the cost of operating the defense acquisition system as a whole does not inherently decrease. This is because the shorter acquisition cycles created by evolutionary polices mean that the overhead costs of each acquisition cycle are incurred more frequently. If these overhead costs do not decline sufficiently, the net cost to operate the acquisition system rises. This finding demonstrates the importance of considering the entire acquisition system before implementing a new policy. Finally, the author addresses the lack of objective evaluation criteria by developing a method to value acquisition process improvements monetarily. This is accomplished through the com- bination of price indices and options analysis.

580 Defense ARJ, July 2017, Vol. 24 No. 3 : 578–584 July 2017

Since the U.S. government is a nonprofit entity, traditional cash flow-based valuation methods are not applicable. Instead, the use of price indices cap- tures the changes in the government’s buying power induced by acquisition reforms. This may be converted into an equivalent, augmented budget stream that allows traditional investment evaluation tools to be applied. An additional advantage of the buying power method is that it captures the impact of the economies of scale inherent in the production of military systems. The augmented budget stream serves as the basis for applying options analysis, which properly accounts for the risk-mitigating effects of staging. A comparison of this new method with more traditional methods reveals that only considering cost savings can significantly undervalue acquisition improvement opportunities, and even small improvements can have large returns.

Citation: Pennock, M. J. (2008). The economics of enterprise transformation: An analysis of the defense acquisition system (Order No. 304645429). Available from ProQuest Dissertations & Theses Global. Retrieved from https://search.proquest.com/ docview/304645429?accountid=40390

ISSUES WITH ACCESS TO ACQUISITION DATA AND INFORMATION IN THE DEPARTMENT OF DEFENSE: DOING DATA RIGHT IN WEAPON SYSTEM ACQUISITION Megan McKernan, Nancy Young Moore, Kathryn Connor, Mary E. Chenoweth, Jeffrey A. Drezner, James Dryden, Clifford Grammich, Judith D. Mele, Walter T. Nelson, Rebeca Orrie, Douglas Shontz, & Anita Szafran

Abstract: Acquisition data and information are the foundation for decision mak- ing, management, and oversight of weapon system acquisition programs. They are critical to initiatives to improve defense acquisition, such as Better Buying Power. The Department of Defense as a whole gathers a wide variety of acquisition information and stores it in multiple, sometimes incompatible systems, most of which are built for reporting, not analysis. Large busi- nesses have similar problems, and the concept of master data management may have lessons for both. The authors review 21 key acquisition-related

Defense ARJ, July 2017, Vol. 24 No. 3 : 578–584 581 A Publication of the Defense Acquisition University http://www.dau.mil

data information systems and their origins and uses, and identify how acquisition data might be improved. They also summarize background on acquisition data; review commercial practices in data management; and offer findings and recommendations to further improve acquisition data quality, access, and use.

Citation: McKernan, M., Moore, N. Y., Connor, K., Chenoweth, M. E., Drezner, J. A., Dryden, J., … Szafran, A. (2017). Issues with access to acquisition data and information in the department of defense: Doing data right in weapon system acquisition. Retrieved from: https://www.rand.org/pubs/research_reports/RR1534.html

A LOW DISHONEST DECADE ..: SMART ACQUISITION AND DEFENCE PROCUREMENT INTO THE NEW MILLENNIUM John Louth

Abstract: Smart acquisition was the change programme introduced at the end of the twentieth century charged with transforming the effectiveness of defence procurement within the United Kingdom (UK). The initiative was rolled-out as a cornerstone of the Blair government’s strategic defence initiative from 1998 onwards, and today represents the management phi- losophy, public sector organisational structures, and UK industrial strategy for delivering defence equipment. This research seeks to understand the manner and extent of changes to defence procurement derived from the smart acquisition initiative, viewed as a ‘technology’ through which gov- ernment exercises power. Accordingly, understanding smart acquisition develops and deepens our knowledge of the nature of government itself. The author offers, in chapters 1 and 2 initially, an introduction to smart acquisition, its background, and historical antecedence. He then discusses the methodology employed for interrogating the phenomenon as an auto/ ethnographical study of UK defence practices. Chapter 3 details the fac- tors that drove defence reorganisation, whilst chapter 4 derives smart acquisition as rational and benign managerial change. Chapter 5 critiques this perspective by unveiling smart acquisition as a neoliberal construct through which government procures and cements assemblages of regimes of control and socialisation, legitimised through managerial narratives and

582 governmentalist forms. Consequently, a revised critical analytical model of smart acquisition embracing governmentalist notions is provided in chapter 6. Chapter 7 introduces a specific defence procurement project team and describes its transformation strategy and emerging business model. In chapter 8, the project team is superficially revealed as a rational change agent embedding and embracing management reform. Chapter 9 critiques this, presenting the team as a constructed governmentalist regime, an expression of control, socialisation, and surrender of agency. Chapter 10 concludes the research by observing that smart acquisition is a complex set of understandings, and a multiplicity of forms and discourses.

Citation: Louth, J. (2010). A low dishonest decade ..: Smart acquisition and defence procurement into the new millennium (Order No. 1780172723). Available from ProQuest Dissertations & Theses Global. Retrieved from https://search.proquest. com/docview/1780172723?accountid=40390

DELUSION AND DECEPTION IN LARGE INFRASTRUCTURE PROJECTS: TWO MODELS FOR EXPLAINING AND PREVENTING EXECUTIVE DISASTER Bent Flyvbjerg, Massimo Garbuio, & Dan Lovallo

Abstract: “Over budget, over time, over and over again” appears to be an appropri- ate slogan for large, complex infrastructure projects. This article explains why cost, benefits, and time forecasts for such projects are systematically overoptimistic in the planning phase. The underlying reasons for forecast- ing errors are grouped into three categories: delusions or honest mistakes, deceptions or strategic manipulation of information or processes, or bad luck. Delusion and deception have each been addressed in the management literature before, but here they are jointly considered for the first time. They are specifically applied to infrastructure problems in a manner that allows both academics and practitioners to understand and implement the suggested corrective procedures. The article provides a framework for analyzing the relative explanatory power of delusion and deception. It also suggests a simplified framework for analyzing the complex principal-agent relationships that are involved in the approval and construction of large infrastructure projects, which can be used to improve forecasts. Finally,

583 A Publication of the Defense Acquisition University http://www.dau.mil

the article illustrates reference class forecasting, an outside view de-bias- ing technique that has proven successful in overcoming both delusion and deception in private and public investment decisions.

Citation: Flyvbjerg, B., Garbuio, M., & Lovallo, D. (2009, Winter). Delusion and deception in large infrastructure projects: Two models for explaining and preventing executive disaster. Available from ProQuest Dissertations & Theses Global (Order No. 215864135). Retrieved from https://search.proquest.com/ docview/215864135?accountid=40390.

Defense ARJ Guidelines FOR CONTRIBUTORS The Defense Acquisition Research Journal (ARJ) is a scholarly peer- reviewed journal published by the Defense Acquisition University (DAU). All submissions receive a blind review to ensure impartial evaluation.

IN GENERAL We welcome submissions from anyone involved in the defense acqui- sition process. Defense acquisition is defined as the conceptualization, initiation, design, development, testing, contracting, production, deploy- ment, logistics support, modification, and disposal of weapons and other systems, supplies, or services needed for a nation’s defense and security, or intended for use to support military missions.

Research involves the creation of new knowledge. This generally requires using material from primary sources, including program documents, policy papers, memoranda, surveys, interviews, etc. Articles are characterized by a systematic inquiry into a subject to discover/revise facts or theories with the possibility of influencing the development of acquisition policy and/or process.

We encourage prospective writers to coauthor, adding depth to manuscripts. It is recommended that a mentor be selected who has been previously published or has expertise in the manuscript’s subject. Authors should be familiar with the style and format of previous Defense ARJs and adhere to the use of endnotes versus footnotes (refrain from using electronically embedded endnotes), formatting of reference lists, and the use of designated style guides. It is also the responsibility of the corresponding author to furnish any required government agency/employer clearances with each submission.

586 Defense ARJ, July 2017, Vol. 24 No. 3 : 586–593 MANUSCRIPTS Manuscripts should reflect research of empirically supported experi- ence in one or more of the areas of acquisition discussed above. The Defense ARJ is a scholarly research journal and as such does not publish position papers, essays or other writings not supported by research firmly based in empirical data. Empirical research findings are based on acquired knowl- edge and experience versus results founded on theory and belief. Critical characteristics of empirical research articles:

• Clearly state the question,

• define the research methodology,

• describe the research instruments (e.g., program documenta- tion, surveys, interviews),

• describe the limitations of the research (e.g., access to data, sample size),

• summarize protocols to protect human subjects (e.g., in sur- veys and interviews) , if applicable,

• ensure results are clearly described, both quantitatively and qualitatively,

• determine if results are generalizable to the defense acquisi- tion community,

587 A Publication of the Defense Acquisition University http://www.dau.mil

• determine if the study can be replicated, and

• discuss suggestions for future research (if applicable).

Research articles may be published either in print and online, or as a Web- only version. Articles that are 5,000 words or less (excluding abstracts, references, and endnotes) will be considered for print as well as Web pub- lication. Articles between 5,000 and 10,000 words will be considered for Web-only publication, with an abstract (150 words or fewer) included in the print version of the Defense ARJ. In no case should article submissions exceed 10,000 words.

Book Reviews Defense ARJ readers are encouraged to submit book reviews they believe should be required reading for the defense acquisition professional. The reviews should be 500 words or fewer describing the book and its major ideas, and explaining why it is relevant to defense acquisition. In general, book reviews should reflect specific in-depth knowledge and understanding that is uniquely applicable to the acquisition and life cycle of large complex defense systems and services.

Audience and Writing Style The readers of the Defense ARJ are primarily practitioners within the defense acquisition community. Authors should therefore strive to demonstrate, clearly and concisely, how their work affects this community. At the same time, do not take an overly scholarly approach in either content or language.

Format Please submit your manuscript with references in APA format (author- date-page number form of citation) as outlined in the Publication Manual of the American Psychological Association (6th Edition). References should include Digital Object Identifier (DOI) numbers when available. The author(s) should not use automatic reference/bibliography fields in text or references as they can be error-prone. Any fields should be converted to static text before submission. For all other style questions, please refer to the Chicago Manual of Style (16th Edition).

Contributors are encouraged to seek the advice of a reference librarian in completing citation of government documents because standard formulas of citations may provide incomplete information in reference to government works. Helpful guidance is also available in The Complete Guide to Citing Government Information Resources: A Manual for Writers and Librarians (Garner & Smith, 1993), Bethesda, MD: Congressional Information Service.

588 July 2017

Pages should be double-spaced in Microsoft Word format, Times New Roman, 12-point font size and organized in the following order: title page (titles, 12 words or fewer), abstract (150 words or fewer to conform with formatting and layout requirements of the publication), two-sentence sum- mary, list of keywords (five words or fewer that do not appear in the title of the manuscript), reference list (only include works cited in the paper), and author’s note or acknowledgments (if applicable). Manuscripts submitted as PDFs will not be accepted.

Figures or tables should not be inserted or embedded into the text, but submitted as a separate file in the original software format in which it was created. For additional information on the preparation of figures or tables, refer to the Scientific Illustration Committee, 1988,Illustrating Science: Standards for Publication, Bethesda, MD: Council of Biology Editors, Inc. Restructure briefing charts and slides to look similar to those in previous issues of the Defense ARJ.

The author (or corresponding author in cases of multiple authors) should attach a signed cover letter to the manuscript that provides all of the authors’ names, mailing and e-mail addresses, as well as telephone and fax numbers. The letter should verify that the submission is an original product of the author(s); that all the named authors materially contributed to the research and writing of the paper; that the submission has not been previously published in another journal (monographs and conference pro- ceedings serve as exceptions to this policy and are eligible for consideration for publication in the Defense ARJ); and that it is not under consideration by another journal for publication.

COPYRIGHT The Defense ARJ is a publication of the United States Government and as such is not copyrighted. Because the Defense ARJ is posted as a complete document on the DAU Website, we will not accept copyrighted manuscripts that require special posting requirements or restrictions. If we do publish your copyrighted article, we will print only the usual caveats. The work of federal employees undertaken as part of their official duties is not subject to copyright except in rare cases.

Web-only publications will be held to the same high standards and scru- tiny as articles that appear in the printed version of the journal and will be posted to the DAU Website at www.dau.mil.

589 A Publication of the Defense Acquisition University http://www.dau.mil

In citing the work of others, please be precise when following the author- date-page number format. It is the contributor’s responsibility to obtain permission from a copyright holder if the proposed use exceeds the fair use provisions of the law (see the latest edition of Circular 92: Copyright Law of the United States of America and Related Laws Contained in Title 17 of the United States Code, Washington, DC: U.S. Government Printing Office). Contributors will be required to submit a copy of the writer’s permission to the managing editor before publication.

We reserve the right to decline any article that fails to meet the following copyright requirements:

• The author cannot obtain permission to use previously copy- righted material (e.g., graphs or illustrations) in the article.

• The author will not allow DAU to post the article in our Defense ARJ issue on our Internet homepage.

• The author requires that usual copyright notices be posted with the article.

• To publish the article requires copyright payment by the DAU Press.

SUBMISSION All manuscript submissions should include the following:

• Cover letter

• Author checklist

• Biographical sketch for each author (70 words or less) • Headshot for each author saved as a 300 dpi (dots per inch) JPEG or Tiff file no less than 5x7 inches with a plain back- ground in business dress for men (shirt, tie, and jacket) and business appropriate attire for women. All active duty military should submit headshots in Class A uniforms. Please note: low-resolution images from Web, PowerPoint, or Word will not be accepted due to low image quality. • One copy of the typed manuscript, including:

°° Title (12 words or less)

590 °° Abstract of article (150 words or less)

°° Two-line summary

°° Keywords (5 words or fewer—do not include words appear- ing in the manuscript title)

°° Document double-spaced in Microsoft Word format, Times New Roman, 12-point font size (5,000 words or less for the printed edition and 10,000 words or fewer for online-only content excluding abstracts, figures, tables, and references).

°° Copyright release form

All forms are available at our website: www.dau.mil/library/arj. Submissions should be sent electronically, as appropriately labeled files, to the Defense ARJ managing editor at: [email protected].

591 Defense ARJ PRINT SCHEDULE

The Defense ARJ is published in quarterly theme editions. All submis- sions are due by the first day of the month. See print schedule below.

Author Deadline Issue

July January

November April

January July

April October

In most cases, the author will be notified that the submission has been received within 48 hours of its arrival. Following an initial review, submis- sions will be r­ eferred to peer reviewers and for subsequent consideration by the Executive Editor, Defense­ ARJ.

592 Defense ARJ, July 2017, Vol. 24 No. 3 : 592–593 January 1

Contributors may direct their questions to the Managing Editor, Defense ARJ, at the address shown below, or by calling 703-805-3801 (fax: 703-805- 2917), or via the Internet at [email protected].

The DAU Homepage can be accessed at: http://www.dau.mil

DEPARTMENT OF DEFENSE

DEFENSE ACQUISITION UNIVERSITY

ATTN: DAU PRESS (Defense ARJ)

9820 BELVOIR RD STE 3

FORT BELVOIR, VA 22060-5565

593 CALL FOR AUTHORS

We are currently soliciting articles and subject mat- ter experts for the 2017 Defense Acquisition Research Journal (ARJ) print year. Please see our guidelines for contributors for submission deadlines.

Even if your agency does not require you to publish, consider these career-enhancing possibilities: • Share your acquisition research results with the Acquisition, Technology, and Logistics (AT&L) community. • Change the way Department of Defense (DoD) does business. • Help others avoid pitfalls with lessons learned or best practices from your project or program. • Teach others with a step-by-step tutorial on a process or approach. • Share new information that your program has uncovered or discovered through the implementation of new initiatives. • Condense your graduate project into something beneficial to acquisition professionals.

ENJOY THESE BENEFITS: • Earn 25 continuous learning points for We welcome submissions from anyone in- publishing in a refereed journal. volved with or interested in the defense ac- • Earn a promotion or an award. quisition process—the conceptualization, • Become part of a focus group sharing initiation, design, testing, contracting, pro- similar interests. duction, deployment, logistics support, mod- • Become a nationally recognized expert ification, and disposal of weapons and other in your field or specialty. systems, supplies, or services (including con- • Be asked to speak at a conference struction) needed by the DoD, or intended for or symposium. use to support military missions.

If you are interested, contact the Defense ARJ managing editor ([email protected]) and provide contact information and a brief description of your article. Please visit the Defense ARJ Guidelines for Contributors at http://www.dau.mil/library/arj/p/ARJ-Guidelines.

Defense ARJ and Defense AT&L Online-only for individual subscribers

NEW Online presence for easier use on mobile and desktop devices: https://www.dau.mil/library/defense-atl https://www.dau.mil/library/arj

PLEASE SUBSCRIBE or resubscribe so you will not miss out on accessing future publications. Send an e-mail to [email protected] and/or [email protected], giving the e-mail address you want us to use to notify you when a new issue is posted.

Type “Add to LISTSERV” in the subject line.

Also use this address to notify us if you change your e-mail address SURVEY

Please rate this publication based on the following scores: 5 —Exceptional 4 — Great 3 — Good 2 — Fair 1 — Poor

Please circle the appropriate response.

1. How would you rate the overall publication? 5 4 3 2 1

2. How would you rate the design of the publication? 5 4 3 2 1

True False a) This publication is easy to read b) This publication is useful to my career c) This publication contributes to my job effectiveness d) I read most of this publication e) I recommend this publication to others in the acquisition field

If hand written, please write legibly.

3. What topics would you like to see get more coverage in future Defense ARJs?

4. What topics would you like to see get less coverage in future Defense ARJs?

5. Provide any constructive criticism to help us to improve this publication:

6. Please provide e-mail address for follow up (optional):

FREE

ONLINESUBSCRIPTION

Defense ARJ Defense AT&L Thank you for your interest in Defense Acquisition Research Journal and Defense AT&L magazine. To receive your complimentary online subscription, please write legibly if hand written and answer all questions below—incomplete forms cannot be processed.

*When registering, please do not include your rank, grade, service, or other personal identifiers.

New Online Cancellation Subscription Change E-mail Address

Date

Last Name:

First Name:

Day/Work Phone:

E-mail Address:

Signature: (Required)

PLEASE FAX TO: 703-805-2917

The Privacy Act and Freedom of Information Act In accordance with the Privacy Act and Freedom of Information Act, we will only contact you regarding your Defense ARJ and Defense AT&L subscriptions. If you provide us with your business e-mail address, you may become part of a mailing list we are required to provide to other agencies who request the lists as public information. If you prefer not to be part of these lists, please use your personal e-mail address.

SUBSCRIPTION

ver 01/03/2017

We’re on the Web at: http://www.dau.mil/library/arj

Articles represent the views of the authors and do not necessarily reflect the opinion of DAU or the Department of Defense. Current. Connected. Innovative.