American Statistical Association Undergraduate Guidelines Workgroup

Curriculum Guidelines for Undergraduate Programs in Statistical Science

ACKNOWLEDGMENTS

The American Statistical Association undergraduate guidelines working group was convened by ASA President Nathaniel Schenker in the spring of 2013. Members included Beth Chance, Steve Cohen, Scott Grimshaw, Johanna Hardin, Tim Hesterberg, Roger Hoerl, Nicholas Horton (chair), Chris Malone, Rebecca Nichols, and Deborah Nolan. We greatly appreciate the many members of the community who provided feedback on earlier drafts of these guidelines.

CONTENTS

Executive Summary
Introduction
Background and Guiding Principles
Skills Needed
Curriculum for Statistics Majors
Curriculum Topics for Minors or Concentrations
Additional Points

These guidelines were endorsed by the American Statistical Association Board of Directors on November 15, 2014. A copy of the guidelines and related resources can be found at www.amstat.org/education/curriculumguidelines.cfm.

EXECUTIVE SUMMARY

The American Statistical Association endorses the value of undergraduate programs in statistics as a reflection of the increasing importance of the discipline. We expect statistics programs to provide sufficient background in the following core skill areas: statistical methods and theory, data management, computation, mathematical foundations, and statistical practice. Statistics programs should be flexible enough to prepare bachelor’s graduates to either be functioning statisticians or go on to graduate school.

The widely cited McKinsey report states that “by 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of Big Data to make effective decisions.” A large number of those will be at the bachelor’s level. The number of bachelor’s graduates in statistics has increased by more than 140% since 2003 (21% from 2012 to 2013).

Much has changed since the previous guidelines were disseminated in 2000. The 2014 guidelines reflect changes in curriculum and suggested pedagogy. Institutions need to ensure students entering the work force or heading to graduate school have the appropriate capacity to “think with data” and to pose and answer statistical questions.

Key points

Increased importance of data science. Working with data requires extensive computing skills. To be prepared for statistics and data science careers, students need facility with professional statistical analysis software, the ability to access and wrangle data in various ways, and the ability to perform algorithmic problem solving. In addition to more traditional mathematical and statistical skills, students should be fluent in higher-level programming languages and facile with database systems.

Real applications. Data should be a major component of statistics courses. Programs should emphasize concepts and approaches for working with complex data and provide experiences in designing studies and analyzing non-textbook data.

More diverse models and approaches. Students require exposure to and practice with a variety of predictive and explanatory models in addition to methods for model-building and assessment. They must be able to understand issues of design, confounding, and bias. They need to know how to apply their knowledge of theoretical foundations to the sound analysis of data.

Ability to communicate. Students need to be able to communicate complex statistical methods in basic terms to managers and other audiences and to visualize results in an accessible manner. They must have a clear understanding of ethical standards. Programs should provide multiple opportunities to practice and refine these statistical practice skills.

These guidelines are intended to be flexible while ensuring that programs provide students with the appropriate background and necessary critical thinking and problem solving skills to thrive in our increasingly data-centric world. Programs are encouraged to be creative with their curriculum to provide a synthesis of theory, methods, computation, and applications.

A copy of the guidelines and related resources can be found at https://goo.gl/Ncjf3v (last updated November 15, 2014).

INTRODUCTION

Statistics is an increasingly important discipline, spurred by the proliferation of complex and rich data and the growing recognition of the role statistical analysis plays in making evidence-based decisions. Enrollments in statistics classes have been increasing dramatically. More students are entering college having completed a statistics class, and more students are studying statistics at the college level. Although the number of bachelor’s graduates in statistics is still relatively small in absolute terms (1,656 according to IPEDS¹), this number has increased markedly from 2003, when only 673 statistics undergraduate degrees were conferred².

There is growing demand for a variety of strong undergraduate programs in statistics to help prepare the next generation of students to make sense of the information around them. The widely cited McKinsey & Company report stated that “by 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of Big Data to make effective decisions.”³ While some of these new workers will need graduate training, much of the demand is expected to be at the bachelor’s level⁴.

The American Statistical Association (ASA) endorses the value of undergraduate programs in statistical science, both for statistics majors and students in other majors seeking a minor or concentration⁵.

Much has changed since the previous ASA guidelines, which were approved in 2000⁶. This document describes updated and expanded guidelines for curricula for undergraduate programs (majors, minors, and concentrations) in statistical science that account for important changes in the field. We lay out general goals and specific recommendations identified during our deliberations from 2013–2014.

We begin by discussing principles that informed our thinking, then consider skills students should develop in their courses, and finally summarize key curriculum topics.

Figure: Trends in statistics degrees awarded. Source: NCES IPEDS.

NOTES

1 Data are from IPEDS (Integrated Post-Secondary Education Data System Completions Survey, ncsesdata.nsf.gov/webcaspar) through 2013, where first and second majors were counted in , statistics, mathematical statistics and probability, statistics [other], and mathematics and statistics [other].

2 See, for example, http://magazine.amstat.org/blog/2014/09/01/degrees and http://magazine.amstat.org/blog/2013/05/01/stats-degrees. The University of California/Berkeley (n=143) was the largest producer in 2013, with Purdue University a close second (n=135). Other institutions with 40 or more graduates in 2013 included University of Illinois/Urbana-Champaign, University of California/Davis, University of Minnesota, University of Michigan, and University of California/Los Angeles. There may be substantial undercounting, since students completing a mathematics major with a concentration in statistics are not included in these numbers.

3 See www.tinyurl.com/mckinsey-nextfrontier.

4 See www.maa.org/programs/faculty-and-departments/ingenious.

5 We focus primarily on majors, since the development of a deep understanding of statistical science and associated computational and data-related skills requires extensive study. We also describe key points related to minor programs and similar types of concentrations or tracks through other majors.

6 See www.amstat.org/education/curriculumguidelines.cfm for the 2000 guidelines and related resources.

BACKGROUND AND GUIDING PRINCIPLES

The scientific method and its relation to the statistical problem solving cycle: Undergraduates need practice using all steps of the scientific method to tackle real research questions. All too often, undergraduate statistics majors are handed a “canned” data set and told to analyze it using the methods currently being studied. This approach may leave them unable to solve more complex problems out of context, especially those involving large, unstructured data. The statistical analysis process involves formulating good questions, considering whether available data are appropriate for addressing the problem, choosing from a set of different tools, undertaking the analyses in a reproducible manner, assessing the analytic methods, drawing appropriate conclusions, and communicating results⁷. Students need practice developing a unified approach to statistical analysis and integrating multiple methods in an iterative manner. Instructors need appropriate background in applied statistics and the statistical problem solving cycle to be able to effectively teach these courses⁸.

This scientific approach to statistical problem solving is important for all data analysts, not just undergraduate statistics majors or minors. It needs to start in the first course and be a consistent theme in all subsequent courses. Often, there is more than one appropriate way to address a research question. Students need to see that the discipline of statistics is more than a collection of unrelated tools (or methods); it is a general approach to problem solving using data. Undergraduates need to develop judgment to assess approaches and verify assumptions, including nonstatistical justifications (subject matter knowledge) for evaluating a research conclusion. Students need to be aware of possible limitations, to assess when a more complex analysis is warranted, and to decide when to reformulate the question.

Real applications: The Committee on the Undergraduate Program in Mathematics Curriculum Guide from 2004 reinforced the importance of real applications and data analysis⁹. They stated:

[T]he analysis of data provides an opportunity for students to gain experience with the interplay between abstraction and context that is critical for the mathematical sciences major to master. Experience with data analysis is particularly important for majors entering the workforce directly after graduation, for students with interests in allied disciplines, and for students preparing to teach secondary mathematics.

We concur and recommend that a focus on data be a major component of introductory and advanced statistics courses and that students work with authentic data throughout the curriculum. Institutions should ensure that modern applied statistics courses are available early in the curriculum. These courses are particularly relevant for strong mathematics students and have the potential to recruit students into statistics and other mathematical sciences programs. It is also essential that faculty developing statistics curricula and teaching courses be trained in statistics and experienced in working with data¹⁰. Instructors need additional materials that feature real applications, including curated data sets, sample syllabi, and other resources.

More generally, undergraduate statistics programs should emphasize concepts and approaches for working with complex data and provide experiences in designing studies and analyzing real data (defined as data that have been collected to solve an authentic and relevant problem) that go well beyond the content of a second course in statistical methods¹¹. The detailed statistical components of these problem solving skills may vary, but should be tightly integrated with study in statistics, data wrangling, computing, mathematics, and, ideally, a field of application¹².

Focus on problem solving: Undergraduate programs in statistics should equip students with problem solving skills they can effectively apply, build on, and extend over time. They should teach principles that will allow graduates to ask questions, assess their work, and learn new ideas as needed. Many bachelor’s graduates seek employment immediately after their degree¹³. Some flexibility in the undergraduate curriculum is needed, as the appropriate skill-set for those seeking employment immediately upon graduation may differ from those seeking admission into doctoral programs in statistics.

The increasing importance of data science: Statistics students need to make sense of the staggering amount of information collected in our increasingly data-centered world and to manage data, analyze it accurately, and communicate findings effectively¹⁴. This capacity has been elegantly described by Diane Lambert of Google as the ability to “think with data.”¹⁵ Although a formal definition of data science is elusive, we concur with the StatsNSF committee statement that data science comprises the “science of planning for, acquisition, management, analysis of, and inference from data.”¹⁶

With increasingly large data, the relative importance of statistical topics changes. Methods that find patterns and relationships in high-dimensional data become more important, as do methods to avoid bias from available data. Model assessment remains critical, while statistical significance is less central.

In previous decades, it was often sufficient for undergraduate majors in statistics who had knowledge of statistical software to successfully navigate analytic tasks assigned to them. Students now need facility with professional statistical analysis software¹⁷, the ability to access and manipulate data in various ways, and the ability to use algorithmic problem solving. They need to learn to pose relevant questions (to gain insight), use a variety of computational approaches to extract meaning from data, judge data quality, assess their models and methods, and communicate results in a comprehensible and correct fashion. With data now taking all shapes and formats, statistics majors need to be able to program in higher-level languages¹⁸ and fluently interact with database systems. The additional need to think with data, in the context of answering a statistical question, represents the most salient change since the prior guidelines were endorsed in 2000¹⁹.

Adding these data science topics to the curriculum necessitates developing data, computing, and visualization capacities that complement more traditional mathematically oriented statistical skills²⁰.

NOTES

7 The K–12 GAISE guidelines (www.amstat.org/education/gaise) define statistical problem solving as an investigative process that involves four components: (1) Formulate questions (clarify the problem at hand, then formulate one (or more) questions that can be answered with data); (2) Collect data (design a plan to collect appropriate data, then employ the plan to collect the data); (3) Analyze data (select appropriate graphical and numerical methods, then use these methods to analyze the data); and (4) Interpret results (interpret the analysis, then relate the interpretation to the original question). It should be emphasized that this process is rarely sequential. In addition, see Wild and Pfannkuch (1999) “Statistical thinking in empirical enquiry,” International Statistical Review, 67(3):223–248 and Pfannkuch and Wild (2000) “Statistical thinking and statistical practice: Themes gleaned from professional statisticians,” Statistical Science, 15(2):132–152.

8 There is a need for additional continuing professional development for instructors and revisions to the graduate curricula that will prepare future instructors. Because many faculty teaching statistics do not have a graduate degree in statistics, there is a need for creative approaches to ensure they have appropriate background (see, for example, the 2014 ASA/MAA guidelines for teaching statistics, http://magazine.amstat.org/blog/2014/04/01/asamaaguidelines).

9 See www.tinyurl.com/cupm2004 and the 2015 guidelines at www.maa.org.

10 The 2014 ASA/MAA guidelines for teaching introductory statistics (http://magazine.amstat.org/blog/2014/04/01/asamaaguidelines) make the same recommendation. Mathematical expertise is not a substitute.

11 There is not a single definition of what is appropriate as a second course in statistics, and a number of options can be found at many institutions. No matter how innovative the approach, we believe it is not possible to develop a comprehensive understanding of the range of key statistical concepts after only two courses.

12 There are many electives that might be included in a statistics major. As resources will vary among institutions, the identification of what will be offered is left to the discretion of individual institutions.

13 Data from a survey of graduates from California Polytechnic State University, San Luis Obispo (Melissa Bowler, unpublished senior project) found that 60% of bachelor’s graduates eventually completed a graduate degree, but often not until many years in the work force. A study of graduates from Eastern Kentucky University found similar results (see Kay and Costello, JSM 2014, and Costello and Kay (2002), “Where do all of the undergraduate statistics majors go?,” STATS, 34:10–13). Additional studies of graduates and their early career profiles would be valuable for the community.

14 See, for example, the National Academies report “The Mathematical Sciences in 2025,” www.nap.edu/catalog.php?record_id=15269.

15 See also the ASA report “Discovery with Data: Leveraging Statistics with Computer Science to Transform Science and Society,” www.amstat.org/policy/pdfs/BigDataStatisticsJune2014.pdf, and “Thinking with Data: How to Turn Information into Insights” by Shron (2007).

16 See www.nsf.gov/attachments/130849/public/Stodden-StatsNSF.pdf.

17 There are numerous examples of software packages that can be used for introductory statistics courses, including JMP, Minitab, R/RStudio, SAS, SPSS, and Stata.

18 We define this as a programming environment that supports abstraction from a specification to the computer (e.g., hides many aspects of the underlying computational environment), such as Python, R, or SAS.

19 See Nolan and Temple Lang (2010) “Computing in the statistics curriculum,” The American Statistician, 64(2):97–107 for an overview of key curriculum topics in data science.

20 See also the white paper by Hardin et al., “Data Science in the Statistics Curricula: Preparing Students to ‘Think with Data.’”

Creative approaches to new curricular needs: Many programs will require considerable creativity to fully integrate additional data-related and statistical practice skills into the curriculum. Relationships with allied disciplines that teach applied statistics, and with computer science, will become increasingly important. A number of data science topics need to be considered for inclusion into introductory, second, and advanced courses in statistics to ensure that students develop the ability to frame and answer statistical questions with rich supporting data early in their programs, and move towards dexterous ability to compute with data in later courses²¹. To make room, some traditional topics will need to be dropped from the core curriculum. We do not attempt to specify which topics are central, and which could be covered in electives or dropped entirely. Given that most undergraduate statistics majors enter the workforce as analysts where data-skills are primary, we suggest that helping them to master a smaller set of methods, rather than a comprehensive laundry list, is likely to be more useful to them in the long-term²².

The main goal of our recommendations is to ensure undergraduate statistics students remain useful in a world with increasingly more complex data. If we don’t prepare them to learn new techniques and work with various forms of data, it will be difficult for them to compete for jobs. We need to pay attention to the core foundations of statistical thinking and practice without shying away from increasingly important data science skills.

We recognize the hurdles and challenges at all ends of the spectrum to ensure that students are provided with modern statistical experiences. For programs that are already implementing computational courses, faculty should be encouraged to share resources, make course content available, and help train the next generation of teachers and scholars. For programs that are unable to implement an entire major program, we suggest that missing topics or skills be added to classes in the current curriculum. Additional co- and extra-curricular experiences that enhance the formal statistics curriculum should be embraced and encouraged.

Relationship with mathematics: Though the practice of statistics requires mathematics for the development of its underlying theory, statistics is distinct from mathematics and requires many nonmathematical skills. Few undergraduate statistics students need the mathematics used to derive classical statistical formulas, many of which are often superseded by computational approaches that are more accurate and may better facilitate understanding. Theoretical/mathematical and computational/simulation approaches are complementary, each helping to clarify understanding gained from the other. Students planning doctoral study in statistics need a strong background in mathematics and theoretical statistics in addition to strong computing skills²³.

Flexibility: Institutions vary greatly in the type and breadth of programs they are able to offer, but the ASA believes almost all institutions can provide a level of statistical education that is useful to both students and employers. Programs should be sufficiently flexible to accommodate varying student goals. Institutions should adapt these guidelines to meet the needs of their students, potentially with tracks within a single program²⁴. Each institution should regularly review its programs to reflect new developments in this fast-moving field.

NOTES

21 It will be increasingly important to engage with faculty involved in computing education (e.g., members of the Association for Computing Machinery Special Interest Group in Computer Science Education, SIGCSE) to learn of their experiences and approaches to teaching “computational thinking.”

22 This is by no means new advice. The 35-year-old ASA report “The Training of Statisticians for Industry” (1980, The American Statistician, 34(2):65–75) describes skills for an effective industrial statistician, particularly the role of communication taught in conjunction with technical topics. Then and now, if students develop a clear understanding of basic statistical theory that allows them to select, use, and assess a model, it is more likely they can learn to effectively use other approaches that they were not exposed to in college (or did not exist before they graduated).

23 This is also true for more theoretical master’s programs in statistics and applied mathematics.

24 See the white paper “Roadmap for Smaller Schools” by Hoerl.
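As one illustration of how theoretical and simulation-based approaches can complement each other, the following minimal sketch compares the classical standard error of a sample mean (s / sqrt(n)) with a bootstrap estimate of the same quantity. It is not part of the guidelines: the data values, function names, number of resamples, and seed are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random
import statistics


def analytic_se(sample):
    """Classical standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def bootstrap_se(sample, n_resamples=2000, seed=0):
    """Bootstrap standard error: standard deviation of resampled means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = [
        statistics.fmean(rng.choices(sample, k=n))  # resample with replacement
        for _ in range(n_resamples)
    ]
    return statistics.stdev(means)


# Hypothetical data, invented for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 3.9, 5.0, 4.4]

if __name__ == "__main__":
    print(f"analytic SE:  {analytic_se(sample):.3f}")
    print(f"bootstrap SE: {bootstrap_se(sample):.3f}")
```

The two estimates should be close but not identical. The bootstrap version tends to be slightly smaller for small samples, which is itself a useful point of discussion when the formula and the simulation are taught side by side.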

SKILLS NEEDED

Effective statisticians at any level need to master an integrated combination of skills built upon statistical theory, statistical application, data management, computation, mathematics, and communication. It cannot be assumed that beginning students fully comprehend these myriad connections, and an appropriate developmental progression is required to obtain mastery. Providing students with a strong foundation in statistical methods and theory is critically important for all undergraduate programs in statistics. These skills need to be introduced, supported, and reinforced throughout a student’s academic program, beginning with introductory courses and augmented in later classes²⁵. Such scaffolded exposure helps students connect statistical concepts and theory to practice.

We have not specified a minimum number of classes (or equivalent) expected in each area, though programs need to provide preparatory, introductory, intermediate, and advanced skill development with an integrated approach. Ideally, there should be many opportunities for topics and concepts that cut across numerous classes to be referenced and integrated in multiple places within the curriculum. Statistics programs should provide majors with sufficient background in the following areas²⁶:

Statistical methods and theory: Graduates should be able to design studies, use graphical and other means to explore data, build and assess statistical models, employ a variety of formal inference procedures (including resampling methods), and draw appropriate scope of conclusions from the analysis²⁷. They need knowledge and experience applying a variety of statistical methods, assessing their appropriateness, and communicating results. They need a foundation in theoretical statistics principles for sound analyses.

Data management and computation: Graduates should be facile with professional statistical software and other appropriate tools for data exploration, cleaning, validation, analysis, and communication. They should be able to program in a higher-level language²⁸, to think algorithmically, to use simulation-based statistical techniques, and to undertake simulation studies²⁹. Graduates should be able to manage and marshal data, including joining data from different sources and formats and restructuring data into a form suitable for analysis. Their statistical analyses should be undertaken in a well-documented and reproducible way.

Mathematical foundations: Graduates should be able to apply mathematical ideas from linear algebra and calculus to statistics, and to set up and apply probability models. Minor programs will generally require less study of mathematics. Students preparing for doctoral work in statistics should usually complete additional mathematics courses³⁰.

Statistical practice: Graduates should be expected to write clearly, speak fluently, and construct effective visual displays and compelling written summaries. They should demonstrate ability to collaborate in teams and to organize and manage projects³¹. They should be able to communicate complex statistical methods in basic terms to managers and other audiences and visualize results in an accessible manner³². Undergraduate majors in statistics often will be hired into analyst positions, where they need to be able to understand and communicate statistical findings³³.

Discipline-specific knowledge: Students should be able to apply statistical reasoning to domain-specific questions. This capacity includes translating research questions into statistical questions and communicating results appropriate to different disciplinary audiences. Because statistics is a methodological discipline, statistics programs should encourage study in a substantive area of application³⁴. Some programs might include a required second major, co-major³⁵, minor, or sequence of related courses to accompany the completion of a statistics degree.

NOTES

25 Ideally such a program would culminate with capstone and/or internship experiences.

26 We anticipate departments can use these high-level categories to define program outcomes.

27 Our enumeration of key statistical skills is intentionally short, since these are likely most familiar to the statistics community. More detail is provided regarding computation and data-related skills because they have not played as large a part in the undergraduate statistics curriculum in the past. We reiterate, however, that statistical fundamentals are at the core, with the data-related skills supporting the ability to analyze and interpret complex data.

28 This capacity includes the ability to write functions and use control flow in a variety of languages and tools such as Python, R, SAS, or Stata. Facility with spreadsheet tools such as Excel is useful for a variety of other purposes, but is not ideal as a programming or reproducible analysis environment.

29 The capacity to undertake and interpret simulation studies as a way to complement analytic understanding and/or check results will be increasingly useful in the workplace.

30 Many graduate programs strongly recommend at least a year of mathematical analysis and/or advanced calculus, while other upper-level mathematics courses such as “Stochastic Processes,” “Graph Theory,” “Differential Equations,” “Optimization,” “Combinatorics,” and “Algebraic Statistics” also may be helpful.

31 One possible model to develop project management skills can be found in the University of California/Berkeley CS169 “Software Engineering” course, which incorporates a substantive project with external customers structured with four two-week-long iterations with those clients (www.armandofox.com/2012/05/10/about-uc-berkeley-cs169-software-engineering). Another approach would be to incorporate project management into capstone experiences.

32 There is pedagogical value in having students practice communication to identify gaps in their understanding. In addition, communication skills need to dovetail with students’ technical and statistical knowledge: Excellent communication of inappropriate or incorrect analyses is counterproductive.

33 Data from a survey of graduates from California Polytechnic State University, San Luis Obispo (Melissa Bowler, unpublished senior project) was used to generate a listing of current jobs for n=62 graduates from Cal Poly San Luis Obispo’s undergraduate statistics program. She found that 12 had “statistic” in the title (e.g., Statistician, Senior Statistical Analyst, Statistical Programmer I) while 20 had “analy” in the title (e.g., Data Analyst, Marketing Data Analyst, Research Analyst, Business Systems Analyst). We suspect many of those with “Statistician” in their job title completed a higher degree. Better data on the outcomes of graduates would benefit the profession as a whole.

34 The interaction of statisticians with subject-matter professionals is a key characteristic of the discipline, as statistics is increasingly a “team sport.” This is particularly important at the planning stage of a study or project. Graduates need to translate subject-matter objectives to statistical plans and analyses that mesh with and are capable of meeting those objectives. Depth in a substantive area provides the capability to engage in this manner.

35 See, for example, the analytics co-major in the department of statistics at Miami University, www.miamioh.edu/cas/academics/departments/statistics/academics/majors/analytics-comajor.

CURRICULUM FOR STATISTICS MAJORS

Statistical Methods and Theory

Statistical thinking begins with a problem and explores data to answer key questions. Undergraduate statistics students need a deep understanding of fundamental concepts as well as exposure to a variety of topics and methods[36], including the following:

• Statistical theory (e.g., distributions of random variables, likelihood theory, point and interval estimation, hypothesis testing, decision theory, Bayesian methods, and resampling[37])
• Exploratory data analysis approaches and graphical data analysis methods[38]
• Design of studies (e.g., random assignment, random selection, data collection, and efficiency[39]) and issues of bias, causality, confounding[40], and coincidence
• Statistical models (e.g., variety of linear and nonlinear parametric, semiparametric, and nonparametric regression models[41]; model building and assessment; multivariate methods; and statistical and machine learning techniques[42])

[36] Given how quickly the discipline of statistics is changing, it is not feasible or appropriate to attempt a comprehensive overview of the entire field at the undergraduate level, and we do not attempt it.
[37] Resampling methods (including bootstrapping and permutation tests) are widely applicable to many problems at multiple levels of the curriculum. See the white paper “What Teachers Should Know About the Bootstrap: Resampling In the Undergraduate Statistics Curriculum” by Hesterberg.
[38] These topics include advanced visualization techniques, smoothing/kernel estimation, spatial methods (see manuscript by Christou, “Enhancing the Teaching of Statistics Using Spatial Data”), and mapping. We also note the value of data visualization early in the analysis process to identify errors and anomalies.
[39] Other important topics include blocking, stratification, survey sampling, and adaptive designs.
[40] Issues of confounding and causal inference are central to the discipline of statistics. There are many settings in which a randomized experiment cannot be undertaken. To avoid pitfalls of drawing conclusions from observational data, students need a clear understanding of principles of statistical design and tools to assess and account for the possible impact of other measured and unmeasured variables.
[41] These topics will generally include many of the following: simple and multiple linear regression, generalized linear models, generalized additive models, time series, mixed models, survival analysis, spatial analysis, regression trees, model selection, diagnostics, cross-validation, and regularization.
[42] We note the growing use of machine learning methods (which also arise in analytics) to make predictions about future events. See Breiman (2001), “Statistical modeling: The two cultures,” Statistical Science, 16(3):199–231; Harville (2014), “The need for more emphasis on prediction: A ‘non-denominational’ model-based approach,” The American Statistician, 68(2):71–92 and related discussion; and the white papers on a model data science course (Baumer) and “Data Science in the Statistics Curricula: Preparing Students to ‘Think with Data’” (Hardin et al.).

Data Wrangling and Computation

Undergraduate statistics majors need facility with computation to be able to handle increasingly complex data and sophisticated approaches to analyze it. Graduates need the ability to manage and restructure data. Such skills underpin strategies for assessing and ensuring data quality as part of data preparation and are a necessary precursor to many analyses[43].

• Use of one or more professional statistical software environments[44]
• Data management using software in a well-documented and reproducible way[45], data processing in different formats, and methods for addressing missing data
• Basic programming concepts (e.g., breaking a problem into modular pieces, algorithmic thinking[46], structured programming[47], debugging, and efficiency)
• Computationally intensive statistical methods (e.g., iterative methods, optimization, resampling, and simulation/Monte Carlo methods)[48]
• Use of multiple data tools[49], so graduates are not wedded to one and are better able to learn new technologies[50]

[43] See Zhu et al. (2013) “Data acquisition and pre-processing in studies on humans: What is not taught in statistics classes,” The American Statistician, 67(4):235–241, which includes a series of skills: (1) get to know the study; (2) assess the validity of variable coding; (3) assess data entry accuracy; (4) perform data cleaning; and (5) edit identified data errors.
[44] Although we acknowledge that Microsoft Excel is a common platform for data exchange, we do not recommend it as a primary analysis environment.
[45] Appropriate environments could include R, Python, and SAS, complemented by tools including shell scripts and knitr.
[46] Futschek (2006) defines algorithmic thinking as a set of abilities related to constructing and understanding algorithms: (1) the ability to analyze a given problem; (2) the ability to precisely specify a problem; (3) the ability to find the basic actions that are adequate to the given problem; (4) the ability to construct a correct algorithm to a given problem using basic actions; (5) the ability to think about all possible special and normal cases of a problem; and (6) the ability to improve the efficiency of an algorithm. Futschek, G. (2006). “Algorithmic thinking: The key for understanding computer science,” in R. Mittermeir (Ed.), Informatics Education–The Bridge Between Using and Understanding Computers (Vol. 4226, pp. 159–168). Berlin/Heidelberg: Springer. We consider this to be a necessary, but not sufficient component of “computational thinking.”
[47] We define structured programming as the ability to use functions and control structures (e.g., “for” loops).
[48] This recommendation is consistent with the efforts of Conrad Wolfram and the Computer-Based Math initiative, www.computerbasedmath.org and www.tinyurl.com/ted-wolfram. The incorporation of these tools may be particularly valuable at the bachelor’s level, since students will generally have less technical knowledge (and need to be able to simulate to generate insights and/or check analytic results).
[49] Students should develop the capacity to manipulate formats such as CSV, JSON (JavaScript Object Notation, a data-interchange format that is easy to read, parse, and generate; see Nolan and Temple Lang (2014) XML and Web Technologies for Data Sciences with R), XML, databases (see, for example, Ripley (2001) “Using databases with R,” R News, 1(1):18–20 and Wickham (2011) “ASA 2009 Data Expo,” Journal of Computational and Graphical Statistics, 20(2):281–283), and text data. Because many faculty were not trained in these technologies, continuing education in this area needs to be made a priority.
[50] We are not prescriptive regarding which technologies are incorporated into the curriculum, as long as they are sufficiently flexible and powerful. Many undergraduate statistics students develop expertise in environments such as R/RStudio, Python, and SAS.

Mathematical Foundations

The study of mathematics lays the foundation for statistical theory. Undergraduate statistics majors should have a firm understanding of why and when statistical methods work. They should be able to communicate in the language of mathematics and explain the interplay between mathematical derivations and statistical applications.

• Calculus (e.g., integration and differentiation)[51]
• Linear algebra (e.g., matrix manipulations, linear transformations, projections in Euclidean space, eigenvalues/eigenvectors, and matrix decompositions)
• Probability (e.g., properties of univariate and multivariate random variables, discrete and continuous distributions)[52]
• Emphasis on connections between concepts in these mathematical foundations courses and their applications in statistics[53]

[51] Multivariate calculus is recommended.
[52] Markov chains are a useful topic for undergraduate majors in statistics.
[53] This linkage includes topics such as the delta method. In addition, many students might benefit from exposure to modeling and simulation in their mathematics courses as a way to reinforce their computational skills.

Statistical Practice

Strong communication skills complement technical knowledge and are particularly necessary for statisticians; graduates need technical skills to perform analyses and communication skills to understand clients’ needs and then effectively discuss results and conclusions. Important practical skills include the following:

• Effective technical writing, presentation skills, and visualizations
• Teamwork and collaboration
• Ability to interact with and communicate with a variety of clients and collaborators

Undergraduate curricula must provide ample opportunities to practice the work of being a statistician. The completion of such requirements in statistics can help ensure that graduates have the necessary skills to work as practicing statisticians.

Ethical issues should be incorporated throughout a program[54]. Whenever possible, the undergraduate experience should include opportunities for internships[55], senior-level capstone courses[56], consulting experiences, research experiences, or a combination[57]. These and other ways to practice statistics in context should be included in a variety of venues in an undergraduate program.

[54] See the whitepaper “Ethics and the Undergraduate Curriculum,” by Cohen, and “Seeing Through Statistics” (Chapter 26, Ethics in Statistical Studies), Utts (2015), which includes topics such as the ethical treatment of human and animal participants, assurance of data quality, appropriate statistical analyses, and unbiased reporting of results.
[55] See the white paper “Undergraduate Internships in Statistics” by Cohen.
[56] See the white paper “Capstones in the Undergraduate Statistics Curricula” by Malone.
[57] A number of innovative programs have been created in recent years to address the need to provide undergraduate statistics students with authentic experiences posing and answering statistical questions. These include DataFest (http://chance.amstat.org/2013/09/classroom_26-3), Explorations in Statistics Research (see the draft manuscript by Nolan et al.), the Summer Institutes in Biostatistics (www.nhlbi.nih.gov/funding/training/redbook/sibsweb.htm), and other Research Experiences for Undergraduates (REUs).

Pedagogical Considerations

The approach to teaching this curriculum should model the correct application of statistics[58]:

• Emphasize authentic real-world data and substantive applications related to the statistical analysis cycle[59]
• Develop flexible problem solving skills
• Present problems with a substantive context that is both meaningful to students and true to the motivating research question
• Include experience with statistical computing and data-related skills early and often[60]
• Encourage synthesis of theory, methods, computation, and applications
• Provide opportunities to work in teams
• Integrate training in professional conduct and ethics[61]
• Offer frequent opportunities to refine communication skills, tied directly to instruction in technical statistical skills
• Incorporate regular assessment to provide authentic feedback.

[58] Just watching instructors analyze data is insufficient. Students need repeated experiences undertaking analysis of real-world data. It is also important that instructors have a history of such experiences (see the 2014 ASA/MAA guidelines for teaching statistics, http://magazine.amstat.org/blog/2014/04/01/asamaaguidelines).
[59] While the GAISE college report (www.amstat.org/education/gaise) focuses on the introductory statistics course, many of its tenets are broadly applicable for the principled teaching of statistics.
[60] Our experience has been that programs that require work with real data in the first year or two tend to be able to offer more substantive real experiences (e.g., advanced data analysis or capstones) in later years.
[61] The American Statistical Association issued a statement on continuing professional development (www.amstat.org/education/cpd.cfm). Statisticians are encouraged to undertake continuing professional development: (1) in methodology and practice, by keeping abreast of new techniques and theory, staying connected with best practice, growing in areas not previously studied (or refreshing forgotten material), and gathering ideas and direction for future research; (2) in technology, by learning about new computational techniques and software tools and by staying on top of trends in technology and new sources of data that are creating major new opportunities for statisticians; (3) in subject matter needed for successful collaboration with other disciplines, to strengthen the interdisciplinary contributions and capabilities of statisticians; and (4) in career success factors such as communication, leadership, and influence skills, which are vital to the impact of individual contributions and the visibility of our profession.
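The curriculum for majors names resampling, including bootstrapping and permutation tests, among the methods students should meet. As a rough illustration only, the two ideas might be sketched in Python (one of the environments the guidelines mention) using nothing beyond the standard library. The function names, the toy data, the number of resamples, and the percentile-style interval are all assumptions made for this sketch; the guidelines do not prescribe any of them.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=2000, level=0.95, seed=0):
    """Approximate percentile bootstrap interval for the mean (illustrative)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - level
    # Read off approximate lower and upper percentiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values, then split them into two relabeled groups.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        )
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]  # made-up values
    print(bootstrap_mean_ci(sample))
    print(permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15]))
```

Fixing the seed keeps the output reproducible, in the spirit of the guidelines' emphasis on well-documented and reproducible analyses. A course would likely move on to refinements such as other interval constructions.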

CURRICULUM TOPICS FOR MINORS OR CONCENTRATIONS

It is challenging to develop the capacity to be able to analyze data in the manner we describe within the constraints of an undergraduate program that might include 10–12 courses. These issues are even more difficult to address for minor programs or concentrations, which typically feature a much smaller number of courses as part of their requirements[62].

In some cases, however, statistics minors or concentrations for quantitatively oriented students in fields such as biology, mathematics, business, and behavioral and social science or those planning to teach at the K–12 level may be more feasible than a full statistics major[63]. Institutions need to design such programs to ensure graduates possess a core set of useful skills. These programs will necessarily be more varied than major programs. The core of a minor or concentration in statistics should consist of the following:

• General statistical methodology (e.g., statistical thinking, descriptive statistics, graphical display, estimation, testing, resampling)
• Statistical modeling (e.g., simple and multiple regression, confounding, diagnostics)
• Facility with professional statistical software, along with data management skills
• Multiple experiences analyzing data and communicating results

The recommendations for minors and concentrations focus on statistical fundamentals, data technologies, and communication and are intended to ensure students develop significant data-related skills, understanding of key statistical concepts, and perspective on the field of statistics[64]. The number of credit hours for minors or concentrations will depend upon the institution.

Additional topics to consider include applied regression; design of experiments; statistical computing; data science; theoretical statistics; categorical data analysis; time series; Bayesian methods; probability; database systems; and a capstone, internship, or similar integrative experience[65]. Ethics is another key topic to integrate into these courses. For many students, a methods course in an application area might be an appropriate option. Courses from other departments with substantial statistical content might be allowed to count toward a statistics minor or concentration.

[Figure: graphic showing the terms Statistics, data science, data mining, data process, machine learning, pattern recognition, and visualization]

[62] See Cannon et al. (2001), “Guidelines for undergraduate minors and concentrations in statistical science,” Journal of Statistics Education, 10(2), www.amstat.org/publications/jse/v10n2/cannon.html.
[63] There is a pressing need for additional K–12 teachers with the capacity to teach the Common Core State Standards for Mathematics. See the forthcoming ASA SET (Statistical Education of Teachers) report for specific guidance.
[64] A minor in mathematical statistics also could be considered, but it may be challenging to ensure students develop sufficiently strong computational and data-related skills. A concern is that an emphasis on probability and inference may leave these students less prepared for the job skills expected by employers.
[65] A capstone also might include experience in another content area (e.g., health, education, business, sociology, or biology). See also the white paper “Capstones in the Undergraduate Statistics Curricula” by Malone.

ADDITIONAL POINTS

Relationship with high-school and community college courses in statistics: The dramatic growth of the number of students completing the Advanced Placement Statistics course[66] and the augmented role for statistics as part of the Common Core State Standards for Mathematics have increased the exposure of the discipline at the high-school level[67]. As a result, colleges and universities may need to re-evaluate their introductory courses[68].

The number of students studying introductory statistics courses at two-year (community) colleges has increased to more than 137,000 per year[69] (larger than the total enrollment in calculus classes at this level, up from a previous ratio of 10 calculus sections per statistics section in the 1960s). This shift reflects the belief that statistics is a universal discipline, not just needed for a handful of students, but required for a number of disciplines and recommended for many others.

Anecdotal evidence suggests many statistics majors are transferring to universities from community colleges[70]. A key question is how to facilitate this transfer and ensure students can successfully undertake preliminary coursework and general education requirements prior to completing a statistics degree at another institution. Further efforts are needed to streamline articulation agreements with community colleges and to support faculty development and curricular development at two-year colleges[71].

[66] In 1997 (the first year it was offered), a total of 7,667 students took the Advanced Placement Statistics exam. This number increased to 98,033 in 2007 and has increased to more than 185,000 in 2014, making it one of the top 10 largest AP exams.
[67] See www.corestandards.org/Math/Content/HSS/introduction for an overview of statistics and probability topics at the high-school level.
[68] A number of innovative approaches have been suggested for this problem. One approach is to consider a year-long introductory statistics course at the undergraduate level, in which students who have completed Advanced Placement Statistics would begin in the second semester. Offering a sequence of courses (e.g., Applied Statistics I and Applied Statistics II) would facilitate integration of additional topics that are not feasible within the syllabus of a single course.
[69] See the Conference Board of the Mathematical Sciences 2010 report (www.ams.org/profession/data/cbms-survey/cbms).
[70] Data from University of California/Berkeley indicate these numbers are large and increasing (Deb Nolan, personal communication).
[71] See the white paper “The Key Role of Community Colleges to Support the Undergraduate Teaching of Statistics” by Horton et al. and the California online student-transfer information website, www.assist.org/web-assist/welcome.html. Revision of lower-division introductory statistics courses to introduce computing and data science skills will facilitate articulation in the future. A major challenge will be faculty development for instructors in two-year colleges.

Relationship with master’s programs in statistics: Graduates from undergraduate programs in statistics are generally employable as analysts or in similar positions that use a number of statistical skills. In addition, a bachelor’s degree can and should be considered an attractive option as a liberal arts degree. Both bachelor’s and master’s graduates are needed to help address the shortage of workers with the skills to make evidence-based decisions informed by data.

There are differences between the learning outcomes of master’s programs and bachelor’s programs related to level, breadth, and depth[72]. There has been the presumption that master’s graduates are statisticians and undergraduates are not[73]. We disagree. Bachelor’s programs should prepare students to be practicing statisticians[74]. Programs with both master’s and bachelor’s programs can provide bachelor’s students access to master’s courses. Five-year master’s programs (in which students simultaneously receive a bachelor’s and master’s degree) may be an attractive option for many students.

The growing number of statistics graduates at the bachelor’s level may have implications for the structure of and content for master’s programs[75]. Further efforts to assist with professional development and continuing education are needed to help ensure that bachelor’s graduates have the requisite skills to stay engaged with new developments in the field of statistics.

[72] The ASA Guidelines for Master’s Programs in Statistics (http://magazine.amstat.org/blog/2013/06/01/preparing-masters) recommend: (1) Graduates should have a solid foundation in statistical theory and methods; (2) Programming skills are critical and should be infused throughout the graduate student experience; (3) Communication skills are critical and should be developed and practiced throughout graduate programs; (4) Collaboration, teamwork, and leadership development should be part of graduate education; (5) Students should encounter non-routine, real problems throughout their graduate education; and (6) Internships, co-ops, or other significant immersive work experiences should be integrated into graduate education. We note that many of these recommendations also apply to undergraduate programs.
[73] This may be due to undergraduate programs in statistics being rare until recently. The growth in availability of data science jobs, many of which just require a bachelor’s, may change this impression.
[74] Most undergraduate programs are not intended to train accredited (professional) statisticians, though some graduates may reach this level through work experience or further study. The American Statistical Association allows graduate statisticians not yet eligible for accreditation because of a lack of experience to be designated GStat (stattrak.amstat.org/2014/05/01/gstat). We recommend the development of a similar pathway for those with training at the undergraduate level.
[75] In particular, it may be feasible for master’s students with a BS in statistics to get their MS in statistics in a substantially shorter time, or undertake more advanced course work as part of their master’s degree.

Teaching the theoretical underpinnings of statistics: Understanding the theoretical underpinnings of statistical methods is a vital component of modern statistical practice. While we do not presume to specify how the ideal statistical theory course (often called “Mathematical Statistics” or “Statistical Inference”) should be structured, we do believe that aspects of the traditional probability/inference sequence, with its emphasis on large sample size approximations and lists of distributions, do not fully capture current statistical practice. A lively panel discussion from JSM 2003 raised many relevant issues[76]. A modern statistical theory course might, for example, include work on computer-intensive methods and non-parametric modeling. Such a course should provide students with an overview of statistics and statistical thinking that builds on their introductory statistics courses. It may be useful to incorporate computing, data-related, and communication components in this class. If included early on in a student’s program, it will help to provide a solid foundation for future courses and experiential opportunities. Because the traditional mathematical statistics course requires probability, which in turn often necessitates three semesters of calculus as a prerequisite, students often take the theoretical statistics course late in their programs. This sequence precludes other upper-level applied statistics courses building on this important theoretical foundation. Some institutions have been successful in providing students with earlier access to the theoretical underpinnings of the discipline. For example, Brigham Young University splits the traditional probability course into two courses: one with a focus on probability and discrete random variables (taught at the sophomore level) and a more advanced course (with a focus on continuous random variables and additional advanced topics in inference).

[76] See www.amstat.org/sections/educ/MathStatObsolete.pdf for details.

Learning outcomes and assessment: There is a growing awareness of the importance of learning outcomes (a detailed list of what a student is expected to know, understand, and demonstrate after completing a program) and assessment of these learning outcomes[77]. Many internal and external groups (such as accreditors, legislators, parents, and students) are calling upon institutions to demonstrate accountability by defining learning goals and objectives at the program level (in addition to the course level) and devising strategies for assessing whether these goals and objectives are being met.

Assessments can be structured in a number of ways. They can be direct (e.g., tests and projects) or indirect (e.g., surveys and focus groups). For higher-order thinking skills, which encompass much of a statistics program, assessments should be relevant, open-ended, and complex[78]. A sound assessment plan will include indication of where (which courses?, which experiences?) students are expected to develop the skills, and when they are expected to be introduced to, practice, and master the skills. Further work is needed to identify appropriate learning outcomes and assessment strategies for statistics programs.

[77] See the white paper by Chance and Peck “From Curriculum Guidelines to Learning Objectives: A Survey of Five Statistics Programs.”
[78] They should be “authentic.” See, for example, Gould (2010) “Statistics and the modern student,” International Statistical Review, 78(2):297–315 and Brown and Kass (2010) “What is statistics?,” The American Statistician, 63(2):105–110 and associated discussion and rejoinder.
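The discussion of teaching theoretical underpinnings contrasts large sample size approximations with computer-intensive methods, and an earlier note observes that students need to be able to simulate to check analytic results. One way a course might connect the two is a small Monte Carlo experiment that estimates how often the textbook large-sample 95% interval for a mean actually covers the true mean. The sketch below is only an illustration: the Exponential(1) population, the sample sizes, the number of simulations, and the function name are assumptions chosen for this example and are not taken from the guidelines.

```python
import math
import random
import statistics


def normal_ci_coverage(n=10, n_sims=2000, seed=1):
    """Estimate coverage of the large-sample 95% interval for a mean.

    Samples are drawn from an Exponential(1) population (true mean 1),
    an assumed skewed example chosen only for illustration.
    """
    rng = random.Random(seed)
    z = 1.96  # standard normal quantile used by the large-sample interval
    true_mean = 1.0
    covered = 0
    for _ in range(n_sims):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        center = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / math.sqrt(n)
        # Count the simulated interval if it contains the true mean.
        if center - half_width <= true_mean <= center + half_width:
            covered += 1
    return covered / n_sims


if __name__ == "__main__":
    for size in (10, 200):
        print(size, normal_ci_coverage(n=size))
```

For small samples from a skewed population, the estimated coverage typically falls short of the nominal 95%, and it moves closer as the sample size grows. That pattern is the kind of insight simulation can give before, or alongside, a formal derivation.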

NEXT STEPS

Through our process and discussions, a number of issues arose that we believe merit further exploration over the coming years, but were outside our purview, including the following:

Faculty development: The American Statistical Association strategic plan describes the importance of faculty development. Efforts to create and share additional activities, projects, sample syllabi, and model courses will be useful for faculty teaching this curriculum.

Engagement with two-year colleges: Community colleges are a large, growing, and increasingly important component of the United States higher education system. Additional efforts are needed to coordinate statistics instruction at the two-year college level, raise the profile of statistics majors at these institutions, and facilitate articulation agreements for transfer to four-year institutions.

Surveys of graduates and employers: Better information on the career paths of the growing number of undergraduate statisticians in the work force and surveys of employers would help to guide future curricular changes.

Certification/accreditation pathway: The ASA now provides an entry-level pathway for master’s-level statisticians without sufficient work experience to prepare for professional accreditation. While the ASA professional statistician accreditation requires an advanced degree, we believe that there may be other types of certification or accreditation that might be appropriate for workers with a bachelor’s degree.

Multiple pathways for introductory statistics: This might be an opportunity for the ASA to lead an effort to reassess curricula for a variety of introductory statistics courses. This effort might include delineating model courses for students at two-year colleges, students who have completed AP Statistics, and/or those planning to major in statistics.

Periodic review: While we have endeavored to provide a flexible yet specific document, the fast-changing nature of the discipline of statistics will necessitate a review of these curriculum guidelines. Undertaking such a regular review at least every five to eight years would be warranted.

We encourage the wider statistical and mathematical sciences community to explore next steps for each of these items.

CLOSING

These guidelines are intended to provide an overview of a principled approach to ensure that undergraduate statistics majors have the appropriate skills and ability to tackle complex and important data-focused problems. Additional resource materials and an annotated bibliography are available at www.amstat.org/education/curriculumguidelines.cfm.
