Statistical Computing & Statistical Graphics Newsletter
Volume 26, Issue 1, December 2016

A joint newsletter of the Statistical Computing & Statistical Graphics Sections of the American Statistical Association
[To keep members of our community informed of our section news.]

Welcome to the Statistical Computing and Statistical Graphics newsletter

News from our Chairs: David Poole, Statistical Computing, and Michael Kane, Statistical Graphics, both provide their insights:

DAVID POOLE, CHAIR OF STATISTICAL COMPUTING SECTION

It is hard to believe that 2016 is almost over! As many have noted, this continues to be the era of big data and data science, particularly in the popular media, but now also in industry and academia, with formal degree programs in Data Science increasingly on offer. While there may be some disagreement on the precise definition of a data scientist, it is clear that the most successful practitioners of data science possess a unique combination of analytic and computational skills. Too much reliance on one skill set, at the expense of the other, is an impediment. To this end, Statistical Computing, and its role in the creation of new computing environments, can play a key role in the data science space. Working environments that promote seamless integration of modeling and visualization tools (such as those we typically use in ) with big data and distributed computing tools (such as Hadoop, Spark, HBase, and so on) will be crucial in the ongoing evolution of data science as a discipline. There is also a need for existing algorithms and models to adapt to the computational requirements of big data.

The Computing Section presented a very successful program at the 2016 Joint Statistical Meetings in Chicago. We sponsored no fewer than 9 invited sessions, 4 topic-contributed sessions, and over 75 contributed talks, in addition to our involvement with the various poster presentations. Jointly with Graphics, we honored Bill Cleveland with the Statistical Computing and Graphics Award, in recognition of his many seminal contributions to both computing and visualization. In addition, our Section was a co-sponsor of the highly successful 2016 Women in Statistics and Data Science Conference, which was held in Charlotte, NC.

Preparations for JSM 2017 in Baltimore are well underway. The invited program is complete, but there is still time to submit proposals for topic-contributed sessions (by January 11) and regular contributed sessions (by February 1). Volunteers for session chairs are always welcome! The Computing Section is also co-sponsoring a 2017 Data Challenge organized by the Section on Government Statistics. Contestants will analyze a government dataset and present their findings at the JSM. Please see elsewhere in this newsletter for more information. We also plan to hold another joint Computing/Graphics Data Expo competition in 2018. Further announcements about that dataset will follow in 2017.

Last, but certainly not least, I would like to thank all our 2016 Section Officers for their hard work this year, in particular our current awards chair Patrick Breheny and our outgoing newsletter editor Usha Govindarajulu. Both have contributed greatly to our successful year.

David Poole
Section on Statistical Computing
Section Chair (2016)

Table of Contents

- Chair News
- Program Chair news
- Secretary/Treasurer reports
- Computing awards news
- Council of Sections news
- News from R Studio & article
- Computing news from SAS
- Announcement & Notes
- List of officers

MICHAEL KANE, CHAIR OF STATISTICAL GRAPHICS SECTION

Here, at the end of 2016, the popularity of data science continues to increase. Visualization is an essential component of making sense of data, whether it’s to make new discoveries or communicate findings. As a result, we’ve once again seen a high level of participation and enthusiasm in our community. Thanks to the Program Chair, speakers, organizers, and chairs of all our sessions at this year’s Joint Statistical Meetings for all of their work and for making the Graphics Section so successful.

This year we also recognized Bill Cleveland, “for his substantial contributions to Statistical Computing and Graphics, which have transformed the way statisticians work with data.” He is a world-class researcher who has made fundamental contributions to data visualization and computing. He was not only one of the first people to use the term “data science,” but he was also the first person to identify and prescribe the skill sets needed to effectively understand and analyze data. Those of us who have been fortunate enough to work and collaborate with Bill have benefitted greatly from his experience and perspective.

We are currently preparing for next year’s meeting in Baltimore. The invited papers for JSM 2017 are submitted, and the awards committees are evaluating nominations for the Student Paper Competition and the John Chambers Statistical Software award. We will once again honor the winners at our annual joint section mixer.

If you would still like to participate, it is not too late. You can submit proposals for topic-contributed sessions and poster sessions until February first. You can also volunteer to organize sessions for JSM or, if you would like to be an officer, let us know which positions you’d like to hold.

Finally, I’d like to thank Michael Friendly, Naomi Robbins, Di Cook, Yihui Xie, Ken Shirley, Leanna House, Sarah Hardy, Rebecca Nugent, Stacy Lindborg, and Rick Peterson for their work as Section Officers. Also, I’d like to thank Usha Govindarajulu, the Computing Section’s Publication Officer, for the work she has done putting together the newsletters.

Michael Kane
Section on Statistical Graphics
Section Chair (2016)


Statistical Computing Program Chair Report (Wendy Martinez):

We had an interesting and varied JSM 2016 program for our members. We hope that those of you who attended the meetings found the sessions to be interesting and informative. Here is a summary of Statistical Computing at the JSM 2016. As always, thank you for supporting our profession through your contributions at the meetings!

- 12 proposals were submitted for invited sessions.
- 9 of those invited sessions were on the program.
- 125 contributed abstracts were submitted.
- 4 topic-contributed sessions were held, one of which honored Bill Cleveland.
- 10 contributed sessions were organized.
- 1 speed poster session took place.
- 2 round tables were conducted.

Statistical Graphics Program Chair Report (Yihui Xie):

At the 2016 Joint Statistical Meetings in Chicago, the Section on Statistical Graphics (SSG) sponsored three invited sessions:

- Recent Advances in Information Visualization
- Applied Data Visualization in Industry and Journalism
- Interactive Visualizations and Web Applications for Analytics

and three contributed sessions:

- Methods and Applications of Statistical Graphics
- Toward Better Communication of Information with Statistical Graphics
- When the Plot Is Not the End: Advances in Computing and Reasoning on Data Visualizations

as well as a contributed poster session.

In addition to the SSG program, we also worked with the Section on Statistical Computing on the Statistical Computing and Graphics Student Awards, Mixer, and the John M. Chambers Software Award. This year we had a special session for the Statistical Computing and Graphics Award, received by Bill Cleveland of Purdue University, in recognition of his substantial contributions to Statistical Computing and Graphics, which have transformed the way statisticians work with data.

Lastly, I would like to welcome the incoming SSG program chair, Kenny Shirley (Amazon), and we look forward to Kenny's work on the JSM 2017 program, which I believe will be fun, since we have been thinking of inviting some pioneers in statistical graphics to share their ideas from 30-40 years ago.


Statistical Computing Program Chair Elect Report (Eric Laber):

The computing section will sponsor 5 invited sessions including sessions on networks, reproducibility, and large-scale Bayesian computing. These sessions are going to be really exciting for 2017!!

CALL FOR VOLUNTEERS TO CHAIR CONTRIBUTED SESSIONS

For those going to the 2017 JSM, I am inviting you to serve as a contributed session chair. It is a great way to network and to support your section. To volunteer, please email me by February 1, 2017.

CALL TO SUBMIT CONTRIBUTED SESSION ABSTRACTS

We encourage you to organize a topic-contributed session or to submit a contributed abstract for the 2017 Joint Statistical Meetings (JSM). There are two types of contributed sessions: regular and topic-contributed. The submission period for topic-contributed session proposals (paper or panel) opened on December 1, 2016 and will close on January 11, 2017. For more information on topic-contributed sessions, refer to the website. Note that proposals for topic-contributed sessions must be submitted online by the deadline.

TOPIC-CONTRIBUTED PAPER SESSIONS

These sessions consist of a collection of contributed talks and discussions (if desired) that share a common topic. There are three (3) format options for a Topic-Contributed paper session:

1. Five (5) papers
2. Four (4) papers and one (1) discussant
3. Three (3) papers and two (2) discussants

TOPIC-CONTRIBUTED SESSIONS

Online submission of proposals closes January 11, 2017. Accepted sessions must send abstracts to ASA by February 1, 2017.


Statistical Computing Program Chair Elect Report (Eric Laber) - continued:

Key Dates for JSM 2017 (check the JSM 2017 website for current information)

December 1, 2016 12:01 AM - February 1, 2017 11:59 PM
Online submission of abstracts (all except invited papers and panels)

January 11, 2017
Online submission of topic-contributed session proposals deadline

March 30, 2017 – April 18, 2017
Online abstract editing open

May 1, 2017
Registration and housing open

May 17, 2017
Draft manuscript deadline

June 1, 2017
Early registration deadline

June 2, 2017 – June 29, 2017

Statistical Computing Secretary/Treasurer Report (Genevera Allen):

Membership in the joint Computing and Graphics section and the Statistical Computing section is growing, and the sections are in good financial shape.

In October 2016, the Section on Statistical Computing served as a co-sponsor of the Women in Statistics and Data Science Conference held in Charlotte, North Carolina.

Statistical Graphics Secretary/Treasurer Report (Dianne Cook):

The Graphics section is in a healthy financial state. We would like to fund some activities to support building membership. All ideas are welcome - please email your suggestions to me or to any member of the committee.


Statistical Computing Awards Chair (Patrick Breheny):

The Statistical Computing and Statistical Graphics Sections of the ASA are co-sponsoring two competitions this year: the student paper competition and the John M. Chambers Statistical Software Award.

The student paper competition, which rewards original work in the area of methodological research or some other novel computing or graphical application in statistics, carries with it a cash award of $1,000. The competition is open to anyone who is a student (graduate or undergraduate) on or after September 1, 2016. This year we anticipate awarding four prizes.

The Chambers Award, in contrast to the student paper competition, is an award for the development of statistical software -- nominations must include the software, and judges try out the software to determine how useful it will prove to the statistical and scientific communities. Previous winners include Deepayan Sarkar (lattice), Hadley Wickham (ggplot), and Michael Kane (bigmemory). The award carries with it a cash prize of $1,000, and is open to anyone who is either currently a student or completed her/his last degree after January 1, 2016.

The deadline for applications for both competitions was December 15, 2016, with judging to be completed and winners chosen by January 15, 2017. Winners for both competitions will be given their awards and present their contributions in a special topic-contributed session at the 2017 Joint Statistical Meetings.


Statistical Computing Council of Sections representatives’ report (Mine Cetinkaya-Rundel, Jonathan Lane, John Monahan):

The focus of the discussion at the Council of Sections annual business meetings at the Joint Statistical Meetings in August was increasing and better communicating the value of section membership. It was reported that only 33.7% of members belong to more than one section and that for many ASA members the benefits of belonging to a section were not obvious.

In an effort to increase awareness of section benefits, the COSGB has collected bullet points from each section that summarize section activity and benefits. Sections are encouraged to feature this information on their websites as well as increase their non-JSM presence. Another topic of discussion at the meeting was generating more interest in interest groups and making them more visible to ASA members.

Other items to note:

- Computing section sponsorship of the Women in Statistics and Data Science conference in an effort to increase collaboration as well as non-JSM activities. Sponsorship was used towards student travel awards.
- Working with ASA to centralize the storage of datasets used in Data Expo; this will be in place by the 2018 Data Expo.


R Consortium News from RStudio:

2016 was a year of growth and expansion for the R language. Although firmly anchored in the culture and practices of statistical computing, R has become an important platform for the emerging discipline of data science. In July, the IEEE Spectrum Magazine reported that its annual survey of programming languages ranked R in 5th place. R now competes with Python as a general purpose language for data science. The popularity of R is also expanding R’s user-base to include analysts not originally trained as statisticians. These new users are generally more familiar with SQL style programming, and appear to be fueling the growing interest in the “” collection of R packages which emphasizes a consistent vocabulary of SQL-like programming constructs for data import, manipulation and visualization.

Corporate support for R also became more apparent in 2016 as Microsoft released a new version of SQL Server with in-database support for proprietary R functions that can be accessed from open-source R environments. Corporate support also enabled the growth of the R Consortium, a nonprofit organization organized under the Linux Foundation with the mission to provide infrastructure support for the R language, the R Foundation and the R Community. The R Consortium has provided substantial funding to several projects including the R-Hub build system, the worldwide association of R-Ladies useR groups, the satRdays series of mini-conferences and more. Please note that the current call for proposals by the R Consortium is open until February 10, 2017.

The r-project itself got a facelift in 2016. As of release 3.3.0, R’s icons are based on the new flat logo.

The collection of R packages continued to expand in 2016 with over 2,000 new packages submitted to CRAN: many of these are concerned directly with data science, machine learning and with increasing R’s visualization capabilities. trelliscope, which addresses exploring and visualizing huge data sets, and sparklyr, which lets R users access data and build models on Spark/Hadoop clusters, are but two notable examples of R’s success in big data computing. The continued development of rmarkdown and the many new packages from ROpenSci illustrate the R Community’s commitment to reproducible research. We expect these trends to continue in 2017. Look for CRAN to exceed 10,000 packages in January.

Joseph Rickert
RStudio R Community Ambassador
R Consortium Director


Computing News from SAS:

In November 2016, SAS introduced the 14.2 release of analytical products. Highlights of SAS/STAT® 14.2 include the following:

- The new PSMATCH procedure provides propensity score analysis.
- The new CAUSALTRT procedure estimates causal treatment effects.
- The PHREG procedure now provides time-dependent ROC curves for Cox regression. See Figure 1 for an example.
- The NLIN procedure now provides ESTIMATE and CONTRAST statements.
- The SURVEYIMPUTE procedure provides two-stage fully efficient fractional imputation and fractional hot-deck imputation.
- The FREQ and SURVEYFREQ procedures provide additional agreement statistics.

In SAS/ETS® 14.2, the SPATIALREG procedure analyzes spatial econometric models for cross-sectional data. SAS/IML® 14.2 has enhanced support for in-memory data tables and lists. For other products, see highlights of the 14.2 releases.

The online documentation for the 14.2 release of analytical products now resides in a new framework called the Help Center. This documentation contains links to sample programs and to some of the videos that describe features of SAS.

SAS continues to offer free “How To” videos and training courses, including a free course titled “SAS Programming for R Users.” For students, teachers, and lifelong learners, SAS has enhanced the free SAS® University Edition, which is available for Windows, Mac, and Linux systems.

To stay up to date with SAS activities throughout the year, subscribe to the bimonthly SAS Statistics and Operations Research Newsletter.


Computing News from SAS (continued):

Figure 1: Nonparametric Negative Binomial Model for Count Data Fit by Using PROC GAMPL

Rick Wicklin, SAS

Notes and Acknowledgements:

The newsletter was put together by the Section on Statistical Computing Publications Officer, Usha Govindarajulu, with assistance and contributions from the Chairs of both sections. Thank you to all the Chairs and Officers of both sections for their contributions to this newsletter via officer reports, news, and announcements.

If you would like to make future contributions to our newsletter or have any questions, please feel free to contact the publication officers or any other section officer listed on the last page of this newsletter. Thank you.


ANNOUNCEMENTS:

Data Challenge 2017

Three ASA sections (Computing, Government, and Graphics) are proud to sponsor the Data Challenge 2017 to take place at the JSM 2017 meetings. The contest is open to anyone who is interested in participating, including college students and professionals from the private or public sector. This contest challenges participants to analyze a government dataset using statistical and visualization tools and methods. There will be two award categories – Professional (one level) and Student (three levels). Award amounts will be announced later.

Contestants will present their results in a speed poster session at the JSM and must submit their abstracts to the JSM online system in the usual manner. Presenters are responsible for their own JSM registration and travel costs, and any other costs associated with JSM attendance. Group submissions are acceptable. To enter, contestants must do the following by February 1, 2017:

- Submit an abstract for the Speed Poster session to the JSM 2017 website. (Specify the Government Statistics Section (GSS) as the main sponsor. Abstract submission starts December 1, 2016.)
- Forward the JSM abstract to Wendy Martinez.

The dataset for the GSS Data Challenge 2017 will be the Consumer Expenditure Survey (CE). Public Use data files and documentation (file structure, data dictionary, sample code, etc.) are available here: . Contestants must use some portion of the CE data and can also combine other data sources in the analysis.

Standard tables, showing expenditures and related information for various demographic groups, are available here: . An experimental table showing extremely detailed average annual expenditures and other information for all consumer units (similar to a household or family) in the U.S. is available here: . For more information, see the CE homepage.

Examples of research using the CE data are also available.

- Monthly Labor Review articles:
- "Beyond the Numbers" series:
- "Spotlight on Statistics" and other series


Statistical Computing Section Officers 2016

- David Poole, Chair: [email protected], (973) 360-7337
- Catherine Calder, Chair-Elect: [email protected], (614) 688-0004
- Genevera Allen, Secretary / Treasurer: [email protected], (832) 378-8032
- Eric Laber, Program Chair-Elect 2016-2016: [email protected], (919) 513-7675
- Wendy L. Martinez, Program Chair 2015-2016: [email protected], (202) 691-7400
- Mine Cetinkaya-Rundel, COS Representative 2016-2018: [email protected], (919) 684-5956
- Jonathan Wesley Lane, COS Representative 2016-2017: [email protected], (715) 252-2103
- John Francis Monahan, COS Representative 2014-2016: [email protected], (919) 515-0622
- Usha Govindarajulu, Publications Officer 2014-2016: [email protected], (718) 221-6599
- Patrick Breheny, Awards Chair: [email protected], (319) 384-1584
- Stacy Lindborg, COS Gov Board 2014-2016 (see Statistical Graphics Section Officers)
- Rick Peterson, ASA Staff Liaison (see Statistical Graphics Section Officers)

Statistical Graphics Section Officers 2016

- Michael Kane, Chair: [email protected], (203) 737-4768
- Michael Friendly, Chair-Elect: [email protected], (416) 736-2100 x66249
- Dianne Cook, Secretary / Treasurer: [email protected], +61 (0) 3990 52608
- Kenneth Edward Shirley, Program Chair-Elect 2016-2016: [email protected], (615) 875-3397
- Yihui Xie, Program Chair 2016-2016: [email protected], (859) 230-4326
- Rebecca Nugent, COS Representative 2014-2016: [email protected], (412) 268-7830
- Sarah Hardy, COS Representative 2016-2018: [email protected], (207) 778-7124
- Leanna House, Publications Officer 2016-2017: [email protected], (540) 231-2256
- Patrick Breheny, Awards Chair (see Statistical Computing Section Officers)
- Stacy Lindborg, COS Gov Board 2014-2016: [email protected], (617) 914-1979
- Rick Peterson, ASA Staff Liaison: [email protected], (703) 684-1221

The Statistical Computing & Statistical Graphics Newsletter is a publication of the Statistical Computing and Statistical Graphics Sections of the ASA.

All communications regarding ASA membership and the Statistical Computing and Statistical Graphics Sections, including change of address, should be sent to:

American Statistical Association
1429 Duke Street
Alexandria, VA 22314-3402 USA
TEL (703) 684-1221
FAX (703) 684-2036
[email protected]
