E-Book: NEW TOOLS, TECHNOLOGIES HELP BOOST BIG DATA INITIATIVES
September 2016

ADVANCED ANALYTICS TOOLS HELP USERS FIND MEANING IN BIG DATA

Predictive modeling, machine learning and other analytics applications hold the key to unlocking the business value of big data. But it takes a lot of tools, and effort, to derive the benefits.

BY CRAIG STEDMAN

1. A Better Way to Analyze Data
2. Big Data Drives Business Benefits
3. Going Deeper on Data Analytics
4. Hurdles Along the Big Data Path
5. Sold on Improving Access to Data

Before it deployed a Hadoop cluster five years ago, retailer Macy’s Inc. had big problems analyzing all of the sales and marketing data its systems were generating, and the problems were only getting bigger as Macy’s pushed aggressively to increase its online business, further ratcheting up the data volumes it was looking to explore.

The company’s traditional data warehouse architecture had severe processing limitations and couldn’t handle unstructured information such as text; historical data also was largely inaccessible, typically having been archived on tapes that were shipped to off-site storage facilities. Data scientists and other analysts “could only run so many queries at particular times of the day,” said Seetha Chakrapany, director of marketing analytics and customer relationship management systems at Macy’s. “They were pretty much shackled. They couldn’t do their jobs.”

The Hadoop system has alleviated that situation, providing a single architecture that supports basic business intelligence and reporting processes along with advanced analytics applications, Chakrapany said. The cluster “could truly be an enterprise data analytics platform” for Macy’s, he added. Along with analytics teams using the cluster, thousands of business users in marketing, merchandising, product management and other departments are already accessing hundreds of BI dashboards fed by the system.

But there’s a lot more to the big data environment than the Hadoop cluster alone. At the front end, for example, Macy’s has deployed a variety of analytics tools to meet different application needs. For statistical analysis, the Cincinnati-based retailer uses SAS and Microsoft’s R Server, which is based on the R open source statistical programming language and was originally developed by Revolution Analytics, a software vendor that Microsoft acquired in April 2015.

Several tools provide predictive analytics, data mining and machine learning capabilities, including H2O; Salford Predictive Modeler; the Mahout open source machine learning platform; and KXEN, an analytics technology bought by SAP three years ago. Also in the picture are Tableau’s data visualization tools and AtScale’s BI on Hadoop technology.

The different analytics tools are key elements in making effective use of the big data architecture, Chakrapany said in a presentation and follow-up interview at Hadoop Summit 2016 in San Jose, Calif. Automating the advanced analytics process through statistical routines and machine learning is a must, he noted. “We’re constantly in a state of experimentation. And because of the volume of data, there’s just no humanly possible way to analyze it [manually]. So we apply all the statistical algorithms to help us see what’s happening with the business.” That includes analysis of customer, order, product and marketing data, plus clickstream activity records captured from the Macys.com website.

Similar scenarios are increasingly playing out at other organizations, too. As big data platforms such as Hadoop, NoSQL databases and the Spark processing engine become more widely adopted, the number of companies deploying advanced analytics tools that can help them take advantage of the data flowing into those systems is also on the rise.

In an ongoing survey on the use of BI and analytics software conducted by TechTarget, 26.2% of some 4,000 respondents as of late August said their organizations had installed predictive analytics tools. And looking forward, predictive analytics topped the list of technologies for planned investments over the next 12 months. It was cited by 38.3% of the respondents, putting it above more mainstream business intelligence technologies such as data visualization, self-service BI and enterprise reporting.

A TDWI survey conducted in the second half of 2015 also found increasing plans to use predictive analytics software to bolster business operations. In that case, 87% of 309 BI, analytics and data management professionals said their organizations were already active users of the technology or expected to implement it within three years. Other forms of advanced analytics (what-if simulations and prescriptive analytics, for example) are similarly in line for increased usage, according to a report on the survey that was published last December (see “Predicting High Growth”).

[Chart: “Predicting High Growth.” The use of predictive analytics software and other advanced analytics tools and techniques by surveyed organizations is expected to increase sharply over the next three years. Legend: in use now; planning to use within three years. Figures as printed: predictive analytics, 59% and 28%; what-if simulations, 57% and 22%; prescriptive analytics, 51% and 18%; social media analytics, 46% and 17%; text analytics, 45% and 15%. Source: TDWI’s “Operationalizing and Embedding Analytics for Action”; based on responses from 309 BI, analytics and data management professionals.]

Machine learning tools and other types of artificial intelligence technologies, deep learning and cognitive computing among them, are also getting increased attention from technology users and vendors alike, as analytics teams look to automated algorithms to help make sense of data sets that are getting larger and larger.

BIG DATA DRIVES BUSINESS BENEFITS

Progressive Casualty Insurance Co. is another company that’s already there. The business value of big data analytics, and the tools that make it possible, is a tangible thing for Progressive and its auto policy customers, said Brian Durkin, an innovation strategist in the company’s enterprise architecture group.

The Mayfield Village, Ohio, insurer uses a Hadoop cluster partly to power its Snapshot program, which awards policy discounts to safe drivers based on operational data collected from their vehicles through a device that plugs into the diagnostic port. Progressive has handed out more than $560 million worth of discounts since launching the program in 2008, Durkin said. “It’s not some little science experiment that we’re running,” he noted. “We’re fully invested in it, and it means a lot to our customers.”

To track participating drivers and calculate discounts, huge volumes of data get processed and analyzed in the cluster, which, like the Macy’s cluster, is based on the Hortonworks distribution of Hadoop. Thus far, Progressive has collected data on 2.4 billion driving trips by customers, and the company retains all the information. One of the goals of the program is to identify bad habits that people can be alerted to, for example, hard braking, which “is very predictive of bad driving behavior,” Durkin said. For that kind of analysis, he added, “it’s the older data that’s more valuable. So we have to keep everything and analyze everything.”

Crunching the data requires plenty of processing (Continued on page 6)

Data Scientists Hard to Find, and Hold Onto

One of the biggest things that can put a damper on big data analytics programs has nothing to do with deploying and managing advanced analytics tools: it’s the challenge of hiring and retaining skilled data scientists who can put the tools you’ve installed to good use.

In a survey of business intelligence, analytics and data management professionals conducted by TDWI in the second half of 2015, a lack of skilled personnel ranked second on a list of top challenges that organizations face in trying to embed analytics processes into their business operations.

[...]stream enterprises, getting tangible business benefits through effective analytics is becoming a higher-profile priority. “The technology does what it says it will,” Gartner analyst Merv Adrian said at the 2016 Pacific Northwest BI Summit in Grants Pass, Ore. But, he added, integration problems and inadequate skills on both the data management and analytics sides often hold companies back from making good on their investments.

Bill Loconzolo, vice president of data engineering and analytics at finance and accounting software [...]

[...] the work is that they’re going to do: how much money they’re going to put in the pockets of people at tax time,” Loconzolo said. “That’s very attractive to data scientists. They want to solve real problems.”

AN OFFER THEY CAN’T REFUSE

But even if you do find the data scientists you need, it may be hard to keep them from jumping ship when other companies come calling with job offers. “Data scientists in my part of the world are changing jobs every three months,” said Mike Ferguson, [...]