R / Python (Slides)

R / Python
Why and How to Get Started

What do you use?
• Use SPSS, Stata, or SAS
  • with the GUI/menus
  • with syntax
• Use Excel
  • for data management
  • for data analysis
• Use Matlab, R, or Python

Python
• General programming language
  • Create applications, run websites, interface with systems
  • Has all the elements of other languages
• Created by groups of computer scientists
  • Runs fast and stable for production workflows
  • Simplest of languages, one best way to do any action

R
• Statistical language
  • Built to do math and work with datasets
  • Can utilize some tools from other languages
• Created by statisticians
  • Fast and intuitive for analysis, slower at processing
  • Many statisticians have increased its capabilities

Both
• Use scripting
  • The code/syntax is intended to be saved in a script file
  • The code can be re-played to reproduce the output
• Open Source & Extensible
  • Anybody can create new add-ins ("packages")
  • People can NOT change the original without permission
• Free to use

As originally built, you:
Type instructions at a prompt…
[Screenshots: Python Shell, R Console]
…and get output like this
[Screenshots: R - Regression, Python - Frequency Table]

IDEs
• Spyder (Python)
• RStudio (R)
Panes labeled in the screenshots:
• Script window (with "Run current selection")
• Console
• Environment / Variable explorer
• Help
• History
• Files
• Plots

So, why all the buzz?
• Free software that can do everything SPSS, Stata, SAS, and Excel can do.
• Massive improvements in ease of use through packages with convenience functions.

What to Install
• R
  • Install R from CRAN
  • Install RStudio
• Python
  • Install Anaconda
All of these are cross-platform with regular installers.

No Installation Needed
• RStudio Cloud (beta)
  • https://rstudio.cloud/
• Python Anywhere
  • https://www.pythonanywhere.com/
• Both require free accounts.

Old and New
• R is an implementation of S (created in 1976); it was first released in 1995 and reached v1.0 in 2000. Hadley Wickham's dplyr package was introduced in 2014.
  The RStudio IDE was released in 2011 with v1.0 in 2016.
• Python was created in 1990. The IPython interactive shell was released in 2001. The data management package Pandas was first released by Wes McKinney in 2008; v1.0 was released in 2020. The Spyder IDE was released in 2009.

Misconceptions
• Python is better than R
  • Python can do a wider variety of computing tasks than R. Python has breadth; R has depth in data/statistics.
  • People are not judging the languages themselves; they are judging the entire ecosystem.
• Python is easier than R
  • Python is the simplest of the programming languages. R was not designed as a general-purpose programming language.
  • Since R was made by statisticians, it does some things differently than general programming languages.

Demonstration
• Python
• R

Functions & Packages
The building blocks of computer languages

Have you used Functions?
word(stuff)
• Excel: =AVERAGE(V1:V5)
• SPSS: COMPUTE average = MEAN(v1, v2, v3, v4, v5).
• Stata: egen average = rowmean(v1 v2 v3 v4 v5)
• SAS: average = mean(of v1-v5);

Creating a Scale Index
ncc_score = (ncc1 + ncc2 + ncc3 + ncc4 + ncc5) / 5
ncc_score = SUM(ncc1, ncc2, ncc3, ncc4, ncc5) / 5
ncc_score = MEAN(ncc1, ncc2, ncc3, ncc4, ncc5)
ncc_score = SCALE(ncc)           "Convenience Function"
ncc_score = SCALE(ncc, "sum")    Added "Argument"

Function Names and Arguments

Functions & Objects

Packages
• Package: A group of functions installed together
  • Packages may have functions with the same name!
• Install: Copy instructions to your computer
• Load/Attach/Import: Put instructions in memory

Media Literacy -> Package Literacy
• Who wrote it?
• How long has it been around?
• How many other people use it?
• Where is the code?
• How good is the documentation?
• What kind of testing has been done?
• Does it give the same results?
• What do other people say about it?

Is R / Python for You
Check in with yourself:
• Do functions and arguments make sense?
• Can you be detail oriented?
• Can you keep track of things that change?
• Are you good at thinking systematically?

It's okay if you answered "no". You can still use R.

If Not Yet
• Practice with functions in software you know.
• Use Jamovi (R) and have it show the syntax.
• Practice reading syntax and identifying functions and objects.

Which to Pick
• Start with whichever one…
  • … the people around you use
  • … has the functions you need
  • … looks easier to read for you
• Use R if you mostly work with data tables and do statistics
  • Tends to get new statistical procedures first
  • Easier to read and understand
• Use Python if you often do non-statistical programming
  • More and better non-tabular text-processing tools
  • Better integrates with applications

R + Python
• Use R in Python (rpy2)
• Use Python in R (reticulate)
• Use R or Python in SPSS, Stata, and SAS
• Some features in R get "ported" to Python
• Some features in Python get "ported" to R
• Use SQL in R or Python

Where to Start?
• Data Management
  • Many, many functions
  • Python: pandas
  • R: tidyverse, data.table, or sqldf
• Statistical Analysis
  • Formula Notation
  • Python: statsmodels
  • R: base R, afex/car (ANOVA), lme4 (Mixed Models), etc.
• Graphing
  • Python: seaborn (uses matplotlib)
  • R: ggplot2, ggformula, or lattice

Interpreting Tutorials

Recognizing Packages
In R:
• library(package)
• require(package)
• package::function()
In Python:
• import package as nickname
• nickname.stuff

What to Look For
Functions & Methods:
Objects:

Learning Packages & Functions

Built-In Datasets
R
• data() # see all the datasets included
• data(name) # make it available
Python
• Some options, but none great
• Just use R's
  • https://vincentarelbundock.github.io/Rdatasets/
• Both allow URLs in read_csv

Creating Data
Use Vectors
It is very common to use vectors as variables without making them into a dataframe.
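# Two vectors compared directly, no dataframe needed
A <- c(1,2,3,4,5)
B <- c(7:20, 200)
t.test(A,B)

# The same kind of test from a dataframe, using formula notation
group <- 1:2          # recycled to match the 20 values
value <- rnorm(20)
data <- data.frame(group, value)
t.test(value ~ group, data=data)

Jupyter Notebook
• File extension is .ipynb

Our InfoGuides
• https://infoguides.gmu.edu/learn_r
• https://infoguides.gmu.edu/learn_python
• DataCamp
• Carpentries
• CodeSchool
• Coursera

Jamovi
jamovi.org

Syntax Mode
# Descriptives
jmv::descriptives(
    data = data,
    vars = "fate")

Parts labeled on the slide:
• Comments: # Descriptives
• Packages: jmv
• Functions: descriptives()
• Data Specification: data = data
• Arguments: vars
• Values: "fate"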
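
For readers starting on the Python side, the "Creating a Scale Index" and "Recognizing Packages" slides can be sketched roughly as follows. This is only an illustration: it uses the standard library's statistics module so it runs without installing anything (pandas would be the more usual choice for real datasets), the item names ncc1 to ncc5 come from the slides, and the response values and the scale() helper are made up for this example.

```python
import statistics as stats  # the "import package as nickname" pattern

# Hypothetical responses for one person on the five ncc items (made-up values).
responses = {"ncc1": 4, "ncc2": 5, "ncc3": 3, "ncc4": 4, "ncc5": 2}
items = list(responses.values())

# Three ways to build the same score, from most manual to most convenient.
by_hand = (responses["ncc1"] + responses["ncc2"] + responses["ncc3"]
           + responses["ncc4"] + responses["ncc5"]) / 5
with_sum = sum(items) / len(items)
with_mean = stats.mean(items)  # nickname.function(argument)


def scale(values, method="mean"):
    """Hypothetical convenience function: one name plus an optional argument."""
    if method == "mean":
        return stats.mean(values)
    if method == "sum":
        return sum(values)
    raise ValueError(f"unknown method: {method!r}")


print(by_hand, with_sum, with_mean)
print(scale(items), scale(items, "sum"))
```

When reading a tutorial, the same cues apply: stats is the package nickname, mean and scale are functions, items is an object, and "sum" is a value passed to the method argument.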
