Data Analysis in

Course at a Glance

This course will provide an introduction to reproducible data analysis with R (see Syllabus).

Instructor

Gabriel Baud-Bovy

Credits: 5

Synopsis

This course aims to give students a methodology for analyzing experimental results, from organizing the data to writing a report. It includes:

- an introduction to R
- an introduction to reproducible research with R
- examples of statistical analysis with R

During this course, students will analyze their own data and are expected to read, before each class, the material that will be made available on this page. The final grade will be based on the evaluation of a report demonstrating familiarity with the concepts and methods presented in the course.

As an editor, the instructor will use Notepad++ (together with NppToR) on a Windows machine, but students may use other tools (e.g., RStudio, Emacs+ESS, LyX, TeXworks). The course will use Markdown as its typesetting language. Those who wish to work with LaTeX and/or generate PDF files will also need to install MiKTeX.

Syllabus

Total of 15 hours

- Class 1: Case study, reproducible research
- Class 2: R fundamentals
- Class 3: Exploratory data analysis and graphical methods in R
- Class 4: Basic statistics
- Class 5: To be determined

There will be a final examination decided by the instructor.

Prerequisites

- The course assumes some familiarity with programming concepts and data structures (MATLAB, C/C++, Java or any other programming language). Contact the instructor if you have never programmed anything.
- Install R on your laptop.
- Bring a data set that you want to analyze.

Reading List

- Baud-Bovy (2014) Notes on reproducible research with R.
- R Core Development Team. “An introduction to R”.

References

- Baud-Bovy (2014) Notes on reproducible research with R. Draft.
- R Core Development Team. “An introduction to R”.

Online resources (optional)

- The R Project website: http://cran.r-project.org/ (to download R, R packages and R documentation)
- pandoc: http://johnmacfarlane.net/pandoc/index.html
- Notepad++: http://notepad-plus-plus.org/
- NppToR: http://sourceforge.net/projects/npptor/
- MiKTeX: http://miktex.org/
- RStudio: http://www.rstudio.com/
- knitr website: http://yihui.name/knitr/

Reference books (optional)

- Dynamic Documents with R and knitr (2013). CRC Press.
- Christopher Gandrud (2014) Reproducible Research with R and RStudio. CRC Press.
- Introductory Statistics with R.
- Robert Kabacoff. R in Action.
- Brian S. Everitt, Torsten Hothorn. A Handbook of Statistical Analyses Using R.
- Norman Matloff. The Art of R Programming: A Tour of Statistical Software Design.

Venue

Istituto Italiano di Tecnologia, Via Morego 30, Bolzaneto, Genova

Course dates

March/April 2014