
11/11/2012

Software for Analysis
October 30 & 31, 2012
Kevin J. Comerford, MS, MFA
Assistant Professor / Digital Initiatives Librarian

Software Skills for Researchers

Applications are an integral part of any type of research

• Today’s researcher needs a wide range of software use and management skills:
  – Software evaluation and selection
  – Hardware device evaluation and selection
  – Advanced skills in general use software
  – Advanced user skills in selected specialized software packages
• Also needs skills in:
  – File Management
  – Data/File Conversion
  – Data Management
  – Data Archiving

Software Skills are vital to all areas of research

DataOne Position Description (Climate Scientist)

Required Qualifications: Candidates should have . . . expertise in advanced interactive visualization techniques . . . The candidate must have knowledge and practical experience in developing and using visualization software such as VisTrails or UV-CDAT or other advanced Visualization packages. . . Experience with acquiring and managing spatial data . . . The candidate must have experience in using Python, PERL, or other languages for managing high-volume complex data. The candidate should have familiarity with UNIX and Windows operating systems.

US Government Position Description (Social Scientist)

For GS-11 Grade Level: A. One year of specialized experience equivalent to the GS-9 grade level in Federal service, providing analytic support for policy analysis and social science research related to evaluating legislative, regulatory and/or the delivery of public health or human services programs; performing quantitative analyses related to health and human services programs and policy utilizing statistical software such as STATA, SPSS, SAS or equivalent statistical software.

Research Support Applications

• Project Management Software
  – Basecamp, MS Project, others
  – Wikis, Blogs
  – Lab Notebook Software
• Workflow Software
  – General Workflow Management (Visio, Dia)
  – Experimental/Scientific Workflow (MyExperiment, Kepler)
• Administrative Management
  – Office Applications, Email, Scheduling
  – MS Office, OpenOffice, LibreOffice

Software Categories

– Qualitative Analysis
  • Enables collection and interpretation of behavioral data – Atlas.ti
– Quantitative/Statistical Analysis
  • Provides tools that allow the relationships between data elements to be expressed in mathematical terms – SAS/SPSS
– Content Analysis
  • Provides search, comparison and analysis tools for large collections of text-based documents – LightSide
– Data Acquisition Software
  • Often paired with hardware devices, captures data from sensors, experimental devices – LabView
– Geographic/Mapping (GIS)
  • Provides geospatial context to data – ArcGIS


Data Analysis Software Categories

– Mathematical
  • Performs advanced, abstract mathematical functions – MATLAB
– Modeling & Simulation
  • Related to visualization, adds time and space factors to data analysis
– Analysis Programming Languages
  • Enable programmers and skilled researchers to write customized analysis functions and scenarios – R, Fortran
– Visualization
  • Transforms data into visually identifiable scales and relationships
– Specialty
  • Performs functions that are unique to a field of study or analysis
– Others…

Selecting and Purchasing Data Analysis Software

Analysis software isn’t just for analysis anymore…

• Does your data analysis software meet your needs for all stages of the data lifecycle?

Selecting and Purchasing Data Analysis Software

• Use the right tool for the right job

• Core and specialty applications are expensive

• Most applications are hybrid – serving multiple purposes

• Look for open source

• Look for free/web-based tools

• UNM Licensed software is available

• Student/Educator pricing

QUALITATIVE ANALYSIS SOFTWARE

Qualitative Software Features

• Codebook management
• Point-and-click coding
• Auto Coding
• Margin notes
• Weighting values
• Content linking
• Multimedia Analysis
• Transcription tools (for A/V data)
• Survey import/management
• Reporting and summarization

Qualitative Analysis Software

• Atlas.ti (http://www.atlasti.com)
• NVIVO (http://www.qsrinternational.com)
• QDA Miner (http://provalisresearch.com)
• Content Analysis
  – LightSide (http://www.cs.cmu.edu/~emayfiel/side.html)


Atlas.ti

• Summary: “The purpose of ATLAS.ti is to help researchers uncover and systematically analyze complex phenomena hidden in text and multimedia data. The program provides tools that let the user locate, code, and annotate findings in primary data material, to weigh and evaluate their importance, and to visualize complex relations between them.”

• Used in: Anthropology, Sociology, Psychology, Business, Marketing, Use Studies

• Product Information: http://www.atlasti.com

• Product Pricing: $99 student rate with ID

• Special Function Add-ons:
  – Geospatial Data Plotting
  – Online Survey Management
  – Data Visualization

• Platform Availability:
  – Windows
  – Mac OSX

Atlas.ti

QUANTITATIVE ANALYSIS SOFTWARE

Quantitative Software Features

• Analysis of variance
• Regression
• Categorical data analysis
• Multivariate analysis
• Survival analysis
• Psychometric analysis
• Cluster analysis
• Nonparametric analysis
• Survey data analysis
• Compare data against common distributions
• Imputation for missing values

Quantitative Analysis Software

• SAS (http://www.sas.com)
• SPSS (http://www-01.ibm.com/software/analytics/spss/)
• STATA (http://www.stata.com/)
• Microsoft Excel (!)
• Many others
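
Two of the features listed above, imputation for missing values and regression, can be illustrated with a minimal Python sketch. This is not how SAS, SPSS or STATA implement these features. The function names `impute_mean` and `linear_regression` and the sample data are hypothetical, chosen only for illustration, and mean imputation is just one simple strategy among many.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def linear_regression(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours of study vs. test score, with one missing score.
hours = [1, 2, 3, 4, 5]
scores = [52, None, 61, 68, 71]

scores_filled = impute_mean(scores)
slope, intercept = linear_regression(hours, scores_filled)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A dedicated statistics package adds what this sketch leaves out, such as standard errors, significance tests and more defensible imputation methods.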


SPSS (UNM Licensed)

• Summary: “With SPSS predictive analytics software, you can predict with confidence what will happen next so that you can make smarter decisions, solve problems and improve outcomes.”

• Used in: Business, Anthropology, Sociology, Psychology, Marketing, Computer Use Studies

• Product Information: http://www-01.ibm.com/software/analytics/spss/

• UNM Product Pricing: $79
  – http://it.unm.edu/software/faculty-staff/windows/index.html
  – Also available in UNM Computer Labs

• Add-On Features
  – Collaboration
  – Excel Interface
  – Data Collection

• Platform Availability:
  – Windows
  – Mac OSX

SAS (UNM Licensed)

• Summary: “From traditional analysis of variance and predictive modeling to exact methods and statistical visualization techniques, SAS/STAT software provides tools for both specialized and enterprise-wide analytical needs.”

• Used in: Business, Economics, Finance, Natural/Physical Sciences

• Product Information: http://www.sas.com

• UNM Product Pricing: $120-170 yearly (http://it.unm.edu/software/faculty-staff/windows/index.html)

• Add-On Features
  – Scripting Language
  – Data Visualization
  – Advanced Analytics Module
  – Mapping/GIS

• Platform Availability:
  – Windows
  – Mac OSX

SAS

DATA VISUALIZATION SOFTWARE


Data Visualization

• Wikipedia Lists 45 applications (http://en.wikipedia.org/wiki/Data_visualization#Data_visualization_software)
• Microsoft Excel (!)
• MATLAB (UNM Licensed)
• Tableau Desktop
• TrendAnalyzer
• VisTrails
• Visual.ly (http://visual.ly)
• Many Specialized Applications
  – Climate Visualization
    • UV-CDAT
    • NCAR

Visualization Feature Sets

• Mapping data sets (down to level of US states and counties).
• Broad range of charts and plots:
  – Scatter, line, area, bubble, multiple axis, overlay.
  – Bar, pie, donut, star.
• Customized colors, line styles, symbols.
• 2-D and 3-D plots with tilting and rotation.
• Generate static or dynamic interactive (Java or ActiveX) charts and graphs with drill-down capabilities.
• Link graphs to Web pages.
• Embed interactive graphics in Web pages or Microsoft documents.
• Support for common types of printers and plotters.

MS Excel

• Summary: Excel provides a unified container for collecting, storing and visualizing any form of data
• Used in: Engineering, Natural/Physical Sciences, Social Sciences
• Product Information: http://office.microsoft.com/en-us/excel
• UNM Licensed: http://it.unm.edu/download/
• Special Function Add-ons:
  – DataUp
  – connectivity
• Platform Availability:
  – Windows
  – Mac OSX

MATLAB (UNM Licensed)

• Summary: “MATLAB is a high-level language and interactive environment for numerical computation, visualization, and programming”
• Used in: Engineering, Mathematics, Natural/Physical Sciences, Statistics
• Product Information: http://www.mathworks.com/products/matlab/
• UNM Download: http://it.unm.edu/download/
  – Also available in UNM Computer Labs
• Special Function Add-ons:
  – Data Acquisition
  – Signal Processing
  – Image Processing
• Platform Availability:
  – Windows
  – Mac OSX
  – Linux
  – Mobile
• MATLAB Video: http://www.mathworks.com/videos/analyzing-and-visualizing-data-with-matlab-70942.html

MATLAB (UNM Licensed)

Features

• Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, numerical integration, and solving ordinary differential equations

• Functions for integrating MATLAB based algorithms with external applications and languages such as C, Java, .NET, and Microsoft Excel

• Development tools for improving code quality and maintainability and maximizing performance

• Tools for building applications with custom graphical interfaces

Features

• Built-in graphics for visualizing data and tools for creating custom plots

• High-level language for numerical computation, visualization, and application development

• Interactive environment for iterative exploration, design, and problem solving


Tableau Desktop

• Summary: “Tableau Desktop is based on breakthrough technology from Stanford University that lets you drag & drop to analyze data. You can connect to data in a few clicks, then visualize and create interactive dashboards with a few more. Shift fluidly between views, following your natural train of thought”

• Used in: Business, Economics, Finance, Social Sciences

• Product Information: http://www.tableausoftware.com/products/desktop

• Free Trial Available

• Platform Availability:
  – Windows
  – Mac OSX
  – Mobile

DATA ANALYSIS PROGRAMMING LANGUAGES

Programming Languages

• C/C++
• FORTRAN (still around)
• Python (open source)
• R (open source)
• S
• Embedded Languages
  – SAS
  – MATLAB
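
As a small illustration of the kind of customized analysis function these languages make possible, here is a minimal Python sketch that computes the mean of one column for each group in a CSV data set. The function name `group_means`, the column names and the sample values are all hypothetical. It uses only the Python standard library and is meant as a starting point, not a recommended workflow.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def group_means(csv_text, group_col, value_col):
    """Return the mean of value_col for each distinct value of group_col."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[group_col]].append(float(row[value_col]))
    return {name: mean(vals) for name, vals in groups.items()}


# Hypothetical sample data: temperature readings from two sites.
SAMPLE = """site,temperature
A,20.5
A,21.5
B,18.0
B,19.0
B,20.0
"""

result = group_means(SAMPLE, "site", "temperature")
print(result)
```

The same function could read from a file by passing the file's text contents in place of `SAMPLE`.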

R

• For statistical data analysis and graphics

• Extremely popular for Quantitative Data Visualization

• Programming Tools are free, open source

• R Project website: http://www.r-project.org/

• R Website includes Tutorials, Manuals, Training

• Available on Windows, Mac, Unix

R Example Code


Poster developed from R
