The Research Experience for Teachers Program http://www.cs.appstate.edu/ret
Subject Area(s): Mathematics, Computer Science
Computer Science Tools: Octave
Activity Title: Choosing a College: Using Octave to Analyze and Compare Big Data
Grade Level: 9th-12th
Time Required: 90 minutes
Recommended Group Size: 1
Summary: Students will explore how to analyze a large data set without viewing the individual values. Using index expressions, iteration, Boolean expressions, and statistical functions, they will analyze a matrix of more than 1000 colleges to determine which colleges rank highest for them based on their own preference criteria.
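As a rough illustration of the techniques named in the summary, the sketch below applies an index expression, two statistical functions, a Boolean mask, and a loop to a small made-up matrix. The four rows and the two categories (tuition and graduation rate) are assumptions chosen for illustration; they are not taken from the activity's data files.

```octave
% Toy stand-in for the college matrix (values are invented).
% Rows are colleges; columns are hypothetical categories:
% column 1 = tuition in dollars, column 2 = graduation rate in percent.
data = [12000 55;
        30000 82;
         8000 85;
        45000 91];

tuition   = data(:, 1);        % index expression: all rows, column 1
grad_rate = data(:, 2);        % index expression: all rows, column 2

avg_tuition = mean(tuition);   % statistical functions
med_grad    = median(grad_rate);

% Boolean mask: at or below average tuition AND above the median graduation rate
keep    = (tuition <= avg_tuition) & (grad_rate > med_grad);
matches = find(keep);          % row numbers of the colleges that qualify

% Iteration: report each match by row number only, never by name
for k = 1:numel(matches)
  r = matches(k);
  printf("Row %d: tuition %d, graduation rate %d%%\n", r, tuition(r), grad_rate(r));
end
```

Because only row numbers are printed, a student can compare colleges on the numbers alone and look up the names afterward.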
Keywords: data, matrices, Octave, statistics, weighted values
Pre-Requisite Knowledge: Basic knowledge of matrix structure, arithmetic, and descriptive statistics
Learning Objectives: Students will be able to load large data files into Octave, navigate large matrices, and use Octave commands to perform calculations and solve data analysis problems.
Materials List: laptop, Octave software, data files, and student lab instruction sheet.
Introduction/Motivation: Choosing a college can be a life-changing decision. Most often this decision is based on name recognition or on pressure from parents and peers. How do you know if it’s the right college for you? What if you could analyze the data from more than 1000 colleges without the bias of seeing their names? You may have only considered choosing among half a dozen colleges you are already familiar with. Many more options are available if you know how to manage the data.
Preparation:
Make sure Octave is available for students. Octave is free, open-source software that can be downloaded from https://www.gnu.org/software/octave/
Ensure that students can access and load the data files. The files are easier to load if they are in Octave's current working directory. For more information: http://wiki.octave.org/Cookbook#Load_comma_separated_values_.28.2A.csv.29_files
Download the three data files (collegedata.txt, collegenames.txt, and labels.txt) from the associated zip file on the Appalachian State RET Repository.
Show students where to find Octave and the data files on their computers. Loading is easiest if the data files are saved in the Octave bin folder; otherwise, students will need to change their working directory before they start.
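The working-directory step can be sketched as follows. This is a minimal example under two assumptions: that collegedata.txt contains only numbers, with one row per college, and that the variable data_dir (a name introduced here) is edited to point at the folder holding the files. The student lab sheet may use different commands.

```octave
% Assumption: data_dir is a placeholder; replace pwd() with the folder
% that actually holds the three data files.
data_dir = pwd();
cd(data_dir);                       % change the current working directory

data_file = fullfile(data_dir, "collegedata.txt");

if exist(data_file, "file")
  % Assumption: the file is purely numeric, one row per college
  data = load(data_file);
  printf("Loaded %d colleges with %d columns\n", rows(data), columns(data));
else
  printf("collegedata.txt was not found in %s\n", data_dir);
end
```

Checking the size of the matrix right after loading confirms that the file was read, without displaying any of its values.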
Extension: Students can complete a much more extensive college search by using more categories of data and/or choosing which categories are most important to them. Many more variables can be accessed at http://nces.ed.gov/ipeds/datacenter/Default.aspx where students can create custom data files and download them as CSV files to open in a spreadsheet. The data will need to be cleaned up before it is copied and pasted into a plain text file. Cleaning includes deleting any rows with empty cells, formatting columns to a specific number of decimal places, and then copying the numerical data into one text file. The row names and column labels should be copied and pasted into separate txt files.
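The cleanup described above is done in a spreadsheet, but the same steps can also be sketched in Octave. The example below is illustrative only: the matrix raw, the use of NaN to mark an empty cell, the two categories, and the weights are all assumptions, and the weighted score is one simple way of expressing which categories matter most, not the method prescribed by the activity.

```octave
% Hypothetical raw export (values invented); NaN marks an empty cell.
% Column 1 = tuition in dollars, column 2 = graduation rate in percent.
raw = [12000.456 55.12;
       NaN       82.90;
        8000.111 85.47;
       45000.999 NaN;
       30000.27  91.04];

% Delete every row that contains an empty (NaN) cell
clean = raw(~any(isnan(raw), 2), :);

% Format every column to one decimal place
clean = round(clean * 10) / 10;

% Rescale each column to the range 0..1 so different units are comparable
% (assumes every column has at least two distinct values)
lo     = min(clean);
hi     = max(clean);
scaled = (clean - lo) ./ (hi - lo);

% Lower tuition is better, so flip that column
scaled(:, 1) = 1 - scaled(:, 1);

% Assumed preference weights: 40% tuition, 60% graduation rate
weights = [0.4 0.6];
score   = scaled * weights';

% Rank the remaining rows from best to worst score
[~, order] = sort(score, "descend");
disp(order');
```

Changing the entries of weights lets each student express a personal preference and see how the ranking shifts.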
Further Extension: Students could find large data sets that interest them and use the concepts learned to analyze another type of data. A good data resource is http://www.gapminder.org/data/
Resources:
National Center for Education Statistics (NCES) Home Page, a part of the U.S. Department of Education. (n.d.). Retrieved April 01, 2016, from http://nces.ed.gov/