Project Phase 1 Instructions Data Acquisition
Overview
The purpose of this phase of the project is to acquire the data you will later analyze with Excel and Access. You may choose a dataset about any topic that interests you: sports, demographics, economics, weather, etc. Any dataset is acceptable as long as it meets the requirements below. For your convenience, links to datasets that will work well for this project are provided on the last pages of this document.
Detailed Instructions
Dataset Requirements:
1. Your dataset must contain at least 500 cells of data.
2. You may choose to download your dataset from the web or obtain your dataset from a business or other organization you are affiliated with. You may NOT generate a dataset from scratch. You may combine datasets or add data to an acquired dataset. You are not encouraged to pay for data or to join any organization in order to gain access to their data. For your convenience, links to datasets that will work well for this project are provided on the last pages of this document.
3. The data contained in your dataset does not have to be 100% accurate, but you should try to use data that appears realistic.
4. Remember, you will be analyzing this dataset with Excel and Access in later phases, so you should choose data that you can manipulate with mathematical (e.g., min, max), statistical (e.g., sum, average) and/or logical (e.g., if, lookup) functions. You’ll also need to have some way of grouping data into subsets; text or code fields are best for grouping. The best dataset would have both text and numeric fields. If you need help determining whether your desired dataset meets this requirement, please see your instructor during office hours.
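As a rough way to check requirement 1 and the text-plus-numeric advice in requirement 4 before committing to a dataset, the following Python sketch counts the non-empty data cells in comma-delimited text and reports which columns look numeric. It is only an illustration and not part of the assignment: the function names and the sample data are hypothetical, and it assumes the first row holds column headings that are not counted as data.

```python
import csv
import io


def is_number(text):
    """Return True if the cell text can be read as a number."""
    try:
        float(text.replace(",", ""))
        return True
    except ValueError:
        return False


def summarize(csv_text):
    """Count non-empty data cells and classify each column as numeric or text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # Count every non-blank cell below the heading row.
    cell_count = sum(1 for row in data for cell in row if cell.strip())
    kinds = {}
    for i, name in enumerate(header):
        values = [row[i] for row in data if i < len(row) and row[i].strip()]
        all_numeric = bool(values) and all(is_number(v) for v in values)
        kinds[name] = "numeric" if all_numeric else "text"
    return cell_count, kinds


# Hypothetical sample; a real check would use the text of the downloaded file.
sample = "team,year,wins\nHokies,2014,7\nCavaliers,2014,5\n"
cells, kinds = summarize(sample)
print(cells, "data cells; meets the 500-cell minimum:", cells >= 500)
print(kinds)
```

A dataset that passes would report at least 500 cells and show at least one text column for grouping and one numeric column for calculations.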
ACIS 1504 – Project Phase 1
Acquisition of Data:
5. If you are looking for data on the web, you should try to find files with one of the following file types: .csv (comma delimited), .txt (text), .xls (Excel) or .xlsx (Excel). Potential sources of datasets are provided on the last pages of these instructions.
6. Save the initial dataset because you will be required to submit it with your spreadsheet file. If the dataset you download is an Excel file, be sure to save it separately from the Excel file you create below. If you copy data directly from the web to Excel rather than downloading a file, you can save a screenshot of the website as your “initial dataset”. No matter how you acquire your data, you need to create some kind of file that displays the data as you acquired it.
7. After you have acquired your dataset, you should open (or import) your data into Excel and save it as a file separate from the original dataset. Your Excel file should be named after your VT email address. For example, the file for hokie95@vt.edu would be named hokie95.xlsx, and if hokie95 had a partner whose email is jsmith@vt.edu, the project file would be named hokie95_jsmith.xlsx.
8. Prepare the data for analysis. You will follow instructions for Phase 2 to complete the analysis at a later time. You may find that your dataset is ready for the analysis phase, but often, a little housekeeping must be done first. For example:
You may need to familiarize yourself with each column contained in the dataset. Sometimes the dataset you downloaded requires you to download a separate file containing a key to abbreviations used in the dataset. If your dataset does not contain column headings that make sense to you, this might be a good time to create new column headings.
You may need to clean the data a bit. If some of the rows or columns appear to contain invalid data, or extra data that you do not want to use in your analysis, you might delete those rows or columns.
Some housekeeping chores require the use of Excel functions. Such tasks should be delayed until Phase 2 because we will not practice the needed functions until after Phase 1 is submitted. These chores include:
o Text functions to separate one column of data into multiple columns.
o Logical functions to add text values to your data if they are lacking. For example, you might use an IF function to label each row High or Low based on that row’s numeric value in one column.
o If you use a function to create new data items, you should use Copy and Paste Values to replace these functions with the data itself. Do NOT Paste Values for functions in Phase 2.
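The IF-style labeling mentioned above can be illustrated outside Excel. The Python sketch below labels each row High or Low by comparing one numeric column to a cutoff. The function name, the column names, the cutoff of 50, and the sample rows are all hypothetical; in the project itself this step would be done with an Excel IF function in Phase 2.

```python
def label_rows(rows, column, cutoff):
    """Return copies of the rows with a 'Level' field added.

    'Level' is 'High' when the value in `column` is at or above `cutoff`,
    otherwise 'Low'. The original rows are left unchanged.
    """
    labeled = []
    for row in rows:
        level = "High" if row[column] >= cutoff else "Low"
        labeled.append({**row, "Level": level})
    return labeled


# Hypothetical sample rows for illustration only.
sample_rows = [
    {"player": "A", "points": 72},
    {"player": "B", "points": 31},
    {"player": "C", "points": 50},
]
labeled_rows = label_rows(sample_rows, "points", 50)
for row in labeled_rows:
    print(row["player"], row["points"], row["Level"])
```

The new Level field is a text value, which makes it usable for grouping rows into subsets.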
Formatting:
9. Sort the dataset in a logical order. Even if you don’t want to change the original order, you should enter the sort criteria in Excel. Include a primary sort as well as a secondary sort. Since a secondary sort is required, you should not use a field containing unique values as your primary sort. For example, if you sorted a list of students by Student ID number, every value would be unique, so the list would already be fully ordered and any secondary sort would be useless. On the other hand, if you sorted by last name, all students with the same last name would need a secondary sort to put them in order. Using a primary and secondary sort does not mean sorting your data twice. You must apply the primary and secondary sort keys at the same time so they are both used when needed.
10. Add any necessary headings, sheet names, etc. to clarify your dataset.
11. Add formatting to clearly identify the heading, data and calculation zones of your dataset. You won’t have any calculations incorporated until Phase 2, but you can format the cells in which the calculations may eventually reside.
12. Add conditional formatting to highlight some portion of the dataset.
13. Do not print your dataset, but set it up for printing by including print titles to keep the row and column headings visible on all printed pages.
14. Set freeze panes so that column and row headings remain on screen at all times.
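The primary and secondary sort described in step 9 can be sketched in Python as a single sort on a two-part key, which mirrors applying both sort levels at the same time in Excel rather than sorting twice. The student records, field names, and function name below are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical student records; two students share the last name "Smith".
students = [
    {"last": "Smith", "first": "Jordan", "id": 1042},
    {"last": "Adams", "first": "Riley", "id": 1007},
    {"last": "Smith", "first": "Alex", "id": 1001},
]


def sort_students(records):
    """Sort once using last name as the primary key and first name as the secondary key."""
    # The tuple key means the first name is only consulted when last names tie.
    return sorted(records, key=lambda r: (r["last"], r["first"]))


sorted_students = sort_students(students)
for s in sorted_students:
    print(s["last"], s["first"], s["id"])
```

Sorting by the unique id field instead would leave nothing for a secondary key to do, which is why step 9 advises against a unique field as the primary sort.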
Upload Files
Follow the steps below to submit your original data file and your Excel spreadsheet file.
A. In your Internet browser, go to scholar.vt.edu. Please note that Scholar’s dropbox does not work consistently in Safari or Chrome. Firefox and Internet Explorer are compatible browsers.
B. At the top right of the web page, log in using your VT username and password.
C. Select the 1504Proj1 tab in the maroon bar near the top of the web page. If you do not see the proper tab, select the My Workspace tab then Membership. Select S1 Project and continue with the next instruction. Note: the S1 Project tab will not appear in Scholar until the second week of the semester.
D. Select Dropbox from the small menu at the left of the web page.
E. Select the drop-down arrow in the box labeled “Add” near the center of the web page.
F. Select Upload Files from the pop-up menu.
G. Click the Browse button.
H. Locate your solution file. Be sure to check the last modified date to ensure you are uploading the intended version of your solution file. If the last modified date is not displayed, select View, then Details.
I. Double-click the solution file to select it.
J. Do NOT send an email notification to your instructor.
K. Click the Upload Files Now button.
L. Log out of Scholar using the link at the top right of the web page.
POTENTIAL SOURCES OF DATASETS
You are not required to choose from one of these sites. You may find that some links on these sites are to data in the wrong format or too small for use in this project. I thought these sources looked promising and might serve as a starting point for those students who have no idea what dataset they might want to work with this semester.
As you select data from these or other sites, make sure the dataset has numbers you can use in calculations (e.g., something you can total or average) as well as data you can group by (e.g., gender, team, role, year).
Sports
Baseball http://www.seanlahman.com/baseball-archive/statistics/ Select the comma-delimited version. When you unzip the downloaded file you’ll have several files to choose from. View the ReadMe file to learn what the column heading abbreviations stand for. All of the following files should have enough data to be useful: AllStarFull, Pitching, Batting, Salaries, Fielding, SeriesPost, HallOfFame, Teams.
Basketball http://www.databasebasketball.com/stats_download.htm When you unzip the downloaded file select Player_AllStar or Team_Season.
Football http://www.pro-football-reference.com/years/ Select a League. Select any Totals category. Select the Export link next to any table of data that interests you.
Or look for sites that offer data about your favorite sport.
Wide Variety
U.S. Census Bureau http://factfinder.census.gov/faces/nav/jsf/pages/guided_search.xhtml Follow the steps in the Fact Finder to build a custom dataset from census data. Step 6 offers you the option to download the data you’ve selected.
U.S. Federal Datasets http://catalog.data.gov/dataset This site contains loads of interesting datasets. Find a topic that interests you and select the CSV or EXCEL links.
USA.Gov https://www.usa.gov/statistics#item-37157 Scroll down to the Federal Government Data and Statistics section. Select an agency that collects data you might be interested in and search their site.
Other
California Department of Education www.cde.ca.gov/ds/sd/sd/ Scroll down past the first section to find lots of datasets about different aspects of education.
National Science Foundation www.nsf.gov/statistics/tables-by-survey.cfm Select a survey that interests you. Choose the DATA tab then a DATA TABLE. Choose the Excel format of the data of interest.
Pew Research Internet Project www.pewinternet.org/datasets/pages/3/ The dated entries on this page are actually links to datasets. You need to identify who you are and that you plan to use the data for a class project before you can access the data files.
U.S. Department of Education www.ed.gov/developer Select one of the CSV links.
Weather www.ncdc.noaa.gov/cdo-web/datasets This site is set up like a typical retail site. You pretend you are shopping for data, but you don’t actually have to pay for it. On this screen, you can browse the data products and view small samples of what your finished file will look like. When you find a sample you like, you can use the SEARCH or MAP TOOL to identify specific geographic locations and other parameters. When you check out (again, no payment is required), you give your email address and the dataset is sent to you. Be sure to select the CUSTOM…CSV option when presented with the choice of file format.
If you locate another good source of data, be sure to let me know so I can add it to the list.