Exploratory Data Analysis

Data Science Institute
Vanderbilt University



View the Project on GitHub dsi-explore/eda-course-website

Week 5: Homework assignment: Choose your own adventure (with data)

DSI-EDA

Professor Cassy Dorff

Part One

  1. Choose a data set from this Google Spreadsheet.

  2. Once you have chosen your data, add a ‘comment’ to the cell in the first column and first row of your data, as shown in the Google spreadsheet. The comment should simply be your name. Once you’ve done this, no one else can use that data set. If a data set already has a comment with someone’s name, you cannot use it.

Part Two

  1. Write 3-5 sentences telling me about this data. What was it used for? Why was it collected? What phenomenon does it capture? Who created this data, and why?

  2. Explore the data. What are the dimensions of the data? What types of variables are there?
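
One way to start on the questions above is sketched below. It is a minimal example, not a required approach: it assumes your data is already loaded as a data frame, and it uses the built-in `mtcars` data set purely as a stand-in, with `dat` and `var_types` as illustrative names.

```r
# Stand-in data (an assumption): `mtcars` ships with base R.
# Replace it with the data frame you chose for the assignment.
dat <- mtcars

# Dimensions: number of rows (observations) and columns (variables).
dim(dat)

# Type of each variable, returned as a named character vector.
var_types <- vapply(dat, function(x) class(x)[1], character(1))
var_types

# Compact overview: type and first few values of every column.
str(dat)
```

`dim()` answers the question about dimensions, while `str()` and the `var_types` vector give two views of the variable types.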

Part Three

  1. Center your analysis on an exploratory (or “motivating”) question. Tell me what it is.

  2. Explore your question using visualizations. In the end, choose two ‘final’ plots to ‘print’, and interpret each visualization in 2-5 sentences.
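
As an illustration of exploring a question with a plot, here is a minimal base R sketch. The data set (`mtcars`), the motivating question (how does fuel economy relate to car weight?), and the file name are all assumptions made for the example; your own question and data will differ.

```r
# Stand-in data and question (both assumptions): in `mtcars`,
# how does fuel economy relate to car weight?
dat <- mtcars

# Write the plot to a PDF file in a temporary directory.
plot_file <- file.path(tempdir(), "mpg_vs_weight.pdf")
pdf(plot_file, width = 7, height = 5)

plot(dat$wt, dat$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy versus weight")

# Add a least-squares line so the overall trend is easier to read.
abline(lm(mpg ~ wt, data = dat), col = "red")

# Close the graphics device so the file is fully written.
dev.off()
```

If you write your report in R Markdown, the `pdf()` and `dev.off()` calls are usually unnecessary: a plot drawn inside a code chunk appears in the knitted document.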

Part Four

  1. Answer the following questions:

Part Five

  1. Save your ‘write up’ report in your homework repo. Knit it to a PDF file for easy online viewing.
  2. Your write up should include all of the answers to the above questions. You can EITHER do your analysis and write up in one document, OR save an R script for separate analysis, such as data cleaning, and reference it in your write up. All of your work should be shown in your repository for this assignment.
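
If you take the separate-script route, one possible way to reference the script from your write up is `source()`. The sketch below is only an illustration: the script name `clean_data.R`, the object names `raw` and `clean`, and the use of the built-in `airquality` data are hypothetical, and the script is written to a temporary file here only so the example runs on its own.

```r
# Hypothetical cleaning script. In a real repo this would be a file
# you save yourself; it is created here so the sketch is self-contained.
script_path <- file.path(tempdir(), "clean_data.R")
writeLines(c(
  "raw <- airquality           # stand-in data with missing values",
  "clean <- na.omit(raw)       # drop rows that contain any NA"
), script_path)

# In the write up, one line runs the script and makes `clean` available.
source(script_path)

# Compare row counts before and after cleaning.
nrow(raw)
nrow(clean)
```

Keeping the cleaning steps in their own script and committing it alongside the write up means all of the work remains visible in the repository.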