Exploratory Data Analysis

Data Science Institute
Vanderbilt University


Week 9 Homework Assignment

Due Wednesday, October 23rd by 5PM.

Summary

In Week 5 you were asked to “Choose your own adventure with data.” You can easily revisit this assignment on our course webpage.

This week, we would like you to return to the data you chose during Week 5 and write a report that explores it further. (Note: you may switch to different data if you have already learned that your first choice was poor. If you do, explain the change in your report and mark the Google Doc accordingly.)

Tasks

For this week’s report you should:

  1. Choose a motivating question that will guide the analysis you present in the report. This can be the same question from your Week 5 homework, or something different.
  2. You can import ‘clean’ data from your previous work, but if you do any new data cleaning or wrangling, summarize it in your report.
  3. Following the in-class example, apply cluster analysis to explore your data and interpret your results to the best of your ability.
  4. If cluster analysis does not make sense for your data, explain WHY it does not make sense.
    • If cluster analysis is not relevant, tell us what the best tool in your current toolkit is for exploring the data. Are alluvial plots useful? Time series graphics? Boxplots? Justify your choice.
    • If you are unable to conduct cluster analysis on your own data, show us a demo analysis using demo data such as the Iris dataset (hint: use the data() function to see what demo data is available).
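
As a rough illustration of the demo analysis mentioned in the last bullet, the sketch below clusters the built-in iris data in base R. It is not the in-class example: the choice of k-means and hierarchical clustering, the use of three clusters, the standardization step, and the seed value are all assumptions made for illustration, and your own data may call for different choices.

```r
# Calling data() with no arguments lists the available demo datasets
# (it opens a listing, so it is left commented out here).
# data()

data(iris)                       # load the built-in iris data
features <- scale(iris[, 1:4])   # standardize the four numeric columns

set.seed(42)                     # k-means starts from random centers (seed is arbitrary)
fit <- kmeans(features, centers = 3, nstart = 25)   # k = 3 is an assumption

# Compare k-means cluster membership with the known species labels
print(table(cluster = fit$cluster, species = iris$Species))

# Hierarchical clustering as a second view of the same data
hc <- hclust(dist(features), method = "complete")
hc_groups <- cutree(hc, k = 3)   # cut the tree into three groups
print(table(cluster = hc_groups, species = iris$Species))
```

The two tables show how well each set of clusters lines up with species, which is one way to start interpreting the result. In a report you would also want to justify the number of clusters rather than take it as given.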