Assignment 3: Exploratory Data Analysis


In groups of 3-4, identify a dataset of interest and perform exploratory analysis in Tableau to understand the structure of the data, investigate hypotheses, and develop preliminary insights. Prepare a PDF or Google Slides report using this template outline: include a set of 10 or more visualizations that illustrate your findings, one summary “dashboard” visualization, as well as a write-up of your process and what you learned.

Submit your group’s report URL and, individually, your peer assessments for A3 by Monday 2/17, 11:59pm.

Week 1: Data Selection

First, choose a topic of interest to you and find a dataset that can provide insights into that topic. See below for recommended datasets to help you get started.

After selecting a topic and dataset – but prior to analysis – you should write down an initial set of at least three questions you’d like to investigate. Prepare the data (i.e. do any cleaning you need), and make 1 chart in Tableau.
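If your data needs light cleaning before it goes into Tableau, a short script is one option. The following is only a minimal sketch in Python (standard library only), not a required workflow. The function name `clean_rows`, the set of missing-value placeholders, and the sample data are all assumptions made for illustration; adapt them to whatever your dataset actually contains.

```python
import csv
import io

# Assumed placeholders for "no value"; adjust to match your own dataset.
MISSING_TOKENS = {"", "na", "n/a", "null", "-"}


def clean_rows(rows):
    """Normalize headers, trim whitespace, blank out placeholders, drop duplicate rows."""
    cleaned, seen = [], set()
    for row in rows:
        tidy = {}
        for key, value in row.items():
            # "City " -> "city", "Median Income" -> "median_income"
            name = key.strip().lower().replace(" ", "_")
            text = (value or "").strip()
            tidy[name] = "" if text.lower() in MISSING_TOKENS else text
        fingerprint = tuple(tidy.items())
        if fingerprint not in seen:  # keep only the first copy of each row
            seen.add(fingerprint)
            cleaned.append(tidy)
    return cleaned


# Hypothetical raw data standing in for a real CSV file.
RAW_CSV = """City , Year,Population
 Seattle,2019,753675
Seattle,2019,753675
Portland,2019,N/A
"""

cleaned = clean_rows(list(csv.DictReader(io.StringIO(RAW_CSV))))

# Write the cleaned rows back out as CSV text that Tableau can import.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(cleaned[0].keys()))
writer.writeheader()
writer.writerows(cleaned)
print(out.getvalue())
```

For a real file, you would replace the `io.StringIO` objects with `open(...)` calls on your own paths. The sketch assumes every row has the same number of fields as the header.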

Week 2: Exploratory Visual Analysis

Next, perform an exploratory analysis of your dataset using Tableau. You should consider two different phases of exploration.

In the first phase, you should seek to gain an overview of the shape & structure of your dataset. What variables does the dataset contain? How are they distributed? Are there any notable data quality issues? Are there any surprising relationships among the variables? Be sure to also perform “sanity checks” for patterns you expect to see!
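Some of these first-phase questions can also be checked quickly outside Tableau. As a rough sketch only, the Python snippet below (standard library only) counts missing values, distinct values, and the numeric range of each column. The function name `profile_columns` and the sample data are hypothetical, and the sketch is meant to complement your Tableau exploration, not replace it.

```python
import csv
import io


def profile_columns(rows):
    """Summarize each column of a list of dict rows (as produced by csv.DictReader)."""
    profile = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None and v.strip() != ""]
        numeric = []
        for v in present:
            try:
                numeric.append(float(v))
            except ValueError:
                pass  # non-numeric entry; a possible data quality issue
        profile[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            # Share of non-missing values that parse as numbers.
            "numeric_share": len(numeric) / len(present) if present else 0.0,
            "min": min(numeric) if numeric else None,
            "max": max(numeric) if numeric else None,
        }
    return profile


# Hypothetical sample data standing in for a real CSV file.
SAMPLE_CSV = """city,year,population
Seattle,2019,753675
Seattle,2020,
Portland,2019,654741
Portland,twenty,652503
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = profile_columns(rows)
for column, stats in summary.items():
    print(column, stats)
```

A column whose `numeric_share` is high but below 1.0 (like `year` in the sample) often points to a few malformed entries worth inspecting, and a nonzero `missing` count tells you where gaps may distort your charts.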

In the second phase, you should investigate your initial questions, as well as any new questions that arise during your exploration. For each question, start by creating a visualization that might provide a useful answer. Then refine the visualization (by adding additional variables, changing sorting or axis scales, filtering or subsetting data, etc.) to develop better perspectives, explore unexpected observations, or sanity check your assumptions. You should repeat this process for each of your questions, but feel free to revise your questions or branch off to explore new questions if the data warrants.

Final Deliverable

Your final submission will be a written report, 10 or more captioned “quick and dirty” Tableau visualizations outlining your most important insights, and either a live link or a screen-capture video of one summary Tableau dashboard that answers one (or more) of your chosen hypotheses. The dashboard should have multiple charts to communicate your findings on an ongoing basis, assuming you’ll continue to collect data over time: imagine you are a data analyst preparing a dashboard for the CEO of your company who can look at key metrics (regarding your hypothesis) every month. In your written sections, focus on the answers to your initial questions, but also describe surprises as well as challenges encountered along the way, e.g. data quality issues.

Each visualization image should be accompanied by a title and short caption (<2 sentences). Provide sufficient detail in each caption such that anyone could read through your report and understand your findings. Feel free to annotate your images to draw attention to specific features of the data, keeping in mind the visual principles we’ve learned so far.

To easily export images from Tableau, use the Worksheet > Export > Image… menu item.

Grading Criteria

Data Sources

Tableau Resources

Additional Tools

Your dataset almost certainly will require reformatting, restructuring, or cleaning before visualization. Here are some tools for data preparation: