Assignment 3: Exploratory Data Analysis


In groups of 3-4*, identify a dataset of interest and perform exploratory analysis in Tableau to understand the structure of the data, investigate hypotheses, and develop preliminary insights. Prepare a PDF or Google Slides report using this template outline: include a set of 10 or more visualizations that illustrate your findings, one summary “dashboard” visualization, as well as a write-up of your process and what you learned.

* You may group with whomever you wish. An outdated version of this page prior to 9/25 required “new” groups; it was incorrect.

Submit your group’s report url and individually, your peer assessments for A3 by Monday 9/30, 11:59pm.

Week 1: Data Selection

First, choose a topic of interest to you and find a dataset that can provide insights into that topic. See below for recommended datasets to help you get started.

If working with a self-selected dataset, please check with the course staff to ensure it is appropriate for this assignment. Be advised that data collection and preparation (also known as data wrangling) can be a very tedious and time-consuming process. Be sure you have sufficient time to conduct exploratory analysis, after preparing the data.

After selecting a topic and dataset – but prior to analysis – you should write down an initial set of at least three questions you’d like to investigate.

Week 2: Exploratory Visual Analysis

Next, perform an exploratory analysis of your dataset using Tableau. You should consider two different phases of exploration.

In the first phase, you should seek to gain an overview of the shape & stucture of your dataset. What variables does the dataset contain? How are they distributed? Are there any notable data quality issues? Are there any surprising relationships among the variables? Be sure to also perform “sanity checks” for patterns you expect to see!

In the second phase, you should investigate your initial questions, as well as any new questions that arise during your exploration. For each question, start by creating a visualization that might provide a useful answer. Then refine the visualization (by adding additional variables, changing sorting or axis scales, filtering or subsetting data, etc.) to develop better perspectives, explore unexpected observations, or sanity check your assumptions. You should repeat this process for each of your questions, but feel free to revise your questions or branch off to explore new questions if the data warrants.

Final Deliverable

Your final submission will be a written report, 10 or more captioned “quick and dirty” Tableau visualizations outlining your most important insights, and one summary Tableau “dashboard” visualization that answers one (or more) of your chosen hypotheses. The “dashboard” may have multiple charts to communicate your main findings, and may be submitted as a static image or short animation without sound (e.g. to show mouseover behaviors.) In your written sections, focus on the answers to your initial questions, but be sure to also describe surprises as well as challenges encountered along the way, e.g. data quality issues.

Each visualization image should accompanied with a title and short caption (<2 sentences). Provide sufficient detail for each caption such that anyone could read through your report and understand your findings. Feel free to annotate your images to draw attention to specific features of the data, keeping in mind the visual principles we’re learned so far.

To easily export images from Tableau, use the Worksheet > Export > Image… menu item.

Grading Criteria

Data Sources

Tableau Resources

Additional Tools

Your dataset almost certainly will require reformatting, restructuring, or cleaning before visualization. Here are some tools for data preparation: