Human beings rely on summarizing and visualizing data to make informed decisions. The number and size of datasets continue to increase at exponential rates, and new user-facing systems and modalities are needed to handle the scale and heterogeneity of future data. This course surveys the landscape of interactive data exploration systems along several axes.
Staff+OH | Eugene Wu (Instructor) | Weds 4-5PM |
Staff+OH | Thibault Sellam | By Appt |
Meetings | Weds 2-4PM, 503 Hamilton Hall | First 1.5 hours: paper discussion. Last 30 minutes: open discussion. |
Units | 3 | |
Grading | Questions | 10% |
Grading | Participation | 15% |
Grading | Assignments | 15% |
Grading | Project | 60% |
Grading | Presentation | 0-10% extra credit |
Grading | If publishable quality | >10-20% extra credit |
Communication | Piazza | Aside from personal questions, use Piazza instead of email. |
Course Github |
What This Class is NOT
What I expect from You
For assignments, you are allowed 5 penalty-free late days to use throughout the semester. One late day equals one 24-hour period after the assignment's due date. Once you have used your late days, there will be a 20% penalty for each day an assignment is late. You do not need to explicitly declare the use of late days; we will assign them in the way that is optimal for your grade when different assignments are worth different numbers of points. Late days may not be used for the final project.
You will pursue a semester-long research project related to this course. The project is a significant part of the course grade.
You are expected to answer the short questions associated with the readings for every class. The class reviews must be submitted by 9PM the day before class.
Add your answers to the appropriate lecture’s wiki page.
You have the option to present as a group (1-2 people) for one lecture on a topic/paper of your choice (within reason). The paper(s) you select can be from the list given below. You are also free to list a paper of your choice as long as it matches the themes of the class. This list must be submitted by midnight Feb 1.
You will be asked to complete three milestones for the presentation. Their purpose is to ensure high presentation quality; they are also a good opportunity to practice your presentation skills and get feedback:
Submit the teammates and papers to present
Day | Presenter | Papers | Notes/Due |
---|---|---|---|
1/18 | Eugene | Introduction | |
1/25 | Eugene | Specification | Readings + Qs; HW 1 |
2/01 | Eugene | Performance overview, end-to-end systems | Readings + Qs; Submit presentation requests; Turn in project teams in class! |
2/08 | Eugene | Sampling | Readings + Qs; Project Prospectus due; Stream HW 1 released |
2/15 | Eugene | Prefetching/Network | Readings + Qs; Stream HW 1 due; Stream HW 2 released |
2/22 | Gabriel/Daniel | Specialized Systems: Macrobase (and BlinkDB?) | Readings + Qs; Evaluation functions for Stream HW 2 due |
3/01 | Alireza/Luren | Dremel | Readings + Qs; Prediction functions for Stream HW 2 due 3/5 |
3/08 | Eugene | Explanation + Midpoint Review | Readings + Qs; HW 4 is out |
3/15 | | NO CLASS. Spring Break! | |
3/22 | Thibault | Modalities | |
3/29 | Brennan/Drashko | Recommendation + Summarization | |
4/05 | Patrick Shafto (guest lecture) | TBA | |
4/12 | Eugene | Work on projects in class. Thibault+Wu will help. | |
4/19 | Thibault | Web Tables. Wu may be away at ICDE. | Readings + Qs; HW 5 is out |
4/26 | Eugene/Thibault | Cleaning? | |
5/03 | | Poster Presentation | Submit writeups by 5/5 |
Background you should be comfortable with
Classics
Surveys
Declarative Visualization Languages
Interaction Modalities
Recommendation
Autocomplete and refinement
Explanation
Data Cleaning
End-to-end fast data visualization systems
Data Processing Systems
Prefetching
Sampling
Network
Neat applications