Data Visualization

Note: Offered Spring 2025 as a first-eight-weeks course.

Materials

Course Description


Data visualization is the art and science of presenting data and statistical results effectively and engagingly. Topics range from exploratory data analysis techniques to methods for presenting complex model results. The course focuses on both the science of data visualization best practices and their software implementation. We will use Stata for the course, though resources for R will also be provided.

Understanding data and effectively presenting model results are challenges that data analysts face almost every day. There is seldom a more effective solution than a well-thought-out visualization: problems in the data are easily identified, complex effects are quickly summarized, and effect sizes and variability are immediately clear.

In this course, we will cover best practices for accurately representing data, as well as many specific approaches to data exploration, model diagnostics, and model presentation. The primary focus is on the applied analyst’s “bread and butter” visualizations: those that will be useful in almost every research project. However, we also cover more advanced visualization methods.

The course is structured in four sections: univariate distributions, bivariate relationships, model results, and advanced topics. Specific topics include transforming distributions; pie, bar, dot, and radar plots; balance plots; plotting change over time; choropleth maps; coefficient plots; plots of predictions; marginal effects plots; interaction effects; nonlinearity; model diagnostics; and miscellaneous advanced topics.