ESPM 288: Reproducible and Collaborative Data Science


Instructor Carl Boettiger
GSI Dana Paige Seidel
Location VLSB 2066
Times Tu 8:10A-9:00A, Th 4:00P-6:00P
CCN 39303

Overview

This course is designed for graduate students regardless of prior experience. We aim to be accessible to those new to programming, but those who have been using R for years will also find new material in this opinionated and fast-moving area. The course is project-focused and centered around five modules reproducing important results in global change ecology. A series of short tutorials will introduce relevant technology, but most concepts will first be introduced in reading outside of class, leaving class time to focus on the more complex examples encountered in the modules.

Approach

This course will use a flipped classroom model: new material is introduced in reading assignments prior to class, while class time focuses on applying these skills to explore interesting data sets. We will move through four modules, each introducing a new data set and new scientific questions, while also introducing a new skill area and building on previous skills. Students will be expected to work collaboratively in and out of class, and course content and grading will emphasize communication and reproducibility of an analysis as much as scientific or technical completeness. The Course Syllabus provides an overview of the modules and topics covered as well as links to weekly reading, assignments, and any lecture material. This syllabus is preliminary and always subject to change.

Texts

We will use Grolemund and Wickham's R for Data Science as the primary text for this course. A hard copy of the book is not required; the openly licensed full text can be found on the authors' website for the book. Additional reading material will be linked from the syllabus. Please be sure to review the relevant reading prior to each class session.