
Spring 2026
| Logistics | ℹ️ |
|---|---|
| Instructor | Carl Boettiger |
| Location | Room 166 - Social Sciences Building |
| Days | Monday / Wednesday |
| Time | 10:00 - 11:30 |
Overview
Welcome to ESPM-288: Reproducible & Collaborative Data Science.
This Spring 2026 semester, we explore the Agent-First Methodology for environmental data science. We will leverage the R ecosystem (including duckdbfs, tidyverse, mapgl, gdalcubes, ellmer, vitals, mcptools, ragnar, and shiny) to architect, audit, and deploy robust data solutions for complex environmental problems. We will focus on high-performance pipelines for larger-than-RAM data (e.g. parquet/geoparquet, COGs), interactive visualizations (Mapbox/MapLibre), high-throughput analysis, and app deployment (e.g. Docker, Kubernetes, GitHub Actions/GH Package Registry).
We will explore essential and emerging components of the LLM coding ecosystem, including open/local models, retrieval-augmented generation (RAG), structured data, tool use, and Model Context Protocol (MCP) servers and clients. In the process, we will examine and compare the energy use and other environmental footprints of various LLMs, as well as other considerations for responsible AI, including security, privacy, reliability, and bias.
Course Materials
- Syllabus
- Compute Setup
- Schedule (coming soon)
Previous Years
- Spring 2025 Archive - Introduction to Python, LLMs in data science, git/GitHub, relational databases, and geospatial data