Spring 2026

Logistics ℹ️
👤 Instructor Carl Boettiger
🏛 Location Room 166 - Social Sciences Building
📅 Days Monday / Wednesday
⏰ Time 10:00 - 11:30

Overview

Welcome to ESPM-288: Reproducible & Collaborative Data Science.

This Spring 2026 semester, we explore the Agent-First Methodology for environmental data science. We will leverage the R ecosystem (including duckdbfs, tidyverse, mapgl, gdalcubes, ellmer, vitals, mcptools, ragnar, and shiny) to architect, audit, and deploy robust data solutions for complex environmental problems. We will focus on high-performance pipelines for larger-than-RAM data (e.g. Parquet/GeoParquet, COGs), interactive visualizations (Mapbox/MapLibre), high-throughput analysis and app deployments (e.g. Docker, Kubernetes, GitHub Actions/GitHub Package Registry).

We will explore essential and emerging components of the LLM coding ecosystem, including open/local models, retrieval-augmented generation (RAG), structured data, tool use, and Model Context Protocol (MCP) servers and clients. In the process, we will examine and compare the energy use and other environmental footprints of various LLMs, as well as other considerations for responsible AI, including security, privacy, reliability, and bias.

Course Materials


Previous Years

  • Spring 2025 Archive - Introduction to Python, LLMs in data science, git/GitHub, relational databases, and geospatial data