Deep statistics: AI and earth observations for sustainable development
Stat 288
Teaching fellow, Graduate course, Harvard University, Department of Statistics
Taught: 2022 Spring, 2023 Spring
Description
Aiming to enhance both the rigor and the efficiency of data science for scientific inquiry, deep statistics emphasizes principled systems thinking throughout the entire data science ecosystem, from the conception of data to their postmortem examination for scientific reproducibility and replicability. This course introduces a trinity of deep statistics of, for, and by multi-source, multi-phase, and multi-resolution statistical learning, and invites research participation on their implications and implementations in the context of AI and Earth Observations (EO) for sustainable development (e.g., global poverty and health). Theoretically, the course contemplates many trade-offs in ‘data science for science’: data quality vs. quantity, data privacy vs. utility, statistical vs. computational efficiency, inferential robustness vs. relevance, etc. Practically, it scrutinizes issues such as conceptualizing and collecting complex socioeconomic data, handling messy survey and satellite data, assessing uncertainties with black-box learning, and contemplating causal implications from AI-EO data. High-level methodological overviews of topics such as survey design, differential privacy, multiple imputation, bootstraps, and deep learning will be provided on an as-needed basis.
New course in 2022.
