Advanced Environmental Data Analysis
Graduate course | School of Earth and Atmospheric Sciences, Georgia Tech | Spring 2023
Course Philosophy and Goals
This course is an advanced introduction to environmental data analysis. It is intended for first-year graduate students and senior undergraduates. The goal of this class is to provide a deeper understanding of the theory underlying the statistical analysis of environmental data in the space, time, and spectral domains, and to give students hands-on experience. Ideally, by the end of this class you will have developed a set of programming toolboxes and theoretical skills that you can apply immediately to analyzing and modeling data in your own research.
Although some previous knowledge of probability and statistics is required, a background review will be provided, and concepts and notation will be reintroduced as needed. In this class you will learn (A) how to combine models, which quantify statistical or dynamical relationships, with observations, (B) time series analysis, (C) forecasting and extrapolation, and (D) signal decomposition. A more detailed description of these topics is given under Lecture Topics below.
Lecture Topics
- Background Review: Matrix and Vector Algebra, Fundamental Statistical Measures, Multivariable Probability Densities, Sample Estimates, Correlation and Covariance, Functions and Sums of Random Variables, Central Limit Theorem.
- Combining Models and Observations: Interpolation and Function Fitting, Least Squares Modeling and Singular Vector Expansion, Uncertainties in Estimates, Inverse Methods, Statistical vs. Dynamical Constraints.
- Time Series Analysis: Time and Frequency Domain Models, Stationarity, Auto-Regression Models, Spectral Analysis and Coherence, Trend Analysis and Significance, Estimating Errors in Time Series Reconstruction.
- Forecasting and Extrapolation: Statistically Optimal Linear Estimators, Regression Models, Space and Time Models, Objective Mapping (Multivariate Regression), Covariance Modeling.
- Decomposing Signals: Multivariate Eigenfunction Analysis, EOFs, PCA, CCA, and Wavelet Analysis.
Lecture Content
Background Review
Topic 1: Why is statistical analysis useful? [link]
Topic 2: An overview of the statistical methods. How does it all fit together? [link]
Topic 3: Fundamental Statistical Measures. Univariate Statistics and PDFs [link]
Topic 4: Fundamental Statistical Measures. Multivariate Statistics and JPDFs [link]
Topic 5: Statistically Optimal Linear Estimators: relationship between least squares and conditional joint PDF estimates [link]
Combining Models and Observations
Review of linear algebra [link]
Topic 6: Testing a model against observations: Introduction to Least Squares (LSQ) [link]
Topic 7: Interpolation and function fitting with LSQ: The CO2 curve and SST spatial maps [link]
Topic 8: LSQ and Inverse Modeling: Reconstructing the source of a pollutant with an advection diffusion model [link]
Topic 9: Lagrange Multipliers and Adjoints [link]
Forecasting and Extrapolation
Topic 10: Covariance Modeling: Basic Theory [link]
Signal Decomposition
Topic 11: Empirical Orthogonal Functions (EOFs) / Principal Component Analysis (PCA) [link]
Topic 12: Covariance Model Space Example [Coding exercise]
Topic 13: Singular Value Decomposition (SVD) [Coding exercise]
Topic 14: Objective Mapping [link]