Functional Data Analysis
When
- September 4, 11, 18, 25 — Thursdays at 10:30 · m1p.org/go_zoom
- October (most likely) — Saturdays at 10:30
 
Foundation models for spatial-time series
Foundation AI models are universal models applicable to a wide range of problems. This project investigates their theoretical properties on spatial time series—data used across the sciences to generalize knowledge and make forecasts. Core user-level tasks: forecasting and generation of time series; analysis and classification; change-point detection; causal inference. These models are trained on massive datasets. Our goal is to compare architectures and find an optimal one that solves the tasks above for a broad range of spatial time series.
Functional data analysis
We assume continuous time and study state-space changes $\frac{d\mathbf{x}}{dt}$ via neural ODEs/SDEs. We analyze multivariate and multidimensional series with tensor representations and model strong cross-correlations in Riemannian spaces. Many medical series are periodic; the base model is the pendulum $\frac{d^2 x}{dt^2} = -c\sin x$. We use physics-informed neural networks (PINNs). Practical experiments involve multiple sources; we use canonical correlation analysis with a latent state space to align the source and target manifolds and enable generation in both.
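The pendulum base model above has the first-order state-space form $\frac{dx}{dt} = v$, $\frac{dv}{dt} = -c\sin x$, which is the form neural ODEs operate on. A minimal NumPy sketch of generating reference trajectories with a classical Runge–Kutta integrator (function names and parameters are illustrative, not part of any course codebase):

```python
import numpy as np

def pendulum_rhs(state, c=1.0):
    """State-space form of d^2x/dt^2 = -c * sin(x): state = (x, v)."""
    x, v = state
    return np.array([v, -c * np.sin(x)])

def rk4_step(f, state, dt):
    """One classical Runge-Kutta (RK4) step for d(state)/dt = f(state)."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(x0, v0, c=1.0, dt=0.01, steps=1000):
    """Integrate the pendulum; returns an array of (x, v) snapshots."""
    traj = np.empty((steps + 1, 2))
    traj[0] = (x0, v0)
    for i in range(steps):
        traj[i + 1] = rk4_step(lambda s: pendulum_rhs(s, c), traj[i], dt)
    return traj

traj = simulate(x0=0.5, v0=0.0)
```

In a neural-ODE setting, `pendulum_rhs` would be replaced by a trained network while the same integrator carries the state forward; a PINN would additionally penalize violations of the pendulum equation during training.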
Applications
Any field with continuous time/space data from multimodal sources: climate, neural interfaces, solid-state physics, electronics, fluid dynamics, and more. We collect both theory and practice.
Fall 2025: Foundation models for time series
Topics to discuss
- State Space Models, Convolution, SSA, SSM (Spectral Submanifolds)
- Neural & Controlled ODE, Neural PDE, Geometric Learning
- Operator Learning, Physics-informed learning, multimodeling
- Spatio-Temporal Graph Modeling: graph convolution & metric tensors
- Riemannian models; time series generation
- AI for science: mathematical modeling principles
 
Outside the course: data-driven tensor analysis, differential forms, spinors.
State of the Art in 2025
In December 2024, the NeurIPS workshop “Foundation Models for Science” reflected this theme:
- Foundation Models for Science: Progress, Opportunities, and Challenges — URL
- Foundation Models for the Earth system — URL, no paper
- Foundation Methods for foundation models for scientific machine learning — URL, no paper
- AI-Augmented Climate simulators and emulators — URL, no paper
- Provable in-context learning of linear systems and linear elliptic PDEs with transformers — NeurIPS
- VSMNO: Solving PDE by Utilizing Spectral Patterns of Different Neural Operators — NeurIPS PDF
 
March 2025: Physics problem simulations
- The Well: a Large-Scale Collection of Diverse Physics Simulations for ML — arXiv · Code
- Polymathic: Advancing Science through Multi-Disciplinary AI — blog
- Long Term Memory: The Foundation of AI Self-Evolution — arXiv
- Distilling Free-Form Natural Laws from Experimental Data (2009) — Science · comment · medium
- Deep learning for universal linear embeddings of nonlinear dynamics — Nature
- A comparison of data-driven approaches to low-dimensional ocean models (2021) — arXiv
- Applications of DL to Ocean Data Inference & Subgrid Parameterization (2018) — preprint
- On energy-aware hybrid models (2024) — doi
 
Spatial-Temporal Graph Modeling
- Graph WaveNet — arXiv
- Diffusion Convolutional Recurrent Neural Network (DCRNN) — ICLR
- Time-SSM: Simplifying & Unifying State Space Models — arXiv
- State Space Reconstruction for Multivariate Time Series — arXiv
- Longitudinal predictive modeling of tau progression — NeuroImage 2021
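As a concrete reference point for the list above, the diffusion convolution at the heart of DCRNN (and reused in Graph WaveNet) replaces grid convolution with powers of a random-walk transition matrix over the sensor graph. A minimal single-direction NumPy sketch (the function name and the fixed, non-learned coefficients are our simplifications; DCRNN learns the coefficients and uses both walk directions):

```python
import numpy as np

def diffusion_conv(X, A, thetas):
    """One diffusion-convolution filter: X' = sum_k theta_k * (D^-1 A)^k X.

    X: (num_nodes, num_features) node signal.
    A: (num_nodes, num_nodes) adjacency matrix; every node is assumed
       to have at least one neighbor, so row sums are nonzero.
    thetas: filter coefficients for diffusion steps k = 0, 1, ...
    """
    P = A / A.sum(axis=1, keepdims=True)  # random-walk transition matrix D^-1 A
    out = np.zeros_like(X, dtype=float)
    Pk_X = X.astype(float)                # (D^-1 A)^0 X = X
    for theta in thetas:
        out += theta * Pk_X
        Pk_X = P @ Pk_X                   # advance one diffusion step
    return out
```

Stacking such filters with learned `thetas` per input/output channel, and wiring them into a recurrent or WaveNet-style temporal module, yields the spatio-temporal models listed above.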
 
Work arrangements
| Week | Date | Theme | Delivery | 
|---|---|---|---|
| 1 | Sep 4 | Preliminary discussion — pdf | |
| 2 | Sep 11 | Problem statement — pdf | |
| 3 | Sep 18 | Preliminary solution | Group talk & discussion | 
| 4 | Sep 25 | Minimum deployment | Group report | 
| 5 | Oct 4+ | Functional data analysis | Personal talks | 
| 13 | Nov 29 | Final discussion | Group talks | 
Structure of seminars
The semester lasts 12 weeks; six alternate weeks are for homework.
- Odd week: topic intro + homework theme handout.
- Every week: essay discussion; collect improvement list.
- Even week: discuss improved essays; integrate into a joint structure.
 
Scoring
Group activity is scored by cross-ranking, aggregated with the Kemeny median; personal talks also contribute to the score.
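For small groups, the Kemeny median used in cross-ranking can be computed by brute force: pick the permutation that minimizes the total Kendall-tau distance to all submitted rankings. A hypothetical sketch (function names are illustrative; the actual scoring procedure may differ):

```python
from itertools import permutations

def kendall_tau(r1, r2):
    """Number of discordant pairs between two rankings (lists of items, best first)."""
    pos2 = {item: i for i, item in enumerate(r2)}
    d = 0
    for i in range(len(r1)):
        for j in range(i + 1, len(r1)):
            # r1 puts r1[i] above r1[j]; count a disagreement if r2 reverses them
            if pos2[r1[i]] > pos2[r1[j]]:
                d += 1
    return d

def kemeny_median(rankings):
    """Brute-force Kemeny consensus: the permutation minimizing total
    Kendall-tau distance to all input rankings (feasible only for small n)."""
    items = rankings[0]
    best, best_cost = None, float("inf")
    for perm in permutations(items):
        cost = sum(kendall_tau(list(perm), r) for r in rankings)
        if cost < best_cost:
            best, best_cost = list(perm), cost
    return best
```

The search is factorial in the number of ranked items, which is acceptable for a seminar-sized list of groups; exact Kemeny aggregation is NP-hard in general.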
Week 3 — Homework 1
- Form a group.
- Discuss goals and a solution ([see the problem statement]).
- Review solution approaches.
- Select an LLM-GPT.
- Run the code; verify it works.
- Store the code in the group repository.
- Store slides and the report there as well.
- Make a 10-minute talk covering:
  - Functionality and architecture of the model.
  - Why you selected this model.
  - Alternative models considered.
 
Requirements for the text & discussion
- Comprehensive explanation of the discussed method/question.
- Principles only; no experiments.
- Roughly two pages.
- Target reader: a 2nd–3rd-year student.
- One figure is mandatory.
- A brief reference to the DL structure is welcome.
- The talk may be a slide or the text itself.
- References with DOIs.
- State how the text was generated.
- Note observed gaps to revisit later.
 
Style remarks for the essays
Automatic text generation raises the bar for clarity and authorship: use generative tools to practice persuasion, and write as if for a thesis defense committee.