Multilevel regression modeling of nonlinear processes: Derivation and applications to climatic variability


Kravtsov, S, Dmitri Kondrashov, and M Ghil. 2005. “Multilevel regression modeling of nonlinear processes: Derivation and applications to climatic variability.” Journal of Climate 18 (21): 4404–4424.
PDF1.38 MB


Predictive models are constructed to best describe an observed field’s statistics within a given class of nonlinear dynamics driven by a spatially coherent noise that is white in time. For linear dynamics, such inverse stochastic models are obtained by multiple linear regression (MLR). Nonlinear dynamics, when more appropriate, is accommodated by applying multiple polynomial regression (MPR) instead; the resulting model uses polynomial predictors, but the dependence on the regression parameters is linear in both MPR and MLR. The basic concepts are illustrated using the Lorenz convection model, the classical double-well problem, and a three-well problem in two space dimensions. Given a data sample that is long enough, MPR successfully reconstructs the model coefficients in the former two cases, while the resulting inverse model captures the three-regime structure of the system’s probability density function (PDF) in the latter case. A novel multilevel generalization of the classic regression procedure is introduced next. In this generalization, the residual stochastic forcing at a given level is subsequently modeled as a function of variables at this level and all the preceding ones. The number of levels is determined so that the lag-0 covariance of the residual forcing converges to a constant matrix, while its lag-1 covariance vanishes. This method has been applied to the output of a three-layer, quasigeostrophic model and to the analysis of Northern Hemisphere wintertime geopotential height anomalies. In both cases, the inverse model simulations reproduce well the multiregime structure of the PDF constructed in the subspace spanned by the dataset’s leading empirical orthogonal functions, as well as the detailed spectrum of the dataset’s temporal evolution. These encouraging results are interpreted in terms of the modeled low-frequency flow’s feedback on the statistics of the subgrid-scale processes.

Last updated on 10/15/2016