Statistical methods

2015
Kondrashov, Dmitri, Mickaël D. Chekroun, and Michael Ghil. “Data-driven non-Markovian closure models.” Physica D: Nonlinear Phenomena 297 (2015): 33–55.

This paper has two interrelated foci: (i) obtaining stable and efficient data-driven closure models by using a multivariate time series of partial observations from a large-dimensional system; and (ii) comparing these closure models with the optimal closures predicted by the Mori–Zwanzig (MZ) formalism of statistical physics. Multilayer stochastic models (MSMs) are introduced as both a generalization and a time-continuous limit of existing multilevel, regression-based approaches to closure in a data-driven setting; these approaches include empirical model reduction (EMR), as well as more recent multi-layer modeling. It is shown that the multilayer structure of MSMs can provide a natural Markov approximation to the generalized Langevin equation (GLE) of the MZ formalism. A simple correlation-based stopping criterion for an EMR–MSM model is derived to assess how well it approximates the GLE solution. Sufficient conditions are derived on the structure of the nonlinear cross-interactions between the constitutive layers of a given MSM to guarantee the existence of a global random attractor. This existence ensures that no blow-up can occur for a broad class of MSM applications, a class that includes non-polynomial predictors and nonlinearities that do not necessarily preserve quadratic energy invariants. The EMR–MSM methodology is first applied to a conceptual, nonlinear, stochastic climate model of coupled slow and fast variables, in which only slow variables are observed. It is shown that the resulting closure model with energy-conserving nonlinearities efficiently captures the main statistical features of the slow variables, even when there is no formal scale separation and the fast variables are quite energetic. Second, an MSM is shown to successfully reproduce the statistics of a partially observed, generalized Lotka–Volterra model of population dynamics in its chaotic regime. 
The challenges here include the rarity of strange attractors in the model’s parameter space and the existence of multiple attractor basins with fractal boundaries. The positivity constraint on the solutions’ components replaces here the quadratic-energy–preserving constraint of fluid-flow problems and it successfully prevents blow-up.

Groth, Andreas, Patrice Dumas, Michael Ghil, and Stéphane Hallegatte. “Impacts of natural disasters on a dynamic economy.” In Extreme Events: Observations, Modeling, and Economics, edited by Eric Chavez, Michael Ghil, and Jaime Urrutia-Fucugauchi, 343–360. American Geophysical Union and Wiley-Blackwell, 2015.

This chapter presents a modeling framework for macroeconomic growth dynamics; it is motivated by recent attempts to formulate and study “integrated models” of the coupling between natural and socioeconomic phenomena. The challenge is to describe the interfaces between human activities and the functioning of the earth system. We examine the way in which this interface works in the presence of endogenous business cycle dynamics, based on a nonequilibrium dynamic model. Recent findings about the macroeconomic response to natural disasters in such a nonequilibrium setting have shown a more severe response to natural disasters during expansions than during recessions. These findings raise questions about the assessment of climate change damages or natural disaster losses that are based purely on long-term growth models. In order to compare the theoretical findings with observational data, we analyze cyclic behavior in the U.S. economy, based on multivariate singular spectrum analysis. We analyze a total of nine aggregate indicators in a 52-year interval (1954–2005) and demonstrate that the behavior of the U.S. economy changes significantly between intervals of growth and recession, with higher volatility during expansions.

Groth, Andreas, and Michael Ghil. “Monte Carlo Singular Spectrum Analysis (SSA) revisited: Detecting oscillator clusters in multivariate datasets.” Journal of Climate 28, no. 19 (2015): 7873–7893.

Singular spectrum analysis (SSA) along with its multivariate extension (M-SSA) provides an efficient way to identify weak oscillatory behavior in high-dimensional data. To prevent the misinterpretation of stochastic fluctuations in short time series as oscillations, Monte Carlo (MC)–type hypothesis tests provide objective criteria for the statistical significance of the oscillatory behavior. Procrustes target rotation is introduced here as a key method for refining previously available MC tests. The proposed modification helps reduce the risk of type-I errors, and it is shown to improve the test’s discriminating power. The reliability of the proposed methodology is examined in an idealized setting for a cluster of harmonic oscillators immersed in red noise. Furthermore, the common method of data compression into a few leading principal components, prior to M-SSA, is reexamined, and its possibly negative effects are discussed. Finally, the generalized Procrustes test is applied to the analysis of interannual variability in the North Atlantic’s sea surface temperature and sea level pressure fields. The results of this analysis provide further evidence for shared mechanisms of variability between the Gulf Stream and the North Atlantic Oscillation in the interannual frequency band.

Groth, Andreas, Michael Ghil, Stéphane Hallegatte, and Patrice Dumas. “The Role of Oscillatory Modes in U.S. Business Cycles.” OECD Journal: Journal of Business Cycle Measurement and Analysis, no. 2015/1 (2015): 63–81.

We apply multivariate singular spectrum analysis to the study of U.S. business cycle dynamics. This method provides a robust way to identify and reconstruct oscillations, whether intermittent or modulated. We show such oscillations to be associated with comovements across the entire economy. The problem of spurious cycles generated by the use of detrending filters is addressed and we present a Monte Carlo test to extract significant oscillations. The behavior of the U.S. economy is shown to change significantly from one phase of the business cycle to another: the recession phase is dominated by a five-year mode, while the expansion phase exhibits more complex dynamics, with higher-frequency modes coming into play. We show that the variations so identified cannot be generated by random shocks alone, as assumed in ‘real’ business-cycle models, and that endogenous, deterministically generated variability has to be involved.

2014
Kondrashov, Dmitri, R. Denton, Y. Y. Shprits, and H. J. Singer. “Reconstruction of gaps in the past history of solar wind parameters.” Geophysical Research Letters 41, no. 8 (2014): 2702–2707.
Groth, Andreas. “Interannual variability in the North Atlantic SST and wind forcing.” Seminar at International Research Institute for Climate and Society, Columbia, 2014.

Groth, Andreas. “Oscillatory behavior and oscillatory modes.” SSA workshop Bournemouth, September 2014, 2014.

2013
Sella, Lisa, Gianna Vivaldo, Andreas Groth, and Michael Ghil. “Economic Cycles and their Synchronization: A spectral survey.” Fondazione Eni Enrico Mattei (FEEM) 105, no. 105 (2013): 1.

The present work applies several advanced spectral methods to the analysis of macroeconomic fluctuations in three countries of the European Union: Italy, The Netherlands, and the United Kingdom. We focus here in particular on singular-spectrum analysis (SSA), which provides valuable spatial and frequency information on multivariate data and goes far beyond a pure analysis in the time domain. The spectral methods discussed here are well established in the geosciences and life sciences, but not yet widespread in quantitative economics. In particular, they enable one to identify and describe nonlinear trends and dominant cycles, including seasonal and interannual components, that characterize the deterministic behavior of each time series. These tools have already proven their robustness in applications to short and noisy data, and we demonstrate their usefulness in the analysis of the macroeconomic indicators of these three countries. We explore several fundamental indicators of the countries' real aggregate economy in a univariate, as well as a multivariate setting. Starting with individual single-channel analysis, we are able to identify similar spectral components among the analyzed indicators. Next, we consider combinations of indicators and countries, in order to take different effects of comovements into account. Since business cycles are cross-national phenomena, which show common characteristics across countries, our aim is to uncover hidden global behavior across the European economies. Results are compared with previous findings on the U.S. indicators (Groth et al., 2012). Finally, the analysis is extended to include several indicators from the U.S. economy, in order to examine its influence on the European market.

Kondrashov, Dmitri, Mickaël D. Chekroun, Andrew W. Robertson, and Michael Ghil. “Low-order stochastic model and ‘past-noise forecasting’ of the Madden-Julian oscillation.” Geophysical Research Letters 40 (2013): 5305–5310.
Feliks, Yizhak, Andreas Groth, Andrew W. Robertson, and Michael Ghil. “Oscillatory Climate Modes in the Indian Monsoon, North Atlantic and Tropical Pacific.” Journal of Climate 26 (2013): 9528–9544.

This paper explores the three-way interactions between the Indian monsoon, the North Atlantic and the Tropical Pacific. Four climate records were analyzed: the monsoon rainfall in two Indian regions, the Southern Oscillation Index for the Tropical Pacific, and the NAO index for the North Atlantic. The individual records exhibit highly significant oscillatory modes with spectral peaks at 7–8 yr and in the quasi-biennial and quasi-quadrennial bands. The interactions between the three regions were investigated in the light of the synchronization theory of chaotic oscillators. The theory was applied here by combining multichannel singular-spectrum analysis (M-SSA) with a recently introduced varimax rotation of the M-SSA eigenvectors. A key result is that the 7–8-yr and 2.7-yr oscillatory modes in all three regions are synchronized, at least in part. The energy-ratio analysis, as well as time-lag results, suggest that the NAO plays a leading role in the 7–8-yr mode. It was found therewith that the South Asian monsoon is not slaved to forcing from the equatorial Pacific, although it does interact strongly with it. The time-lag analysis pinpointed this to be the case in particular for the quasi-biennial oscillatory modes. Overall, these results confirm that the approach of synchronized oscillators, combined with varimax-rotated M-SSA, is a powerful tool in studying teleconnections between regional climate modes and that it helps identify the mechanisms that operate in various frequency bands. This approach should be readily applicable to ocean modes of variability and to the problems of air-sea interaction as well.

2012
Groth, Andreas, Michael Ghil, Stéphane Hallegatte, and Patrice Dumas. “The Role of Oscillatory Modes in U.S. Business Cycles.” Fondazione Eni Enrico Mattei (FEEM) 26 (2012): 1.

We apply the advanced time-and-frequency-domain method of singular spectrum analysis to study business cycle dynamics in a set of nine U.S. macroeconomic indicators. This method provides a robust way to identify and reconstruct shared oscillations, whether intermittent or modulated. We address the problem of spurious cycles generated by the use of detrending filters and present a Monte Carlo test to extract significant oscillations. Finally, we demonstrate that the behavior of the U.S. economy changes significantly between episodes of growth and recession; these variations cannot be generated by random shocks alone, in the absence of endogenous variability.

2011
Kravtsov, Sergey, Dmitri Kondrashov, I. Kamenkovich, and Michael Ghil. “An empirical stochastic model of sea-surface temperatures and surface winds over the Southern Ocean.” Ocean Science 7, no. 6 (2011): 755–770.

This study employs NASA's recent satellite measurements of sea-surface temperatures (SSTs) and sea-level winds (SLWs), with missing data filled in by singular spectrum analysis (SSA), to construct empirical models that capture both intrinsic and SST-dependent aspects of SLW variability. The model construction methodology uses a number of algorithmic innovations that are essential in providing stable estimates of the model's propagator. The best model tested herein is able to faithfully represent the time scales and spatial patterns of anomalies associated with a number of distinct processes. These processes range from the daily synoptic variability to interannual signals presumably associated with oceanic or coupled dynamics. Comparing the simulations of an SLW model forced by the observed SST anomalies with the simulations of an SLW-only model provides preliminary evidence for the ocean driving the atmosphere in the Southern Ocean region.

Groth, Andreas, and Michael Ghil. “Multivariate singular spectrum analysis and the road to phase synchronization.” Physical Review E 84 (2011): 036206.

We show that multivariate singular spectrum analysis (M-SSA) greatly helps study phase synchronization in a large system of coupled oscillators and in the presence of high observational noise levels. With no need for detailed knowledge of individual subsystems nor any a priori phase definition for each of them, we demonstrate that M-SSA can automatically identify multiple oscillatory modes and detect whether these modes are shared by clusters of phase- and frequency-locked oscillators. As an essential modification of M-SSA, here we introduce variance-maximization (varimax) rotation of the M-SSA eigenvectors to optimally identify synchronized-oscillator clustering.

2010
Kravtsov, Sergey, Dmitri Kondrashov, and Michael Ghil. “Empirical model reduction and the modelling hierarchy in climate dynamics and the geosciences.” In Stochastic physics and climate modelling, edited by P. Williams and T. Palmer, 35–72. Cambridge: Cambridge University Press, 2010.
Kondrashov, Dmitri, Yuri Shprits, and Michael Ghil. “Gap Filling of Solar Wind Data by Singular Spectrum Analysis.” Geophysical Research Letters 37 (2010): L15101.

Observational data sets in space physics often contain instrumental and sampling errors, as well as large gaps. This is both an obstacle and an incentive for research, since continuous data sets are typically needed for model formulation and validation. For example, the latest global empirical models of Earth's magnetic field are crucial for many space weather applications, and require time-continuous solar wind and interplanetary magnetic field (IMF) data; both of these data sets have large gaps before 1994. Singular spectrum analysis (SSA) reconstructs missing data by using an iteratively inferred, smooth “signal” that captures coherent modes, while “noise” is discarded. In this study, we apply SSA to fill in large gaps in solar wind and IMF data, by combining it with geomagnetic indices that are time continuous, and generalizing it to multivariate geophysical data consisting of gappy “driver” and continuous “response” records. The reconstruction error estimates provide information on the physics of covariability between particular solar wind parameters and geomagnetic indices.
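
The iterative idea summarized in this abstract can be illustrated with a minimal, univariate sketch. It is not the authors' implementation: the function names `ssa_reconstruct` and `ssa_fill_gaps`, the fixed window length and number of retained components, and the stopping tolerance are all assumptions made for illustration, and the multivariate “driver”/“response” extension described above is not covered.

```python
import numpy as np

def ssa_reconstruct(x, window, n_components):
    """Rank-limited SSA reconstruction of a complete 1-D series (hypothetical helper)."""
    n = len(x)
    k = n - window + 1
    # Trajectory matrix: column j holds the lagged segment x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Keep only the leading components (the smooth "signal").
    approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    # Diagonal averaging: average all matrix entries that map to the same time index.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return recon / counts

def ssa_fill_gaps(x, window, n_components, n_iter=50, tol=1e-6):
    """Fill NaN gaps by iterating the SSA reconstruction (illustrative sketch only)."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    if not missing.any():
        return x.copy()
    mean = np.nanmean(x)
    # Center on the available data and start with zeros in the gaps.
    filled = np.where(missing, 0.0, x - mean)
    for _ in range(n_iter):
        recon = ssa_reconstruct(filled, window, n_components)
        change = np.max(np.abs(recon[missing] - filled[missing]))
        # Only the gap values are updated; observed values are never altered.
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled + mean

# Usage: a sine wave of period 20 with an 11-point gap.
t = np.arange(200)
truth = np.sin(2 * np.pi * t / 20)
gappy = truth.copy()
gappy[95:106] = np.nan
filled = ssa_fill_gaps(gappy, window=40, n_components=2)
```

In practice the window length and the number of retained components would have to be chosen from the data, for example by cross-validation on artificially withheld points; the fixed values here are placeholders.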

Feliks, Yizhak, Michael Ghil, and Andrew W. Robertson. “Oscillatory Climate Modes in the Eastern Mediterranean and Their Synchronization with the North Atlantic Oscillation.” Journal of Climate 23, no. 15 (2010): 4060–4079.

Oscillatory climatic modes over the North Atlantic, Ethiopian Plateau, and eastern Mediterranean were examined in instrumental and proxy records from these regions. Aside from the well-known North Atlantic Oscillation (NAO) index and the Nile River water-level records, the authors study for the first time an instrumental rainfall record from Jerusalem and a tree-ring record from the Golan Heights. The teleconnections between the regions were studied in terms of synchronization of chaotic oscillators. Standard methods for studying synchronization among such oscillators are modified by combining them with advanced spectral methods, including singular spectrum analysis. The resulting cross-spectral analysis quantifies the strength of the coupling together with the degree of synchronization. A prominent oscillatory mode with a 7–8-yr period is present in all the climatic indices studied here and is completely synchronized with the North Atlantic Oscillation. An energy analysis of the synchronization raises the possibility that this mode originates in the North Atlantic. Evidence is discussed for this mode being induced by the 7–8-yr oscillation in the position of the Gulf Stream front. A mechanism for the teleconnections between the North Atlantic, Ethiopian Plateau, and eastern Mediterranean is proposed, and implications for interannual-to-decadal climate prediction are discussed.

Strounine, K., Sergey Kravtsov, Dmitri Kondrashov, and Michael Ghil. “Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance.” Physica D: Nonlinear Phenomena 239, no. 3 (2010): 145–166.

Low-frequency variability (LFV) of the atmosphere refers to its behavior on time scales of 10–100 days, longer than the life cycle of a mid-latitude cyclone but shorter than a season. This behavior is still poorly understood and hard to predict. The present study compares various model reduction strategies that help in deriving simplified models of LFV. Three distinct strategies are applied here to reduce a fairly realistic, high-dimensional, quasi-geostrophic, 3-level (QG3) atmospheric model to lower dimensions: (i) an empirical–dynamical method, which retains only a few components in the projection of the full QG3 model equations onto a specified basis, and finds the linear deterministic and the stochastic corrections empirically as in Selten (1995) [5]; (ii) a purely dynamics-based technique, employing the stochastic mode reduction strategy of Majda et al. (2001) [62]; and (iii) a purely empirical, multi-level regression procedure, which specifies the functional form of the reduced model and finds the model coefficients by multiple polynomial regression as in Kravtsov et al. (2005) [3]. The empirical–dynamical and dynamical reduced models were further improved by sequential parameter estimation and benchmarked against multi-level regression models; the extended Kalman filter was used for the parameter estimation. Overall, the reduced models perform better when more statistical information is used in the model construction. Thus, the purely empirical stochastic models with quadratic nonlinearity and additive noise reproduce very well the linear properties of the full QG3 model’s LFV, i.e. its autocorrelations and spectra, as well as the nonlinear properties, i.e. the persistent flow regimes that induce non-Gaussian features in the model’s probability density function. 
The empirical–dynamical models capture the basic statistical properties of the full model’s LFV, such as the variance and integral correlation time scales of the leading LFV modes, as well as some of the regime behavior features, but fail to reproduce the detailed structure of autocorrelations and distort the statistics of the regimes. Dynamical models that use data assimilation corrections do capture the linear statistics to a degree comparable with that of empirical–dynamical models, but do much less well on the full QG3 model’s nonlinear dynamics. These results are discussed in terms of their implications for a better understanding and prediction of LFV.

2009
Kravtsov, Sergey, Dmitri Kondrashov, and Michael Ghil. “Empirical model reduction and the modelling hierarchy in climate dynamics and the geosciences.” Stochastic physics and climate modelling. Cambridge University Press, Cambridge (2009): 35–72.

Modern climate dynamics uses a two-fisted approach in attacking and solving the problems of atmospheric and oceanic flows. The two fists are: (i) observational analyses; and (ii) simulations of the geofluids, including the coupled atmosphere–ocean system, using a hierarchy of dynamical models. These models represent interactions between many processes that act on a broad range of spatial and time scales, from a few to tens of thousands of kilometers, and from diurnal to multidecadal, respectively. The evolution of virtual climates simulated by the most detailed and realistic models in the hierarchy is typically as difficult to interpret as that of the actual climate system, based on the available observations thereof. Highly simplified models of weather and climate, though, help gain a deeper understanding of a few isolated processes, as well as giving clues on how the interaction between these processes and the rest of the climate system may participate in shaping climate variability. Finally, models of intermediate complexity, which resolve well a subset of the climate system and parameterise the remainder of the processes or scales of motion, serve as a conduit between the models at the two ends of the hierarchy. We present here a methodology for constructing intermediate models based almost entirely on the observed evolution of selected climate fields, without reference to dynamical equations that may govern this evolution; these models parameterise unresolved processes as multivariate stochastic forcing. This methodology may be applied with equal success to actual observational data sets, as well as to data sets resulting from a high-end model simulation. We illustrate this methodology by its applications to: (i) observed and simulated low-frequency variability of atmospheric flows in the Northern Hemisphere; (ii) observed evolution of tropical sea-surface temperatures; and (iii) observed air–sea interaction in the Southern Ocean.
Similar results have been obtained for (iv) radial-diffusion model simulations of Earth’s radiation belts, but are not included here because of space restrictions. In each case, the reduced stochastic model represents surprisingly well a variety of linear and nonlinear statistical properties of the resolved fields. Our methodology thus provides an efficient means of constructing reduced, numerically inexpensive climate models. These models can be thought of as stochastic–dynamic prototypes of more complex deterministic models, as in examples (i) and (iv), but work just as well in the situation when the actual governing equations are poorly known, as in (ii) and (iii). These models can serve as competitive prediction tools, as in (ii), or be included as stochastic parameterisations of certain processes within more complex climate models, as in (iii). Finally, the methodology can be applied, with some modifications, to geophysical problems outside climate dynamics, as illustrated by (iv).

2008
Camargo, Suzana J., Andrew W. Robertson, Anthony G. Barnston, and Michael Ghil. “Clustering of eastern North Pacific tropical cyclone tracks: ENSO and MJO effects.” Geochemistry, Geophysics, Geosystems 9, no. 6 (2008).
Kravtsov, Sergey, W. K. Dewar, P. Berloff, J. C. McWilliams, and Michael Ghil. “North Atlantic climate variability in coupled models and data.” Nonlinear Processes in Geophysics 15 (2008): 13–24.