This Matlab tutorial demonstrates step by step the Singular Spectrum Analysis (SSA).
Set general Parameters
M = 30; % window length = embedding dimension N = 200; % length of generated time series T = 22; % period length of sine function stdnoise = 1; % noise-to-signal ratio
Create time series X
First of all, we generate a time series, a sine function of length N with observational white noise
t = (1:N)'; X = sin(2*pi*t/T); noise = stdnoise*randn(size(X)); X = X + noise; X = X - mean(X); % remove mean value X = X/std(X,1); % normalize to standard deviation 1 figure(1); set(gcf,'name','Time series X'); clf; plot(t,X,'b-');
Calculate covariance matrix C (Toeplitz approach)
Next, we calculate the covariance matrix. There are several numerical approaches to estimate C. Here, we calculate the covariance function with CORR and build C with the function TOEPLITZ.
covX = xcorr(X,M-1,'unbiased'); Ctoep=toeplitz(covX(M:end)); figure(2); set(gcf,'name','Covariance matrix'); clf; imagesc(Ctoep); axis square set(gca,'clim',[-1 1]); colorbar
Calculate covariance matrix (trajectory approach)
An alternative approach is to determine C directly from the scalar product of Y, the time-delayed embedding of X. Although this estimation of C does not give a Toeplitz structure, with the eigenvectors not being symmetric or antisymmetric, it ensures a positive semi-definite covariance matrix.
Y=zeros(N-M+1,M); for m=1:M Y(:,m) = X((1:N-M+1)+m-1); end; Cemb=Y'*Y / (N-M+1); figure(2); set(gcf,'name','Covariance matrix'); clf; imagesc(Cemb); axis square set(gca,'clim',[-1 1]); colorbar
Choose covariance estimation
Choose between Toeplitz approach (cf. Vautard & Ghil) and trajectory approach (cf. Broomhead & King).
% C=Ctoep; C=Cemb;
Calculate eigenvalues LAMBDA and eigenvectors RHO
In order to determine the eigenvalues and eigenvectors of C, we use the function EIG. This function returns two matrices, the matrix RHO with eigenvectors arranged in columns, and the matrix LAMBDA with eigenvalues along the diagonal.
[RHO,LAMBDA] = eig(C); LAMBDA = diag(LAMBDA); % extract the diagonal elements [LAMBDA,ind]=sort(LAMBDA,'descend'); % sort eigenvalues RHO = RHO(:,ind); % and eigenvectors figure(3); set(gcf,'name','Eigenvectors RHO and eigenvalues LAMBDA') clf; subplot(3,1,1); plot(LAMBDA,'o-'); subplot(3,1,2); plot(RHO(:,1:2), '-'); legend('1', '2'); subplot(3,1,3); plot(RHO(:,3:4), '-'); legend('3', '4');
Calculate principal components PC
The principal components are given as the scalar product between Y, the time-delayed embedding of X, and the eigenvectors RHO.
PC = Y*RHO; figure(4); set(gcf,'name','Principal components PCs') clf; for m=1:4 subplot(4,1,m); plot(t(1:N-M+1),PC(:,m),'k-'); ylabel(sprintf('PC %d',m)); ylim([-10 10]); end;
Calculate reconstructed components RC
In order to determine the reconstructed components RC, we have to invert the projecting PC = Y*RHO; i.e. RC = Y*RHO*RHO'=PC*RHO'. Averaging along anti-diagonals gives the RCs for the original input X.
RC=zeros(N,M); for m=1:M buf=PC(:,m)*RHO(:,m)'; % invert projection buf=buf(end:-1:1,:); for n=1:N % anti-diagonal averaging RC(n,m)=mean( diag(buf,-(N-M+1)+n) ); end end; figure(5); set(gcf,'name','Reconstructed components RCs') clf; for m=1:4 subplot(4,1,m); plot(t,RC(:,m),'r-'); ylabel(sprintf('RC %d',m)); ylim([-1 1]); end;
Compare reconstruction and original time series
Note that the original time series X can be completely reconstructed by the sum of all reconstructed components RC (upper panel). The sine function can be reconstructed with the first pair of RCs (lower panel).
figure(6); set(gcf,'name','Original time series X and reconstruction RC') clf; subplot(2,1,1) plot(t,X,'b-',t,sum(RC(:,:),2),'r-'); legend('Original','Complete reconstruction'); subplot(2,1,2) plot(t,X,'b','LineWidth',2); plot(t,X,'b-',t,sum(RC(:,1:2),2),'r-'); legend('Original','Reconstruction with RCs 1-2');