Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
233 views
in Technique[技术] by (71.8m points)

pca - Component reconstruction for multivariate lagged time series

I am trying to write a multivariate Singular Spectrum Analysis with Monte Carlo test. To this extent I am working on a code piece that can reconstruct the input series using the lagged trajectory matrix and projection base (ST-PCs) that result from the pca/ssa decomposition of the input series. The attached code piece works for a lagged univariate (that is, single) time series, but I am struggling to make this reconstruction for a lagged multivariate time series. I don't quite get the procedure mathematically and - not surprisingly - I also did not manage to program it. Useful links are attached to the function descriptions of the accompanying code. Input data should be of the form (time * number of series), so say 288x3 implying 3 time series of 288 time levels.

I hope you can help me out!

import numpy as np

def lagged_covariance_matrix(data, M):
    """ Computes the lagged covariance matrix using the Broomhead & King method 
    
    Background: Plaut, G., & Vautard, R. (1994). Spells of low-frequency oscillations and 
    weather regimes in the Northern Hemisphere. Journal of the atmospheric sciences, 51(2), 210-236.
    
    Arguments:
        data : pxn time series, where p denotes the length of the time series and n the number of channels 
        M : window length """

    # explicitely 'add' spatial dimension if input is a single time series    
    if np.ndim(data) == 1:
        data = np.reshape(data,(len(data),1))
    
    T = data.shape[0]    
    L = data.shape[1]    
    N = T - M + 1         
    
    X = np.zeros((T, L, M))
    
    for i in range(M):
        X[:,:,i] = np.roll(data, -i, axis = 0)
    
    X = X[:N]
    
    # X constitutes the trajectory matrix and is a stacked hankel matrix
    X = np.reshape(X, (N, M*L), order = 'C') # https://www.jstatsoft.org/article/viewFile/v067i02/v67i02.pdf
    
    # choose the smallest projection basis for computation of the covariance matrix    
    if M*L >= N:        
        return 1/(M*L) * X.dot(X.T), X
    
    else:
        return 1/N * X.T.dot(X), X
    
def sort_by_eigenvalues(eigenvalues, PCs): 
    """ Sorts the PCs and eigenvalues by descending size of the eigenvalues """
    
    desc = np.argsort(-eigenvalues)
    
    return eigenvalues[desc], PCs[:,desc]

def Reconstruction(M, E, X):
    """ Reconstructs the series as the sum of M subseries.
    
    See: https://en.wikipedia.org/wiki/Singular_spectrum_analysis, 'Basic SSA' &
    the work of Vivien Sainte Fare Garnot on univariate time series (https://github.com/VSainteuf/mcssa)
    
    Arguments: 
        M : window length 
        E : eigenvector basis 
        X : trajectory matrix """
       
    time = len(X) + M - 1
    RC = np.zeros((time, M))
       
    # step 3: grouping
    for i in range(M):
        d = np.zeros(M)
        d[i] = 1
        I = np.diag(d)
        
        Q = np.flipud(X @ E @ I @ E.T)
        
        # step 4: diagonal averaging        
        for k in range(time):
            RC[k, i] = np.diagonal(Q, offset = -(time - M - k)).mean()

    return RC 

#=====================================================================================================
#=====================================================================================================
#=====================================================================================================

# input data
data = None

# number of lags a.k.a. window length
M = 45 # M = 1 means no lag  

covmat, X = lagged_covariance_matrix(data, M)        

# get the eigenvalues and vectors of the covariance matrix
vals, vecs = np.linalg.eig(covmat)
eig_data, eigvec_data = sort_by_eigenvalues(vals, vecs)

# component reconstruction
recons_data = Reconstruction(M, eigvec_data, X)
question from:https://stackoverflow.com/questions/65644051/component-reconstruction-for-multivariate-lagged-time-series

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The following works but does not make direct use of the projection base (ST-PCs). Hence the original question still stands, but this already helps a great lot and solves the problem for me. This code piece makes use of the similarity between the ST-PCs projection base and the u & vt matrices obtained from the single value decomposition of the lagged trajectory matrix. I think it gives back the same answer as one would obtain using the ST-PCs projection base?

def lag_reconstruction(data, X, M, pairs = None):
    """ Reconstructs the series as the sum of M subseries using the lagged trajectory matrix. 
        Based on equation 2.9 of Plaut, G., & Vautard, R. (1994). Spells of low-frequency oscillations and weather regimes in the Northern Hemisphere. Journal of Atmospheric Sciences, 51(2), 210-236.
        Inspired by work of R. van Westen and C. Wieners """    

    time = data.shape[0]    # number of time levels of the original series
    L = data.shape[1]       # number of input series
    N = time - M + 1
    
    u, s, vt = np.linalg.svd(X, full_matrices = False)
    rc = np.zeros((time, L, M))
    
    for t in range(time): 
        
        counter = 0
        
        for i in range(M): 
                
            if t-i >= 0 and t-i < N:            
                counter += 1
                
                if pairs:
                    for k in pairs:
                        rc[t,:,i] += u[t-i, k] * s[k] * vt[k, i*L : i*L + L]
                else: 
                    for k in range(len(s)):  
                        rc[t,:,i] += u[t-i, k] * s[k] * vt[k, i*L : i*L + L]
    
        rc[t] = rc[t]/counter
               
    return rc

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...