Skip to content

econometron.Models.VectorAutoReg.VARMA

  • class in the econometron.Models.VectorAutoReg module

Overview

A Vector Autoregressive Moving Average (VARMA) model captures dynamic relationships among multiple time series variables, combining both autoregressive (AR) and moving average (MA) components. The VARMA class in econometron.Models.timeseries allows specification, estimation, forecasting, and structural analysis of VARMA models for standard and advanced economic applications.

VARMA Model Specification

For endogenous variables , a VARMA(p, q) system is:

where:

  • are AR coefficient matrices capturing past value effects.
  • are MA coefficient matrices capturing the propagation of past shocks.
  • is a vector of innovations with covariance .

Example (VARMA(1,1))

  • Units: original data scale
  • Use case: multivariate forecasting and structural economic analysis

This implementation supports both standard and advanced use cases, with two primary modes:

Non-Structural Estimation: Fits a standard VARMA model, primarily for forecasting.

Structural Identification: Uses the Echelon (canonical) form with Kronecker indices to impose zero restrictions on AR and MA coefficient matrices, resulting in a more parsimonious and statistically identified model. This is essential for economic or policy analysis, e.g., understanding structural shocks.

Estimation Method:

  • Initial parameter estimates are computed using the Hannan-Rissanen method, which fits a high-order VAR to approximate residuals and generate starting values.
  • Full model parameters are then refined using Conditional Maximum Likelihood Estimation (CMLE), optionally enforcing stability and invertibility constraints.

Inheritance from VAR

The VARMA class extends VAR, adding MA dynamics and structural analysis:

  • Gains all VAR functionalities: data validation, stationarity tests, AR lag selection, fitting, forecasting, IRFs, and automated workflows (SbS and EL_RAPIDO).
  • Adds MA coefficient estimation, lag order selection for MA terms, and echelon-based structural identification.
  • Setting reduces VARMA to a standard VAR model, ensuring full compatibility with existing VAR workflows.

A Practical Guide to Echelon Form (Kronecker Indices)

When you have multiple time series, estimating all possible parameter interactions can be overkill. You might be estimating a bunch of coefficients that are actually zero, making your model unnecessarily complex.

The Echelon form, identified using Kronecker indices, solves this problem. It imposes zero-restrictions on the model's parameter matrices, resulting in a more parsimonious and statistically identified model.

Think of it like this: instead of assuming every variable’s past equally affects every other variable up to lag p, the algorithm determines a specific lag order for each variable in the system. This creates a custom-fit structure that only estimates the parameters that truly matter, which is essential for meaningful structural analysis.


In Practice

  • Use Non-Structural Estimation for general multivariate forecasting.
  • Use Structural Identification with echelon form to uncover causal relationships and policy-relevant shocks.
  • VARMA models richer dynamics than VAR, while remaining backward-compatible.

class VARMA capabilities

  • Fit standard and structural VARMA models
  • Automatic selection of AR and MA orders ()
  • Compute forecasts, simulations, and impulse response functions
  • Comprehensive diagnostics and model validation
  • Structural analysis via echelon form and Kronecker indices
  • Fully inherits VAR functionalities

Initialization

We start by creating a VARMA object with your time series data and configuration settings.

python
from econometron.Models.VectorAutoReg.VARMA import VARMA

varma_model=VARMA(data:pd.DataFrame , max_p:int=5, max_q:int=5, columns:list=None, forecast_h:int=6, plot:bool=True, check_stationarity:bool=True, bootstrap_n:int=1000, criterion:str='AIC', structural_id:bool=False, ci_alpha:int=0.05, Key:str=None, Threshold:int=0.8, orth:bool=False, enforce_stab_inver:bool=False):

Parameters

AttributeTypeDescription
datapd.DataFrameYour input time series data, with each column representing a variable.
max_pintThe maximum AR order (lags of the variables) to check during the automatic grid search. Default is 5.
max_qintThe maximum MA order (lags of the errors) to check during the automatic grid search. Default is 5.
columnslist, optionalA list of column names to use. If None, all numeric columns are included.
forecast_hintThe default number of steps to forecast ahead. Default is 6.
plotboolIf True, automatically generates plots for diagnostics, forecasts, etc. Default is True.
check_stationarityboolIf True, runs ADF and KPSS tests on your data when you initialize the class. Default is True.
bootstrap_nintThe number of bootstrap repetitions for generating confidence intervals for IRFs. Default is 1000.
criterionstrThe information criterion ('AIC', 'BIC', 'HQIC') for selecting the best (p,q) orders. Default is 'AIC'.
structural_idboolIf True, the model will be estimated in its structural (echelon) form using Kronecker indices. Default is False.
ci_alphafloatThe significance level for confidence intervals (e.g., 0.05 for a 95% CI). Default is 0.05.
Keystr, optionalThe operational mode.Not Yet Implemented
ThresholdfloatA score between 0 and 1. The model is considered "well-fitted" only if its final diagnostic score exceeds this value. Default is 0.8.
orthboolIf True, computes Orthogonalized Impulse Responses using a Cholesky decomposition for non-structural models. Default is False.
enforce_stab_inverboolIf True, forces the estimation process to find parameters that result in a stable and invertible model. Can be slow. Default is False.

Class Attributes

These attributes get filled in after you've fitted the model. They hold all the important results.

AttributeTypeDescription
best_modeldictA dictionary holding all results for the best-fitting model, including parameters, residuals, and criteria.
best_pintThe selected optimal AR order p.
best_qintThe selected optimal MA order q.
Kronindnp.ndarrayAn array of the estimated Kronecker indices if you ran a structural identification.
AR_s, MA_snp.ndarrayThe structural restriction "masks" (matrices of 0s and 1s) for the AR and MA parameters in echelon form.
Ph0np.ndarrayThe estimated contemporaneous impact matrix from a structural model, showing how shocks instantly affect variables.
fittedboolBecomes True if the final model passes the diagnostic checks based on the Threshold.

Methods

  • kron_index(lag: int)

    • Purpose: Computes the Kronecker indices for a multivariate time series to determine the structural order of a VARMA model. Uses Canonical Correlation Analysis (CCA) to identify the lag structure by testing the significance of correlations between past and future values.

    • Parameters:

      • lag (int): Maximum lag to consider for the Kronecker index computation.
    • Returns:

      • dict: Dictionary containing:

        • index (np.ndarray): Kronecker indices for each variable.
        • tests (list): Test statistics for each component, including test value, degrees of freedom, p-value, and dsq (if applicable).
    • Notes:

      • Assumes self.data is a pandas DataFrame with shape (T, K).
      • Uses CCA to assess the relationship between past and future values.
      • Applies a chi-squared test to determine significance, using self.ci_alpha as threshold.
      • Outputs progress and results for each component if significant.
  • struct_id(ord: Optional[int] = None, use_var: bool = False, output: bool = True)

    • Purpose: Identifies structural AR and MA coefficient matrices for a VARMA model based on Kronecker indices. Optionally uses VAR to determine the lag order if ord is not provided.

    • Parameters:

      • ord (int, optional): Specific lag order for the model (default: None).
      • use_var (bool, optional): If True, uses VAR to estimate the lag order (default: False).
      • output (bool, optional): If True, prints AR and MA coefficient matrices (default: True).
    • Returns:

      • dict: Dictionary containing:

        • AR_s_id (np.ndarray): Structural AR coefficient matrix.
        • MA_s_id (np.ndarray): Structural MA coefficient matrix.
    • Notes:

      • Sets self.best_p and self.best_q based on the maximum Kronecker index.
      • Constructs sparse AR and MA matrices based on Kronecker indices.
      • If use_var is True, estimates the lag order using a VAR model with a maximum lag based on the square root of sample size or self.max_p.
  • _ini_s1(ord: Optional[int] = None, output: bool = True, p: Optional[int] = None, q: Optional[int] = None)

    • Purpose: Initializes parameter estimates for a VARMA model using either structural identification or a non-structural approach via OLS on lagged variables and residuals.

    • Parameters:

      • ord (int, optional): Specific lag order for structural identification (default: None).
      • output (bool, optional): If True, prints estimation details (default: True).
      • p (int, optional): AR lag order for non-structural estimation (default: None).
      • q (int, optional): MA lag order for non-structural estimation (default: None).
    • Returns:

      • tuple: (estimates, standard_errors)

        • estimates (list or np.ndarray): Estimated parameters.
        • standard_errors (list or np.ndarray): Standard errors of the estimates.
    • Notes:

      • Uses struct_id results if self.structural_id is True.
      • For non-structural estimation, constructs regressors from lagged variables (AR) and residuals (MA).
      • Maximum lag order (max_pq) is determined based on sample size or user-specified max_p/max_q.
  • _prepare_for_est(estimates, stand_err, output: bool = True)

    • Purpose: Prepares parameter estimates and standard errors for optimization by flattening, filtering, and computing confidence bounds.

    • Parameters:

      • estimates (list or np.ndarray): Parameter estimates.
      • stand_err (list or np.ndarray): Standard errors of the estimates.
      • output (bool, optional): Prints a table of initial estimates, bounds, and standard errors (default: True).
    • Returns:

      • tuple: (par, separ, lowerBounds, upperBounds)

        • par (np.ndarray): Flattened parameter estimates.
        • separ (np.ndarray): Flattened standard errors.
        • lowerBounds (np.ndarray): Lower confidence bounds (par - 2 * separ).
        • upperBounds (np.ndarray): Upper confidence bounds (par + 2 * separ).
    • Notes:

      • Raises ValueError if estimates or standard errors are None or empty.
      • Concatenates lists for structural identification.
      • Prints formatted table if output is True.
  • prepare_A_B_Matrices(par, p: Optional[int] = None, q: Optional[int] = None)

    • Purpose: Constructs AR and MA coefficient matrices and constants from parameter estimates, incorporating structural identification if applicable.

    • Parameters:

      • par (np.ndarray): Parameter estimates.
      • p (int, optional): AR lag order (default: None).
      • q (int, optional): MA lag order (default: None).
    • Returns:

      • tuple: (ARs, MAs, Cst, indexes)

        • ARs (np.ndarray): AR coefficient matrices (p, K, K).
        • MAs (np.ndarray): MA coefficient matrices (q, K, K).
        • Cst (np.ndarray): Constant terms (1, K).
        • indexes (list): Tuples mapping parameter indices to matrix positions.
    • Notes:

      • Uses self.AR_s and self.MA_s for structural identification.
      • Non-structural estimation distributes parameters based on p and q.
      • Applies inverse of Ph0 for structural models.
  • LL_func(par, p: Optional[int] = None, q: Optional[int] = None, verbose: bool = False)

    • Purpose: Computes the negative log-likelihood for a VARMA model given parameters and lag orders.

    • Parameters:

      • par (np.ndarray): Parameter estimates.
      • p (int, optional): AR lag order (default: None).
      • q (int, optional): MA lag order (default: None).
      • verbose (bool, optional): Warns if Sigma is singular (default: False).
    • Returns:

      • float: Negative log-likelihood value.
    • Notes:

      • Constructs AR and MA matrices using prepare_A_B_Matrices.
      • Computes residuals and covariance (Sigma).
      • Returns large penalty (1e7) if Sigma is singular and verbose is True.
  • _build_companion(matrix, K: int, order: int)

    • Purpose: Constructs the companion matrix for AR or MA components to assess stability or invertibility.

    • Parameters:

      • matrix (np.ndarray): AR or MA matrices (order, K, K).
      • K (int): Number of variables.
      • order (int): Lag order.
    • Returns:

      • np.ndarray: Companion matrix (K*order, K*order).
    • Notes:

      • Raises ValueError if input is invalid or resulting matrix not square.
      • Returns zero matrix if order=0.
      • Used for eigenvalue analysis.
  • numerical_hessian(par, p: int, q: int, epsilon: float = 1e-5)

    • Purpose: Computes the numerical Hessian of the log-likelihood function.

    • Parameters:

      • par (np.ndarray): Parameter estimates.
      • p (int): AR lag order.
      • q (int): MA lag order.
      • epsilon (float, optional): Step size for central difference (default: 1e-5).
    • Returns:

      • np.ndarray: Hessian matrix (n, n).
    • Notes:

      • Symmetrizes Hessian for stability.
      • Calls LL_func to evaluate log-likelihood.
  • compute_diags(gs: bool = False, model: Optional[dict] = None)

    • Purpose: Computes diagnostic statistics (SE, t-values, p-values, significance) for fitted VARMA parameters.

    • Parameters:

      • gs (bool, optional): Use grid search mode (default: False).
      • model (dict, optional): Model dictionary for grid search (default: None).
    • Returns:

      • dict: se, tvals, pvals, signif.
    • Notes:

      • Uses numerical_hessian.
      • Raises ValueError if gs=True but model is missing.
  • get_resids(A, B, Cst)

    • Purpose: Computes residuals for a VARMA model given AR, MA, and constant terms.

    • Parameters:

      • A (np.ndarray): AR coefficient matrices (p, K, K).
      • B (np.ndarray): MA coefficient matrices (q, K, K).
      • Cst (np.ndarray): Constant terms (K,).
    • Returns:

      • np.ndarray: Residuals (K, T).
    • Notes:

      • Computes residuals as Y[:, t] - (Cst + AR*Y + MA*Eps).
      • Stores residuals in self.best_model['residuals'].
  • get_crit(value, num_params: int, crit_name: str)

    • Purpose: Computes information criteria (AIC, BIC, HQIC) for model evaluation.

    • Parameters:

      • value (np.ndarray): Residuals from the model.
      • num_params (int): Number of parameters.
      • crit_name (str): Criterion to compute ("aic", "bic", "hqic").
    • Returns:

      • float: Value of specified information criterion.
    • Notes:

      • Uses log-determinant of residual covariance.
      • Applies penalties based on parameter count and sample size.
  • display_results(results)

    • Purpose: Displays a formatted summary of VARMA model results including parameter estimates and diagnostics.

    • Parameters:

      • results (dict): Model results containing parameters, residuals, and diagnostics.
    • Returns:

      • pd.DataFrame: Parameter estimates and diagnostics.
    • Notes:

      • Prints metadata: log-likelihood, AIC, BIC, HQIC, model type.
      • Displays contemporaneous impact matrix (Ph0) for structural models.
      • Organizes parameters by type: constant, AR, MA, or Ph0 and lag.
  • plot_fitted()

    • Purpose: Plots fitted values versus actual data for each VARMA variable.

    • Parameters: None

    • Returns: None

    • Notes:

      • Requires a fitted model (self.best_model).
      • Plots actual vs fitted values in separate subplots per variable.
  • fit(p: Optional[int] = None, q: Optional[int] = None, plot: bool = True, verbose: bool = True, enforce_stab_inver: bool = False)

    • Purpose: Fits a VARMA model to the data using CMLE. Initial parameters are obtained with the Hannan-Rissanen method, then refined via likelihood optimization. Supports both specified AR/MA orders or grid search over possible orders.

    • Parameters:

      • p (int, optional): AR lag order (default: None).

      • q (int, optional): MA lag order (default: None).

      • plot (bool, optional): If True, plots fitted values (default: True).

      • verbose (bool, optional): Prints fitting progress and results (default: True).

      • enforce_stab_inver (bool, optional): Enforces stability/invertibility constraints (default: False).

    • Returns:

      • dict: Best model results, including parameters, AR/MA matrices, residuals, and diagnostics.
    • Notes: *

      • If p or q is None and structural_id=False, a grid search over max_p and max_q is performed.

      • Uses trust-constr optimization with optional bounds and constraints.

      • Updates self.best_model, self.best_p, and self.best_q with the best-fitting model.

      • Conditional Maximum Likelihood Estimation with Hannan-Rissanen initialization ensures consistent and efficient estimation, providing reliable starting values even for complex VARMA structures.

  • run_full_diagnosis(num_lags: int = 8, plot: bool = False, threshold: Optional[float] = None)

    • Purpose: Performs comprehensive diagnostic checks including stability, invertibility, autocorrelation, heteroscedasticity, normality, and structural breaks.

    • Parameters:

      • num_lags (int, optional): Lags for autocorrelation tests (default: 8).
      • plot (bool, optional): Displays diagnostic plots (default: False).
      • threshold (float, optional): Threshold for final fit score (default: self.Threshold or 0.8).
    • Returns:

      • dict: Diagnostic results, including scores and pass/fail for each test.
    • Notes:

      • Checks stability (AR eigenvalues), invertibility (MA eigenvalues), autocorrelation (Ljung-Box/Durbin-Watson), heteroscedasticity (ARCH), normality (Jarque-Bera/Shapiro), structural breaks (CUSUM).
      • Computes weighted final score.
      • Plots CUSUM, residual histograms, Q-Q plots if plot=True.
  • predict(n_periods: int = 10, plot: bool = True)

    • Purpose: Generates forecasts for the VARMA model over specified periods, including confidence intervals.

    • Parameters:

      • n_periods (int, optional): Number of periods to forecast (default: 10).
      • plot (bool, optional): Plots forecasts with confidence intervals (default: True).
    • Returns:

      • dict:

        • point (pd.DataFrame): Point forecasts.
        • ci_lower (pd.DataFrame): Lower confidence bounds.
        • ci_upper (pd.DataFrame): Upper confidence bounds.
    • Notes:

      • Uses AR, MA, and constants from fitted model.
      • Computes forecast variances using psi weights.
      • Supports datetime or integer indices.
  • simulate(n_periods: int = 100, plot: bool = True)

    • Purpose: Simulates future VARMA paths using bootstrap residuals or multivariate normal shocks.

    • Parameters:

      • n_periods (int, optional): Number of periods (default: 100).
      • plot (bool, optional): Plots simulated series (default: True).
    • Returns:

      • dict:

        • simulations (np.ndarray): Simulated series (n_periods, K).
    • Notes:

      • Requires a fitted model.
      • Uses bootstrap residuals if available, else generates multivariate normal shocks.
      • Plots each variable if plot=True.
  • impulse_res(h: int = 10, bootstrap: bool = True, n_boot: int = 1000, plot: bool = True)

    • Purpose: Computes impulse response functions (IRFs) describing variable responses to shocks.

    • Parameters:

      • h (int, optional): Horizon for IRFs (default: 10).
      • bootstrap (bool, optional): Bootstrap confidence intervals (default: True).
      • n_boot (int, optional): Bootstrap iterations (default: 1000).
      • plot (bool, optional): Plots IRFs (default: True).
    • Returns:

      • dict:

        • irf (np.ndarray): IRFs (h+1, K, K).
        • ci_lower (np.ndarray, optional): Lower bootstrap bounds.
        • ci_upper (np.ndarray, optional): Upper bootstrap bounds.
    • Notes:

      • Uses Cholesky decomposition or eigendecomposition.
      • Plots IRFs for each variable-shock pair if plot=True.
  • FEVD(h: int = 10, plot: bool = True)

    • Purpose: Computes forecast error variance decomposition (FEVD) showing shock contributions to variance of each variable.

    • Parameters:

      • h (int, optional): Horizon for FEVD (default: 10).
      • plot (bool, optional): Plots stacked bar charts (default: True).
    • Returns:

      • np.ndarray: FEVD (h, K, K); element [i,j,k] = contribution of shock k to variance of variable j at horizon i.
    • Notes:

      • Normalizes variance contributions to sum to 1.
      • Plots stacked bar charts if plot=True.

Example Usage

1. Generate Sample Data

Let's create a synthetic 2-variable VARMA(1,1) dataset to work with.

python
import numpy as np
import pandas as pd
from varma_class import VARMA # Assuming the class is saved in varma_class.py

# --- Data Generation ---
np.random.seed(123)
n_obs = 300
# True VARMA(1,1) parameters
A1 = np.array([[0.6, 0.3], [-0.2, 0.5]]) # AR(1) matrix
B1 = np.array([[0.4, 0.1], [0.1, 0.3]]) # MA(1) matrix

# Generate errors and data
errors = np.random.normal(0, 1, (n_obs, 2))
data = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    ar_part = A1 @ data[t-1, :]
    ma_part = B1 @ errors[t-1, :]
    data[t, :] = ar_part + errors[t, :] + ma_part

df = pd.DataFrame(data, columns=['Real_Interest_Rate', 'Inflation_Gap'])

print("Sample Data Head:")
print(df.head())
Sample Data Head:
   Real_Interest_Rate  Inflation_Gap
0            0.000000       0.000000
1           -0.234153       1.579213
2            0.045259       1.417572
3            0.697883       0.512393
4            0.635889       0.334232

2. Standard VARMA with Automatic Order Selection

This is the go-to approach for forecasting. The model will search for the best (p,q) combination based on AIC.

python
print("\n--- Fitting a Standard VARMA Model ---")

# Initialize with structural_id=False for a standard model
model = VARMA(df, max_p=3, max_q=3, criterion='AIC', structural_id=False)

# This will run the grid search, fit the best model, and print diagnostics
best_model_results = model.fit()

# Generate and plot forecasts
forecasts = model.predict(n_periods=12)

3. Structural VARMA using Echelon Form

Use this mode when you need to identify structural shocks for policy analysis or economic interpretation. The fit method will automatically handle the Kronecker index identification internally.

python
print("\n--- Fitting a Structural (Echelon) VARMA Model ---")

# Set structural_id=True to enable the echelon form estimation
model_s = VARMA(df, max_p=4, max_q=4, structural_id=True)

# Fit the model. It will first find Kronecker indices and then estimate
# the restricted (echelon) VARMA model.
res_s = model_s.fit()

# Now, IRFs and FEVD are based on the identified structural shocks
irf_s = model_s.impulse_res(h=24)
fevd_s = model_s.FEVD(h=24)

# You can also access the identified indices
print(f"\nIdentified Kronecker Indices: {model_s.Kronind}")