econometron.Models.VectorAutoReg.VARMA
classin theeconometron.Models.VectorAutoRegmodule
Overview
A Vector Autoregressive Moving Average (VARMA) model captures dynamic relationships among multiple time series variables, combining both autoregressive (AR) and moving average (MA) components. The VARMA class in econometron.Models.timeseries allows specification, estimation, forecasting, and structural analysis of VARMA models for standard and advanced economic applications.
VARMA Model Specification
For endogenous variables , a VARMA(p, q) system is:
where:
- are AR coefficient matrices capturing past value effects.
- are MA coefficient matrices capturing the propagation of past shocks.
- is a vector of innovations with covariance .
Example (VARMA(1,1))
- Units: original data scale
- Use case: multivariate forecasting and structural economic analysis
This implementation supports both standard and advanced use cases, with two primary modes:
Non-Structural Estimation: Fits a standard VARMA model, primarily for forecasting.
Structural Identification: Uses the Echelon (canonical) form with Kronecker indices to impose zero restrictions on AR and MA coefficient matrices, resulting in a more parsimonious and statistically identified model. This is essential for economic or policy analysis, e.g., understanding structural shocks.
Estimation Method:
- Initial parameter estimates are computed using the Hannan-Rissanen method, which fits a high-order VAR to approximate residuals and generate starting values.
- Full model parameters are then refined using Conditional Maximum Likelihood Estimation (CMLE), optionally enforcing stability and invertibility constraints.
Inheritance from VAR
The VARMA class extends VAR, adding MA dynamics and structural analysis:
- Gains all VAR functionalities: data validation, stationarity tests, AR lag selection, fitting, forecasting, IRFs, and automated workflows (
SbSandEL_RAPIDO). - Adds MA coefficient estimation, lag order selection for MA terms, and echelon-based structural identification.
- Setting reduces
VARMAto a standardVARmodel, ensuring full compatibility with existing VAR workflows.
A Practical Guide to Echelon Form (Kronecker Indices)
When you have multiple time series, estimating all possible parameter interactions can be overkill. You might be estimating a bunch of coefficients that are actually zero, making your model unnecessarily complex.
The Echelon form, identified using Kronecker indices, solves this problem. It imposes zero-restrictions on the model's parameter matrices, resulting in a more parsimonious and statistically identified model.
Think of it like this: instead of assuming every variable’s past equally affects every other variable up to lag p, the algorithm determines a specific lag order for each variable in the system. This creates a custom-fit structure that only estimates the parameters that truly matter, which is essential for meaningful structural analysis.
In Practice
- Use Non-Structural Estimation for general multivariate forecasting.
- Use Structural Identification with echelon form to uncover causal relationships and policy-relevant shocks.
VARMAmodels richer dynamics than VAR, while remaining backward-compatible.
class VARMA capabilities
- Fit standard and structural VARMA models
- Automatic selection of AR and MA orders ()
- Compute forecasts, simulations, and impulse response functions
- Comprehensive diagnostics and model validation
- Structural analysis via echelon form and Kronecker indices
- Fully inherits VAR functionalities
Initialization
We start by creating a VARMA object with your time series data and configuration settings.
from econometron.Models.VectorAutoReg.VARMA import VARMA
varma_model=VARMA(data:pd.DataFrame , max_p:int=5, max_q:int=5, columns:list=None, forecast_h:int=6, plot:bool=True, check_stationarity:bool=True, bootstrap_n:int=1000, criterion:str='AIC', structural_id:bool=False, ci_alpha:int=0.05, Key:str=None, Threshold:int=0.8, orth:bool=False, enforce_stab_inver:bool=False):Parameters
| Attribute | Type | Description |
|---|---|---|
data | pd.DataFrame | Your input time series data, with each column representing a variable. |
max_p | int | The maximum AR order (lags of the variables) to check during the automatic grid search. Default is 5. |
max_q | int | The maximum MA order (lags of the errors) to check during the automatic grid search. Default is 5. |
columns | list, optional | A list of column names to use. If None, all numeric columns are included. |
forecast_h | int | The default number of steps to forecast ahead. Default is 6. |
plot | bool | If True, automatically generates plots for diagnostics, forecasts, etc. Default is True. |
check_stationarity | bool | If True, runs ADF and KPSS tests on your data when you initialize the class. Default is True. |
bootstrap_n | int | The number of bootstrap repetitions for generating confidence intervals for IRFs. Default is 1000. |
criterion | str | The information criterion ('AIC', 'BIC', 'HQIC') for selecting the best (p,q) orders. Default is 'AIC'. |
structural_id | bool | If True, the model will be estimated in its structural (echelon) form using Kronecker indices. Default is False. |
ci_alpha | float | The significance level for confidence intervals (e.g., 0.05 for a 95% CI). Default is 0.05. |
Key | str, optional | The operational mode.Not Yet Implemented |
Threshold | float | A score between 0 and 1. The model is considered "well-fitted" only if its final diagnostic score exceeds this value. Default is 0.8. |
orth | bool | If True, computes Orthogonalized Impulse Responses using a Cholesky decomposition for non-structural models. Default is False. |
enforce_stab_inver | bool | If True, forces the estimation process to find parameters that result in a stable and invertible model. Can be slow. Default is False. |
Class Attributes
These attributes get filled in after you've fitted the model. They hold all the important results.
| Attribute | Type | Description |
|---|---|---|
best_model | dict | A dictionary holding all results for the best-fitting model, including parameters, residuals, and criteria. |
best_p | int | The selected optimal AR order p. |
best_q | int | The selected optimal MA order q. |
Kronind | np.ndarray | An array of the estimated Kronecker indices if you ran a structural identification. |
AR_s, MA_s | np.ndarray | The structural restriction "masks" (matrices of 0s and 1s) for the AR and MA parameters in echelon form. |
Ph0 | np.ndarray | The estimated contemporaneous impact matrix from a structural model, showing how shocks instantly affect variables. |
fitted | bool | Becomes True if the final model passes the diagnostic checks based on the Threshold. |
Methods
kron_index(lag: int)Purpose: Computes the Kronecker indices for a multivariate time series to determine the structural order of a VARMA model. Uses Canonical Correlation Analysis (CCA) to identify the lag structure by testing the significance of correlations between past and future values.
Parameters:
lag(int): Maximum lag to consider for the Kronecker index computation.
Returns:
dict: Dictionary containing:index(np.ndarray): Kronecker indices for each variable.tests(list): Test statistics for each component, including test value, degrees of freedom, p-value, and dsq (if applicable).
Notes:
- Assumes
self.datais a pandas DataFrame with shape(T, K). - Uses CCA to assess the relationship between past and future values.
- Applies a chi-squared test to determine significance, using
self.ci_alphaas threshold. - Outputs progress and results for each component if significant.
- Assumes
struct_id(ord: Optional[int] = None, use_var: bool = False, output: bool = True)Purpose: Identifies structural AR and MA coefficient matrices for a VARMA model based on Kronecker indices. Optionally uses VAR to determine the lag order if
ordis not provided.Parameters:
ord(int, optional): Specific lag order for the model (default: None).use_var(bool, optional): If True, uses VAR to estimate the lag order (default: False).output(bool, optional): If True, prints AR and MA coefficient matrices (default: True).
Returns:
dict: Dictionary containing:AR_s_id(np.ndarray): Structural AR coefficient matrix.MA_s_id(np.ndarray): Structural MA coefficient matrix.
Notes:
- Sets
self.best_pandself.best_qbased on the maximum Kronecker index. - Constructs sparse AR and MA matrices based on Kronecker indices.
- If
use_varis True, estimates the lag order using a VAR model with a maximum lag based on the square root of sample size orself.max_p.
- Sets
_ini_s1(ord: Optional[int] = None, output: bool = True, p: Optional[int] = None, q: Optional[int] = None)Purpose: Initializes parameter estimates for a VARMA model using either structural identification or a non-structural approach via OLS on lagged variables and residuals.
Parameters:
ord(int, optional): Specific lag order for structural identification (default: None).output(bool, optional): If True, prints estimation details (default: True).p(int, optional): AR lag order for non-structural estimation (default: None).q(int, optional): MA lag order for non-structural estimation (default: None).
Returns:
tuple:(estimates, standard_errors)estimates(list ornp.ndarray): Estimated parameters.standard_errors(list ornp.ndarray): Standard errors of the estimates.
Notes:
- Uses
struct_idresults ifself.structural_idis True. - For non-structural estimation, constructs regressors from lagged variables (AR) and residuals (MA).
- Maximum lag order (
max_pq) is determined based on sample size or user-specifiedmax_p/max_q.
- Uses
_prepare_for_est(estimates, stand_err, output: bool = True)Purpose: Prepares parameter estimates and standard errors for optimization by flattening, filtering, and computing confidence bounds.
Parameters:
estimates(list ornp.ndarray): Parameter estimates.stand_err(list ornp.ndarray): Standard errors of the estimates.output(bool, optional): Prints a table of initial estimates, bounds, and standard errors (default: True).
Returns:
tuple:(par, separ, lowerBounds, upperBounds)par(np.ndarray): Flattened parameter estimates.separ(np.ndarray): Flattened standard errors.lowerBounds(np.ndarray): Lower confidence bounds (par - 2 * separ).upperBounds(np.ndarray): Upper confidence bounds (par + 2 * separ).
Notes:
- Raises ValueError if estimates or standard errors are None or empty.
- Concatenates lists for structural identification.
- Prints formatted table if
outputis True.
prepare_A_B_Matrices(par, p: Optional[int] = None, q: Optional[int] = None)Purpose: Constructs AR and MA coefficient matrices and constants from parameter estimates, incorporating structural identification if applicable.
Parameters:
par(np.ndarray): Parameter estimates.p(int, optional): AR lag order (default: None).q(int, optional): MA lag order (default: None).
Returns:
tuple:(ARs, MAs, Cst, indexes)ARs(np.ndarray): AR coefficient matrices(p, K, K).MAs(np.ndarray): MA coefficient matrices(q, K, K).Cst(np.ndarray): Constant terms(1, K).indexes(list): Tuples mapping parameter indices to matrix positions.
Notes:
- Uses
self.AR_sandself.MA_sfor structural identification. - Non-structural estimation distributes parameters based on
pandq. - Applies inverse of
Ph0for structural models.
- Uses
LL_func(par, p: Optional[int] = None, q: Optional[int] = None, verbose: bool = False)Purpose: Computes the negative log-likelihood for a VARMA model given parameters and lag orders.
Parameters:
par(np.ndarray): Parameter estimates.p(int, optional): AR lag order (default: None).q(int, optional): MA lag order (default: None).verbose(bool, optional): Warns if Sigma is singular (default: False).
Returns:
float: Negative log-likelihood value.
Notes:
- Constructs AR and MA matrices using
prepare_A_B_Matrices. - Computes residuals and covariance (
Sigma). - Returns large penalty (1e7) if
Sigmais singular andverboseis True.
- Constructs AR and MA matrices using
_build_companion(matrix, K: int, order: int)Purpose: Constructs the companion matrix for AR or MA components to assess stability or invertibility.
Parameters:
matrix(np.ndarray): AR or MA matrices(order, K, K).K(int): Number of variables.order(int): Lag order.
Returns:
np.ndarray: Companion matrix(K*order, K*order).
Notes:
- Raises ValueError if input is invalid or resulting matrix not square.
- Returns zero matrix if
order=0. - Used for eigenvalue analysis.
numerical_hessian(par, p: int, q: int, epsilon: float = 1e-5)Purpose: Computes the numerical Hessian of the log-likelihood function.
Parameters:
par(np.ndarray): Parameter estimates.p(int): AR lag order.q(int): MA lag order.epsilon(float, optional): Step size for central difference (default: 1e-5).
Returns:
np.ndarray: Hessian matrix(n, n).
Notes:
- Symmetrizes Hessian for stability.
- Calls
LL_functo evaluate log-likelihood.
compute_diags(gs: bool = False, model: Optional[dict] = None)Purpose: Computes diagnostic statistics (SE, t-values, p-values, significance) for fitted VARMA parameters.
Parameters:
gs(bool, optional): Use grid search mode (default: False).model(dict, optional): Model dictionary for grid search (default: None).
Returns:
dict:se,tvals,pvals,signif.
Notes:
- Uses
numerical_hessian. - Raises ValueError if
gs=Truebutmodelis missing.
- Uses
get_resids(A, B, Cst)Purpose: Computes residuals for a VARMA model given AR, MA, and constant terms.
Parameters:
A(np.ndarray): AR coefficient matrices(p, K, K).B(np.ndarray): MA coefficient matrices(q, K, K).Cst(np.ndarray): Constant terms(K,).
Returns:
np.ndarray: Residuals(K, T).
Notes:
- Computes residuals as
Y[:, t] - (Cst + AR*Y + MA*Eps). - Stores residuals in
self.best_model['residuals'].
- Computes residuals as
get_crit(value, num_params: int, crit_name: str)Purpose: Computes information criteria (AIC, BIC, HQIC) for model evaluation.
Parameters:
value(np.ndarray): Residuals from the model.num_params(int): Number of parameters.crit_name(str): Criterion to compute ("aic","bic","hqic").
Returns:
float: Value of specified information criterion.
Notes:
- Uses log-determinant of residual covariance.
- Applies penalties based on parameter count and sample size.
display_results(results)Purpose: Displays a formatted summary of VARMA model results including parameter estimates and diagnostics.
Parameters:
results(dict): Model results containing parameters, residuals, and diagnostics.
Returns:
pd.DataFrame: Parameter estimates and diagnostics.
Notes:
- Prints metadata: log-likelihood, AIC, BIC, HQIC, model type.
- Displays contemporaneous impact matrix (
Ph0) for structural models. - Organizes parameters by type: constant, AR, MA, or Ph0 and lag.
plot_fitted()Purpose: Plots fitted values versus actual data for each VARMA variable.
Parameters: None
Returns: None
Notes:
- Requires a fitted model (
self.best_model). - Plots actual vs fitted values in separate subplots per variable.
- Requires a fitted model (
fit(p: Optional[int] = None, q: Optional[int] = None, plot: bool = True, verbose: bool = True, enforce_stab_inver: bool = False)Purpose: Fits a VARMA model to the data using CMLE. Initial parameters are obtained with the Hannan-Rissanen method, then refined via likelihood optimization. Supports both specified AR/MA orders or grid search over possible orders.
Parameters:
p(int, optional): AR lag order (default: None).q(int, optional): MA lag order (default: None).plot(bool, optional): If True, plots fitted values (default: True).verbose(bool, optional): Prints fitting progress and results (default: True).enforce_stab_inver(bool, optional): Enforces stability/invertibility constraints (default: False).
Returns:
dict: Best model results, including parameters, AR/MA matrices, residuals, and diagnostics.
Notes: *
If
porqis None andstructural_id=False, a grid search overmax_pandmax_qis performed.Uses trust-constr optimization with optional bounds and constraints.
Updates
self.best_model,self.best_p, andself.best_qwith the best-fitting model.Conditional Maximum Likelihood Estimation with
Hannan-Rissaneninitialization ensures consistent and efficient estimation, providing reliable starting values even for complex VARMA structures.
run_full_diagnosis(num_lags: int = 8, plot: bool = False, threshold: Optional[float] = None)Purpose: Performs comprehensive diagnostic checks including stability, invertibility, autocorrelation, heteroscedasticity, normality, and structural breaks.
Parameters:
num_lags(int, optional): Lags for autocorrelation tests (default: 8).plot(bool, optional): Displays diagnostic plots (default: False).threshold(float, optional): Threshold for final fit score (default:self.Thresholdor 0.8).
Returns:
dict: Diagnostic results, including scores and pass/fail for each test.
Notes:
- Checks stability (AR eigenvalues), invertibility (MA eigenvalues), autocorrelation (Ljung-Box/Durbin-Watson), heteroscedasticity (ARCH), normality (Jarque-Bera/Shapiro), structural breaks (CUSUM).
- Computes weighted final score.
- Plots CUSUM, residual histograms, Q-Q plots if
plot=True.
predict(n_periods: int = 10, plot: bool = True)Purpose: Generates forecasts for the VARMA model over specified periods, including confidence intervals.
Parameters:
n_periods(int, optional): Number of periods to forecast (default: 10).plot(bool, optional): Plots forecasts with confidence intervals (default: True).
Returns:
dict:point(pd.DataFrame): Point forecasts.ci_lower(pd.DataFrame): Lower confidence bounds.ci_upper(pd.DataFrame): Upper confidence bounds.
Notes:
- Uses AR, MA, and constants from fitted model.
- Computes forecast variances using psi weights.
- Supports datetime or integer indices.
simulate(n_periods: int = 100, plot: bool = True)Purpose: Simulates future VARMA paths using bootstrap residuals or multivariate normal shocks.
Parameters:
n_periods(int, optional): Number of periods (default: 100).plot(bool, optional): Plots simulated series (default: True).
Returns:
dict:simulations(np.ndarray): Simulated series(n_periods, K).
Notes:
- Requires a fitted model.
- Uses bootstrap residuals if available, else generates multivariate normal shocks.
- Plots each variable if
plot=True.
impulse_res(h: int = 10, bootstrap: bool = True, n_boot: int = 1000, plot: bool = True)Purpose: Computes impulse response functions (IRFs) describing variable responses to shocks.
Parameters:
h(int, optional): Horizon for IRFs (default: 10).bootstrap(bool, optional): Bootstrap confidence intervals (default: True).n_boot(int, optional): Bootstrap iterations (default: 1000).plot(bool, optional): Plots IRFs (default: True).
Returns:
dict:irf(np.ndarray): IRFs(h+1, K, K).ci_lower(np.ndarray, optional): Lower bootstrap bounds.ci_upper(np.ndarray, optional): Upper bootstrap bounds.
Notes:
- Uses Cholesky decomposition or eigendecomposition.
- Plots IRFs for each variable-shock pair if
plot=True.
FEVD(h: int = 10, plot: bool = True)Purpose: Computes forecast error variance decomposition (FEVD) showing shock contributions to variance of each variable.
Parameters:
h(int, optional): Horizon for FEVD (default: 10).plot(bool, optional): Plots stacked bar charts (default: True).
Returns:
np.ndarray: FEVD(h, K, K); element[i,j,k]= contribution of shockkto variance of variablejat horizoni.
Notes:
- Normalizes variance contributions to sum to 1.
- Plots stacked bar charts if
plot=True.
Example Usage
1. Generate Sample Data
Let's create a synthetic 2-variable VARMA(1,1) dataset to work with.
import numpy as np
import pandas as pd
from varma_class import VARMA # Assuming the class is saved in varma_class.py
# --- Data Generation ---
np.random.seed(123)
n_obs = 300
# True VARMA(1,1) parameters
A1 = np.array([[0.6, 0.3], [-0.2, 0.5]]) # AR(1) matrix
B1 = np.array([[0.4, 0.1], [0.1, 0.3]]) # MA(1) matrix
# Generate errors and data
errors = np.random.normal(0, 1, (n_obs, 2))
data = np.zeros((n_obs, 2))
for t in range(1, n_obs):
ar_part = A1 @ data[t-1, :]
ma_part = B1 @ errors[t-1, :]
data[t, :] = ar_part + errors[t, :] + ma_part
df = pd.DataFrame(data, columns=['Real_Interest_Rate', 'Inflation_Gap'])
print("Sample Data Head:")
print(df.head())Sample Data Head:
Real_Interest_Rate Inflation_Gap
0 0.000000 0.000000
1 -0.234153 1.579213
2 0.045259 1.417572
3 0.697883 0.512393
4 0.635889 0.3342322. Standard VARMA with Automatic Order Selection
This is the go-to approach for forecasting. The model will search for the best (p,q) combination based on AIC.
print("\n--- Fitting a Standard VARMA Model ---")
# Initialize with structural_id=False for a standard model
model = VARMA(df, max_p=3, max_q=3, criterion='AIC', structural_id=False)
# This will run the grid search, fit the best model, and print diagnostics
best_model_results = model.fit()
# Generate and plot forecasts
forecasts = model.predict(n_periods=12)3. Structural VARMA using Echelon Form
Use this mode when you need to identify structural shocks for policy analysis or economic interpretation. The fit method will automatically handle the Kronecker index identification internally.
print("\n--- Fitting a Structural (Echelon) VARMA Model ---")
# Set structural_id=True to enable the echelon form estimation
model_s = VARMA(df, max_p=4, max_q=4, structural_id=True)
# Fit the model. It will first find Kronecker indices and then estimate
# the restricted (echelon) VARMA model.
res_s = model_s.fit()
# Now, IRFs and FEVD are based on the identified structural shocks
irf_s = model_s.impulse_res(h=24)
fevd_s = model_s.FEVD(h=24)
# You can also access the identified indices
print(f"\nIdentified Kronecker Indices: {model_s.Kronind}")