`econometron.estimation.regression`

ols_estimator Function

Overview

The ols_estimator function implements Ordinary Least Squares (OLS) regression for estimating parameters of a linear model. It supports multivariate regression, allowing multiple dependent variables to be regressed on a set of independent variables.

The function computes:

- Parameter estimates (`beta`)
- Fitted values (`fitted`)
- Residuals (`resid`)
- Diagnostic statistics (`res`), including R-squared, standard errors, z-values, p-values, and log-likelihood

It automatically handles intercept inclusion, works with both NumPy arrays and pandas DataFrames, and is suitable for econometric and statistical applications.

Linear Regression Model

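The OLS estimator fits the model:

$$Y = X\beta + \varepsilon$$

Where:

- $Y$: Dependent variable matrix (T × K; T = observations, K = dependent variables)
- $X$: Independent variable matrix (T × M; M = regressors, including the intercept if added)
- $\beta$: Regression coefficients (M × K)
- $\varepsilon$: Normally distributed error term with covariance $\Sigma$ (K × K)

OLS minimizes the sum of squared residuals:

$$\hat{\beta} = \arg\min_{\beta} \operatorname{tr}\left[(Y - X\beta)^\top (Y - X\beta)\right]$$

The solution is:

$$\hat{\beta} = (X^\top X)^{-1} X^\top Y$$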
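As a rough illustration of the closed-form solution $(X^\top X)^{-1} X^\top Y$, the sketch below compares it against `np.linalg.lstsq` on synthetic data. It uses plain NumPy only and is not the library's code; the variable names (`beta_normal`, `beta_lstsq`, `max_gap`, `max_orth`) are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, K = 50, 3, 2                      # observations, regressors, dependent variables
X = rng.standard_normal((T, M))
Y = rng.standard_normal((T, K))

# Closed form via the normal equations: solve (X'X) beta = X'Y
beta_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# Least-squares solver, which avoids forming X'X explicitly
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

# At the optimum the residuals are orthogonal to every regressor
resid = Y - X @ beta_lstsq
max_gap = np.max(np.abs(beta_normal - beta_lstsq))
max_orth = np.max(np.abs(X.T @ resid))
print(beta_lstsq.shape, max_gap, max_orth)
```

When `X` has full column rank the two routes should agree to within floating-point error; they differ in how they behave when `X` is close to singular.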

Function Definition

```python
from econometron.regression import ols_estimator

beta, fitted, resid, res = ols_estimator(X, Y, add_intercept=None, tol=1e-6)
```

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | `np.ndarray` or `pd.DataFrame` | Independent variables (T × M) | None |
| `Y` | `np.ndarray` or `pd.DataFrame` | Dependent variables (T × K) | None |
| `add_intercept` | `bool` or `None` | If `True`, adds an intercept column. If `False`, uses `X` as provided. If `None`, adds an intercept only when `X` is not mean-centered. | None |
| `tol` | `float` | Tolerance for the mean-centering check (used only when `add_intercept=None`) | 1e-6 |
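The interaction between `add_intercept` and `tol` can be sketched as follows. This is a hypothetical helper (`maybe_add_intercept` is not part of econometron), and both the exact centering rule (every column mean within `tol` of zero) and the position of the ones column (first) are assumptions made for illustration.

```python
import numpy as np

def maybe_add_intercept(X, add_intercept=None, tol=1e-6):
    """Hypothetical helper mirroring the documented add_intercept behaviour."""
    X = np.asarray(X, dtype=float)
    if add_intercept is None:
        # Assumed rule: X counts as mean-centered if all column means are within tol of 0
        add_intercept = bool(np.any(np.abs(X.mean(axis=0)) > tol))
    if add_intercept:
        # Assumed layout: the column of ones goes first
        X = np.column_stack([np.ones(X.shape[0]), X])
    return X, add_intercept

raw = np.array([[1.0, 2.0], [3.0, 5.0], [5.0, 11.0]])
centered = raw - raw.mean(axis=0)

X_raw, added_raw = maybe_add_intercept(raw)                 # column means are 3 and 6
X_centered, added_centered = maybe_add_intercept(centered)  # column means are 0
print(added_raw, X_raw.shape, added_centered, X_centered.shape)
```

Under this rule, regressors that have already been demeaned get no intercept, while raw regressors do.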

Returns

- `beta` (`np.ndarray`): Estimated coefficients (M × K)
- `fitted` (`np.ndarray`): Fitted values (T × K)
- `resid` (`np.ndarray`): Residuals (T × K)
- `res` (`dict`): Diagnostics
  - `resid`: Residuals
  - `se`: Standard errors of coefficients (M × K)
  - `z_values`: Z-statistics (M × K)
  - `p_values`: P-values (M × K)
  - `R2`: Overall R-squared
  - `R2_per_var`: R-squared per dependent variable (K)
  - `log_likelihood`: Model log-likelihood

Function Details

Purpose: Performs OLS regression and returns the coefficients together with diagnostics for model evaluation. Estimation relies on `np.linalg.lstsq`, with `pinv` used for singular matrices, so the function still returns a solution when the regressors are collinear.
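To illustrate why `lstsq` and `pinv` help with singular matrices, here is a small NumPy sketch with two exactly collinear regressors. It is not the library's code; the data and the `rcond=1e-10` cutoff are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.standard_normal(30)
X = np.column_stack([x1, 2.0 * x1])     # second column is an exact multiple: rank 1
Y = (3.0 * x1 + 0.05 * rng.standard_normal(30)).reshape(-1, 1)

rank = np.linalg.matrix_rank(X)

# X'X is singular here, so inverting it directly is not an option.
# lstsq and pinv both return the minimum-norm least-squares solution.
beta_lstsq, _, lstsq_rank, _ = np.linalg.lstsq(X, Y, rcond=1e-10)
beta_pinv = np.linalg.pinv(X, rcond=1e-10) @ Y

fitted = X @ beta_lstsq
print(rank, beta_lstsq.ravel(), beta_pinv.ravel())
```

The individual coefficients are not identified in this case (only the combination `b1 + 2*b2` is), so the fitted values are meaningful while the split between the two columns is a convention of the minimum-norm solution.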

Key Steps:

1. Input Validation
   - Checks that `X` and `Y` are not empty.
   - Confirms that `X` and `Y` have the same number of observations (T).
   - Converts DataFrames to NumPy arrays.
2. Intercept Handling
   - `add_intercept=None`: Checks the column means of `X` against `tol` to decide whether an intercept is needed.
   - `add_intercept=True`: Adds a column of ones.
   - `add_intercept=False`: Uses `X` as provided.
3. OLS Estimation
   - Computes the coefficient estimates using `np.linalg.lstsq`.
   - Calculates fitted values and residuals.
4. Diagnostics
   - Residual Sum of Squares (RSS) and Total Sum of Squares (TSS)
   - Overall and per-variable R-squared
   - Error variance per dependent variable
   - Standard errors, z-values, and p-values (t-distribution for small T, normal otherwise)
   - Log-likelihood assuming multivariate normal errors (a small term is added for numerical stability)
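The diagnostics step can be sketched in plain NumPy as below. This is an illustration under stated assumptions, not the library's implementation: the degrees-of-freedom correction (T minus the number of columns of `X`), the pooled definition of the overall R-squared, the normal-only p-values, and the log-likelihood evaluated at the maximum-likelihood covariance estimate are all choices made for this example, and no stabilizing term is added.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
T, K = 200, 2
X = np.column_stack([np.ones(T), rng.standard_normal((T, 2))])  # intercept + 2 regressors
true_beta = np.array([[0.5, -1.0], [1.0, 0.0], [0.0, 2.0]])
Y = X @ true_beta + 0.1 * rng.standard_normal((T, K))

beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta

# R-squared per dependent variable and pooled across variables
rss = np.sum(resid**2, axis=0)
tss = np.sum((Y - Y.mean(axis=0))**2, axis=0)
r2_per_var = 1.0 - rss / tss
r2 = 1.0 - rss.sum() / tss.sum()

# Standard errors: sqrt(diag((X'X)^-1) * sigma2_k) for each dependent variable k
dof = T - X.shape[1]
sigma2 = rss / dof
xtx_inv = np.linalg.pinv(X.T @ X)
se = np.sqrt(np.outer(np.diag(xtx_inv), sigma2))
z_values = beta / se

# Two-sided normal p-values: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
p_values = np.vectorize(math.erfc)(np.abs(z_values) / math.sqrt(2.0))

# Gaussian log-likelihood at the ML covariance estimate Sigma = resid'resid / T
sigma_ml = resid.T @ resid / T
_, logdet = np.linalg.slogdet(sigma_ml)
log_likelihood = -0.5 * T * (K * math.log(2.0 * math.pi) + logdet + K)

print(r2, r2_per_var, log_likelihood)
```

With a small sample, the normal approximation in the p-value step would typically be replaced by a t-distribution with `dof` degrees of freedom, which matches the behaviour described in the list above.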

Usage Example

```python
import numpy as np
import pandas as pd
from econometron.regression import ols_estimator

# Synthetic data: T observations, M regressors, K dependent variables
np.random.seed(42)
T, M, K = 100, 2, 3
X = np.random.randn(T, M)
true_beta = np.array([[1, 0.5, -0.2], [0.3, -0.7, 1.0]])
Y = X @ true_beta + np.random.randn(T, K) * 0.1

# Run OLS with an explicit intercept
beta, fitted, resid, res = ols_estimator(X, Y, add_intercept=True)

print("Estimated Coefficients:\n", beta)
print("R-squared (overall):", res['R2'])
print("R-squared (per variable):", res['R2_per_var'])
print("Standard Errors:\n", res['se'])
print("P-values:\n", res['p_values'])
print("Log-Likelihood:", res['log_likelihood'])

# Using pandas DataFrames, with automatic intercept detection
X_df = pd.DataFrame(X, columns=['x1', 'x2'])
Y_df = pd.DataFrame(Y, columns=['y1', 'y2', 'y3'])
beta_df, fitted_df, resid_df, res_df = ols_estimator(X_df, Y_df, add_intercept=None)
print("DataFrame Results - R-squared:", res_df['R2'])
```

Notes

- Intercept Handling: Automatic if `add_intercept=None`, otherwise user-controlled.
- Numerical Stability: Uses `np.linalg.lstsq` and `pinv` for singular matrices.
- Diagnostics: Returns standard errors, z-values, and p-values for hypothesis testing, along with R-squared and log-likelihood.
- Flexibility: Supports NumPy arrays and pandas DataFrames.
- P-values: Uses the t-distribution for small samples and the normal distribution otherwise.
