econometron.utils.Sampler
Overview: Markov Chain Monte Carlo (MCMC)
What is MCMC?
Markov Chain Monte Carlo (MCMC) is a class of computational algorithms used to sample from complex probability distributions, particularly when direct sampling is infeasible or analytical solutions are unavailable. In statistical and econometric applications, MCMC is widely used to estimate posterior distributions in Bayesian inference, where the goal is to characterize the distribution of model parameters given observed data. The posterior distribution combines prior beliefs about parameters with the likelihood of the data, as defined by Bayes' theorem:
p(θ | y) = p(y | θ) p(θ) / p(y)
Where:
- θ: Parameter vector.
- y: Observed data.
- p(θ | y): Posterior distribution.
- p(y | θ): Likelihood function.
- p(θ): Prior distribution.
- p(y): Marginal likelihood (normalizing constant, often intractable).
The challenge in Bayesian inference is that the posterior is often high-dimensional, non-standard, or lacks a closed-form expression, making direct computation or sampling difficult. MCMC methods address this by constructing a Markov chain whose stationary distribution is the target posterior distribution. By simulating the chain for many iterations, samples generated from the chain approximate draws from the posterior, enabling estimation of posterior moments (e.g., mean, variance) and credible intervals.
How MCMC Works
MCMC algorithms generate a sequence of parameter values θ^(1), θ^(2), ..., θ^(N) that form a Markov chain, where each state θ^(t) depends only on the previous state θ^(t-1). The chain is designed such that its stationary distribution matches the target posterior p(θ | y). Key properties of MCMC include:
- Markov Property: The next state depends only on the current state, not the history.
- Stationarity: After sufficient iterations (convergence), the chain’s samples approximate the target distribution.
- Ergodicity: The chain explores the entire parameter space, ensuring that long-run averages of samples converge to posterior expectations.
Common MCMC algorithms include Metropolis-Hastings, Gibbs Sampling, and Hamiltonian Monte Carlo. The function provided implements the Random Walk Metropolis (RWM) algorithm, a simple and widely used variant of Metropolis-Hastings.
Key Concepts in MCMC
- Proposal Distribution: At each iteration, a candidate θ* is drawn from a proposal distribution q(θ* | θ^(t)) (e.g., a normal distribution centered at the current state). The proposal is accepted or rejected based on an acceptance criterion.
- Acceptance Probability: The Metropolis-Hastings criterion determines whether to accept a proposed state by comparing the posterior probability of the proposed and current states. This allows the chain to occasionally accept worse states, helping it escape local modes.
- Burn-in Period: Initial iterations of the chain may not represent the target distribution due to the starting point’s influence. These are discarded as "burn-in" to ensure samples reflect the stationary distribution.
- Thinning: To reduce autocorrelation in samples, only every k-th sample is retained (thinning interval k), improving independence of the retained samples.
- Convergence Diagnostics: Techniques like trace plots, Gelman-Rubin statistics, or effective sample size are used to assess whether the chain has converged to the target distribution.
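As one illustration, the Gelman-Rubin statistic can be computed from multiple chains with a few lines of NumPy (the `gelman_rubin` helper below is a sketch for illustration, not part of econometron):

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for a set of equal-length 1-D chains.

    Values close to 1.0 suggest the chains have converged to the same
    distribution; values well above ~1.1 indicate a problem.
    """
    chains = np.asarray(chains)              # shape: (m_chains, n_draws)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

# Two chains drawn from the same distribution should give R-hat near 1
rng = np.random.default_rng(0)
chains = rng.normal(size=(2, 5000))
print(gelman_rubin(chains))
```

In practice one would run `rwm` from several dispersed starting points and pass the resulting sample arrays (one column at a time) to such a diagnostic.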
Applications in Econometrics
MCMC is extensively used in econometrics for:
- Bayesian Estimation: Estimating parameters of complex models like DSGE models, state-space models, or hierarchical models, where the posterior is intractable.
- Uncertainty Quantification: Computing credible intervals and posterior distributions for parameters, offering a richer description of uncertainty than point estimates.
- Model Comparison: Calculating posterior odds or Bayes factors for model selection.
- Forecasting: Generating predictive distributions for time-series data.
For example, in a State Space model, MCMC can sample from the posterior of parameters like shock persistence or variance, using the likelihood from a Kalman filter (as in econometron.filters.kalman_objective) and a prior distribution.
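As a sketch of this pattern, the snippet below builds a log-likelihood and log-prior in the callable form such a sampler expects, using an exact AR(1) likelihood as a simple stand-in for a Kalman-filter-based objective (the AR(1) model, parameter names, and flat prior are illustrative assumptions, not econometron code):

```python
import numpy as np

# Stand-in for a Kalman-filter log-likelihood (e.g. kalman_objective):
# exact log-likelihood of y_t = rho * y_{t-1} + e_t, e_t ~ N(0, sigma2),
# conditioning on the first observation.
def ar1_log_likelihood(params, y):
    rho, sigma2 = params
    if not (-1 < rho < 1) or sigma2 <= 0:
        return -np.inf  # reject inadmissible parameters
    resid = y[1:] - rho * y[:-1]
    n = resid.size
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * np.sum(resid**2) / sigma2

def log_prior(params):
    rho, sigma2 = params
    if not (-1 < rho < 1) or not (0 < sigma2 < 10):
        return -np.inf
    return 0.0  # flat prior on the admissible region

# Simulate data with known persistence 0.8 and unit shock variance
rng = np.random.default_rng(42)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.8 * y[t - 1] + rng.normal()

log_post = lambda params: ar1_log_likelihood(params, y) + log_prior(params)
```

These two callables play the roles of `objecti_func` and `prior` in the `rwm` function documented below.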
Advantages of MCMC
- Handles high-dimensional, non-standard distributions.
- Provides full posterior distributions, not just point estimates.
- Flexible for incorporating prior information.
- Robust to complex likelihood surfaces with multiple modes.
Limitations of MCMC
- Computational Cost: Requires many iterations, especially for high-dimensional problems or slow-mixing chains.
- Convergence Issues: Chains may take long to converge or get stuck in local modes if poorly initialized or tuned.
- Tuning: Proposal distributions (e.g., step size) must be carefully tuned to achieve optimal acceptance rates (typically 20–50% for RWM).
- Autocorrelation: Samples may be correlated, reducing effective sample size unless thinning is applied.
- Diagnostics Required: Users must verify convergence using diagnostic tools, which are not always straightforward.
Function: rwm(objecti_func, prior, x0, lb, ub, n_iter=10000, burn_in=1000, thin=1, sigma=0.1, seed=42, verbose=False)
Purpose
The rwm function implements the Random Walk Metropolis (RWM) algorithm, a special case of the Metropolis-Hastings algorithm, for sampling from a posterior distribution in Bayesian inference. It generates samples from p(θ | y), where the log-likelihood is provided by objecti_func and the log-prior by prior. The algorithm uses a symmetric normal proposal distribution and adapts its step size during the burn-in period to improve sampling efficiency.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| `objecti_func` | Callable | Log-likelihood function, taking a parameter vector and returning a scalar. | None |
| `prior` | Callable | Log-prior function, taking a parameter vector and returning a scalar. Should handle bounds internally. | None |
| `x0` | array-like | Initial parameter vector (shape: n_params). | None |
| `lb` | array-like | Lower bounds for parameters (shape: n_params). Used for clipping proposals. | None |
| `ub` | array-like | Upper bounds for parameters (shape: n_params). Used for clipping proposals. | None |
| `n_iter` | int | Total number of iterations to run the MCMC chain. | 10000 |
| `burn_in` | int | Number of initial iterations to discard as burn-in. | 1000 |
| `thin` | int | Thinning interval (store every thin-th sample after burn-in). | 1 |
| `sigma` | float or array-like | Proposal step size(s). Scalar or vector of standard deviations for normal proposals. | 0.1 |
| `seed` | int | Random seed for reproducibility. | 42 |
| `verbose` | bool | If True, prints debugging information (acceptance rates, step sizes, etc.). | False |
Returns
dict: Dictionary containing:
- `samples` (np.ndarray): Posterior samples (shape: (n_iter // thin, n_params)).
- `log_posterior` (np.ndarray): Log-posterior values for stored samples (shape: n_iter // thin).
- `acceptance_rate` (float): Fraction of accepted proposals.
- `mean_posterior_parameters` (np.ndarray): Mean of the posterior samples (shape: n_params).
- `std_posterior_parameters` (np.ndarray): Standard deviation of the posterior samples (shape: n_params).
- `message` (str): Status message (e.g., success or failure reason).
Explanation
Input Validation:
- Converts `x0`, `lb`, `ub`, and `sigma` to NumPy arrays with float type.
- Ensures `lb`, `ub`, and `sigma` have the same length as `x0`, returning an error dictionary if not.
- Sets the random seed using `np.random.seed(seed)` for reproducibility.
Initialization:
- Initializes arrays for storing samples (`samples`) and log-posterior values (`log_posterior`) with size `n_iter // thin`.
- Evaluates the initial log-likelihood (`objecti_func(x0)`) and log-prior (`prior(x0)`).
- If the initial point is invalid, attempts up to 100 random samples within `[lb, ub]`.
Main MCMC Loop: For each iteration:
- Proposal: Generates a proposed state θ* = θ^(t) + ε with ε ~ N(0, diag(sigma²)), clipped to [lb, ub].
- Evaluation: Computes the log-likelihood plus the log-prior to obtain the log-posterior of the proposal.
- Acceptance: Computes the acceptance probability α = min(1, exp(log p(θ* | y) − log p(θ^(t) | y))).
- Accept/Reject: Accepts the proposal with probability α; otherwise retains the current state.
- Storage: After burn-in, stores every `thin`-th sample.
Adaptive Step Size:
- Adjusts `sigma` every 50 iterations during burn-in to target a 20–50% acceptance rate.
Output:
- Computes the acceptance rate, posterior mean, and posterior standard deviation.
- Returns a dictionary with results and a status message.
Error Handling:
- Catches numerical exceptions and returns error information.
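The loop described above can be condensed into a stand-alone sketch (a simplified illustration of the same steps, without the bound clipping, initial-point retries, and error reporting of the actual rwm):

```python
import numpy as np

def rwm_sketch(log_post, x0, n_iter=10000, burn_in=1000, thin=1,
               sigma=0.5, seed=42):
    """Minimal Random Walk Metropolis loop (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    samples, accepted = [], 0
    for i in range(n_iter):
        # Proposal: symmetric random-walk step
        prop = x + sigma * rng.normal(size=x.size)
        lp_prop = log_post(prop)
        # Metropolis acceptance, evaluated in log space
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        # Crude adaptive step size during burn-in
        if i < burn_in and (i + 1) % 50 == 0:
            rate = accepted / (i + 1)
            sigma *= 1.1 if rate > 0.5 else (0.9 if rate < 0.2 else 1.0)
        # Store every thin-th draw after burn-in
        if i >= burn_in and (i - burn_in) % thin == 0:
            samples.append(x.copy())
    return np.array(samples), accepted / n_iter

# Target: standard bivariate normal (log-density up to a constant)
draws, rate = rwm_sketch(lambda x: -0.5 * np.sum(x**2), np.zeros(2))
```

On this toy target the sample mean should be near 0 and the sample standard deviation near 1, with an acceptance rate in a reasonable range.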
Example
```python
import numpy as np
from econometron.utils.sampler import rwm
from scipy.stats import norm

# Define log-likelihood and log-prior
def log_likelihood(x):
    return -0.5 * np.sum(x**2)  # Simple Gaussian likelihood

def log_prior(x):
    return norm.logpdf(x[0], 0, 1) + norm.logpdf(x[1], 0, 1)

# Run RWM
x0 = np.array([0.0, 0.0])
lb = np.array([-5.0, -5.0])
ub = np.array([5.0, 5.0])
result = rwm(log_likelihood, log_prior, x0, lb, ub, n_iter=10000,
             burn_in=1000, thin=2, sigma=0.5, seed=42, verbose=True)
print("Acceptance Rate:", result['acceptance_rate'])
print("Mean Posterior:", result['mean_posterior_parameters'])
```

Illustrative Output:

```
Iteration 0: Accepted, log_post = -1.234
Iteration 1000: Accept rate = 0.320, Updated sigma = [0.48, 0.48]
...
RWM Summary:
Acceptance rate: 0.312
Mean posterior parameters: [0.021, -0.015]
Std posterior parameters: [0.987, 0.992]
```

Notes
- Designed to work with `econometron.filters.kalman_objective` for state-space models or other likelihood functions.
- Uses a symmetric normal proposal (Random Walk Metropolis).
- Adaptive tuning during burn-in improves the acceptance rate.
- Verbose output prints acceptance rates and step sizes.
- Applicable to any Bayesian model with a log-likelihood and log-prior.
Limitations
- Proposal step size (`sigma`) significantly affects mixing and convergence.
- RWM may produce correlated samples; thinning mitigates this but increases computation.
- The random walk explores locally, which is inefficient for multimodal or high-dimensional posteriors.
- No built-in convergence diagnostics; users must implement trace plots or statistics themselves.
- The choice of `burn_in` affects sample quality.
Dependencies
- `numpy` (array operations, random numbers)
- `scipy.stats.norm` (for log-prior examples)
- `econometron.utils.optimizers.evaluate_func` (for log-likelihood evaluation)
Use Case in Context
Ideal for Bayesian estimation of econometric models such as DSGE models, using Kalman filter likelihoods and priors. Can be combined with genetic_algorithm_kalman or simulated_annealing_kalman to obtain posterior distributions. Posterior samples can be summarized with econometron.utils.estimation.results.create_results_table for credible intervals, plotting, and forecasting.
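For example, posterior draws can be summarized into equal-tailed credible intervals with plain NumPy; the `credible_intervals` helper and the synthetic draws below are illustrative (with real output, the samples would come from `result['samples']`):

```python
import numpy as np

def credible_intervals(samples, level=0.95):
    """Equal-tailed credible interval per parameter from posterior draws.

    samples: array of shape (n_draws, n_params).
    Returns an array of shape (n_params, 2) with (lower, upper) bounds.
    """
    alpha = (1.0 - level) / 2.0
    lower = np.percentile(samples, 100 * alpha, axis=0)
    upper = np.percentile(samples, 100 * (1 - alpha), axis=0)
    return np.column_stack([lower, upper])

# Synthetic stand-in for result['samples'] (two parameters)
rng = np.random.default_rng(1)
samples = rng.normal(loc=[0.9, 0.5], scale=[0.05, 0.1], size=(5000, 2))
ci = credible_intervals(samples)
print(ci)
```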
