econometron.utils.Sampler

Overview: Markov Chain Monte Carlo (MCMC)

What is MCMC?

Markov Chain Monte Carlo (MCMC) is a class of computational algorithms used to sample from complex probability distributions, particularly when direct sampling is infeasible or analytical solutions are unavailable. In statistical and econometric applications, MCMC is widely used to estimate posterior distributions in Bayesian inference, where the goal is to characterize the distribution of model parameters given observed data. The posterior distribution combines prior beliefs about parameters with the likelihood of the data, as defined by Bayes' theorem:

p(θ | y) = p(y | θ) p(θ) / p(y)

Where:

  • θ: Parameter vector.
  • y: Observed data.
  • p(θ | y): Posterior distribution.
  • p(y | θ): Likelihood function.
  • p(θ): Prior distribution.
  • p(y): Marginal likelihood (normalizing constant, often intractable).

The challenge in Bayesian inference is that the posterior is often high-dimensional, non-standard, or lacks a closed-form expression, making direct computation or sampling difficult. MCMC methods address this by constructing a Markov chain whose stationary distribution is the target posterior distribution. By simulating the chain for many iterations, samples generated from the chain approximate draws from the posterior, enabling estimation of posterior moments (e.g., mean, variance) and credible intervals.
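Because the marginal likelihood p(y) does not depend on θ, MCMC needs the posterior only up to that constant; in practice one works with log p(y | θ) + log p(θ). A toy sketch of this idea (a Gaussian model with made-up data, illustrative only and not part of econometron):

```python
import numpy as np

def log_likelihood(theta, y):
    # Gaussian likelihood with unit variance: sum of log N(y_i | theta, 1)
    return -0.5 * np.sum((y - theta) ** 2) - 0.5 * len(y) * np.log(2 * np.pi)

def log_prior(theta):
    # Standard normal prior: log N(theta | 0, 1)
    return -0.5 * theta ** 2 - 0.5 * np.log(2 * np.pi)

def log_posterior_unnormalized(theta, y):
    # log p(theta | y) = log p(y | theta) + log p(theta) + constant
    return log_likelihood(theta, y) + log_prior(theta)

y = np.array([0.8, 1.2, 1.0])
print(log_posterior_unnormalized(1.0, y))
```

The dropped constant log p(y) is the same for every θ, so it cancels in the acceptance ratios that MCMC algorithms compute.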

How MCMC Works

MCMC algorithms generate a sequence of parameter values θ_1, θ_2, … that form a Markov chain, where each state θ_t depends only on the previous state θ_{t-1}. The chain is designed such that its stationary distribution matches the target posterior p(θ | y). Key properties of MCMC include:

  • Markov Property: The next state depends only on the current state, not the history.
  • Stationarity: After sufficient iterations (convergence), the chain’s samples approximate the target distribution.
  • Ergodicity: The chain explores the entire parameter space, ensuring that long-run averages of samples converge to posterior expectations.

Common MCMC algorithms include Metropolis-Hastings, Gibbs Sampling, and Hamiltonian Monte Carlo. The function provided implements the Random Walk Metropolis (RWM) algorithm, a simple and widely used variant of Metropolis-Hastings.

Key Concepts in MCMC

  1. Proposal Distribution: At each iteration, a new parameter value θ* is proposed from a proposal distribution q(θ* | θ_t) (e.g., a normal distribution centered at the current state θ_t). The proposal is accepted or rejected based on an acceptance criterion.
  2. Acceptance Probability: The Metropolis-Hastings criterion determines whether to accept a proposed state by comparing the posterior probability of the proposed and current states. This allows the chain to occasionally accept worse states, helping it escape local modes.
  3. Burn-in Period: Initial iterations of the chain may not represent the target distribution due to the starting point’s influence. These are discarded as "burn-in" to ensure samples reflect the stationary distribution.
  4. Thinning: To reduce autocorrelation in samples, only every k-th sample is retained (thinning interval k), improving independence of the retained samples.
  5. Convergence Diagnostics: Techniques like trace plots, Gelman-Rubin statistics, or effective sample size are used to assess whether the chain has converged to the target distribution.
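Burn-in and thinning (points 3 and 4) reduce to simple array slicing on the raw chain. A sketch with a stand-in array in place of real MCMC output:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a raw MCMC chain of shape (n_iter, n_params)
raw_chain = rng.normal(size=(10000, 2))

burn_in, thin = 1000, 5
# Drop the first burn_in draws, then keep every thin-th draw
kept = raw_chain[burn_in::thin]

print(kept.shape)  # (1800, 2)
```

The slice keeps (n_iter - burn_in) / thin draws, which is the effective size of the stored sample.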

Applications in Econometrics

MCMC is extensively used in econometrics for:

  • Bayesian Estimation: Estimating parameters of complex models like DSGE models, state-space models, or hierarchical models, where the posterior is intractable.
  • Uncertainty Quantification: Computing credible intervals and posterior distributions for parameters, offering a richer description of uncertainty than point estimates.
  • Model Comparison: Calculating posterior odds or Bayes factors for model selection.
  • Forecasting: Generating predictive distributions for time-series data.

For example, in a state-space model, MCMC can sample from the posterior of parameters such as shock persistence or shock variance, using the likelihood from a Kalman filter (as in econometron.filters.kalman_objective) together with a prior distribution.

Advantages of MCMC

  • Handles high-dimensional, non-standard distributions.
  • Provides full posterior distributions, not just point estimates.
  • Flexible for incorporating prior information.
  • Robust to complex likelihood surfaces with multiple modes.

Limitations of MCMC

  • Computational Cost: Requires many iterations, especially for high-dimensional problems or slow-mixing chains.
  • Convergence Issues: Chains may take long to converge or get stuck in local modes if poorly initialized or tuned.
  • Tuning: Proposal distributions (e.g., step size) must be carefully tuned to achieve optimal acceptance rates (typically 20–50% for RWM).
  • Autocorrelation: Samples may be correlated, reducing effective sample size unless thinning is applied.
  • Diagnostics Required: Users must verify convergence using diagnostic tools, which are not always straightforward.

Function: rwm(objecti_func, prior, x0, lb, ub, n_iter=10000, burn_in=1000, thin=1, sigma=0.1, seed=42, verbose=False)

Purpose

The rwm function implements the Random Walk Metropolis (RWM) algorithm, a specific case of the Metropolis-Hastings algorithm, for sampling from a posterior distribution in Bayesian inference. It is designed to generate samples from the posterior p(θ | y), where the log-likelihood is provided by objecti_func and the log-prior by prior. The algorithm uses a symmetric normal proposal distribution and incorporates adaptive step-size tuning during the burn-in period to improve sampling efficiency.


Parameters

  • objecti_func (Callable): Log-likelihood function, taking a parameter vector and returning a scalar.
  • prior (Callable): Log-prior function, taking a parameter vector and returning a scalar. Should handle bounds internally.
  • x0 (array-like): Initial parameter vector (shape: n_params).
  • lb (array-like): Lower bounds for parameters (shape: n_params). Used for clipping proposals.
  • ub (array-like): Upper bounds for parameters (shape: n_params). Used for clipping proposals.
  • n_iter (int, default 10000): Total number of iterations to run the MCMC chain.
  • burn_in (int, default 1000): Number of initial iterations to discard as burn-in.
  • thin (int, default 1): Thinning interval (store every thin-th sample after burn-in).
  • sigma (float or array-like, default 0.1): Proposal step size(s). Scalar or vector of standard deviations for normal proposals.
  • seed (int, default 42): Random seed for reproducibility.
  • verbose (bool, default False): If True, prints debugging information (acceptance rates, step sizes, etc.).

Returns

  • dict: Dictionary containing:

    • samples (np.ndarray): Posterior samples (shape: (n_iter // thin, n_params)).
    • log_posterior (np.ndarray): Log-posterior values for stored samples (shape: n_iter // thin).
    • acceptance_rate (float): Fraction of accepted proposals.
    • mean_posterior_parameters (np.ndarray): Mean of the posterior samples (shape: n_params).
    • std_posterior_parameters (np.ndarray): Standard deviation of the posterior samples (shape: n_params).
    • message (str): Status message (e.g., success or failure reason).

Explanation

  1. Input Validation:

    • Converts x0, lb, ub, and sigma to NumPy arrays with float type.
    • Ensures lb, ub, and sigma have the same length as x0, returning an error dictionary if not.
    • Sets the random seed using np.random.seed(seed) for reproducibility.
  2. Initialization:

    • Initializes arrays for storing samples (samples) and log-posterior values (log_posterior) with size n_iter // thin.
    • Evaluates the initial log-likelihood (objecti_func(x0)) and log-prior (prior(x0)).
    • If the initial point is invalid, attempts up to 100 random samples within [lb, ub].
  3. Main MCMC Loop:

    • For each iteration:

      • Proposal: Generates a proposed state θ* = θ_t + ε, with ε ~ N(0, diag(sigma²)), clipped to [lb, ub].

      • Evaluation: Computes the proposal's log-posterior as log-likelihood + log-prior.

      • Acceptance: Computes the acceptance probability α = min(1, exp(log p(θ* | y) - log p(θ_t | y))).

      • Accept/Reject: Accepts the proposal with probability α; otherwise the chain stays at θ_t.

      • Storage: After burn-in, stores every thin-th sample.

  4. Adaptive Step Size:

    • Adjusts sigma every 50 iterations during burn-in to target 20–50% acceptance rate.
  5. Output:

    • Computes acceptance rate, posterior mean, and standard deviation.
    • Returns dictionary with results and status message.
  6. Error Handling:

    • Catches numerical exceptions and returns error information.
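The loop above can be condensed into a minimal, self-contained Random Walk Metropolis sketch. This is illustrative only: the actual rwm implementation additionally clips proposals to [lb, ub], validates inputs, adapts sigma during burn-in, and returns the richer result dictionary described above.

```python
import numpy as np

def rwm_sketch(log_post, x0, n_iter=5000, burn_in=500, sigma=0.5, seed=42):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    samples, accepted = [], 0
    for t in range(n_iter):
        # Symmetric normal random-walk proposal around the current state
        prop = x + rng.normal(scale=sigma, size=x.shape)
        lp_prop = log_post(prop)
        # Metropolis acceptance: the symmetric proposal cancels in the ratio
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        if t >= burn_in:
            samples.append(x.copy())
    return np.array(samples), accepted / n_iter

# Toy target: standard bivariate normal (log density up to a constant)
samples, acc = rwm_sketch(lambda z: -0.5 * np.sum(z**2), [0.0, 0.0])
print(samples.shape)  # (4500, 2)
```

Comparing exp(log-posterior differences) on the log scale, as done here, avoids the numerical under/overflow that multiplying raw densities would cause.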

Example

python
import numpy as np
from econometron.utils.sampler import rwm
from scipy.stats import norm

# Define log-likelihood and prior
def log_likelihood(x): return -0.5 * np.sum(x**2)  # Simple Gaussian likelihood
def log_prior(x): return norm.logpdf(x[0], 0, 1) + norm.logpdf(x[1], 0, 1)

# Run RWM
x0 = np.array([0.0, 0.0])
lb = np.array([-5.0, -5.0])
ub = np.array([5.0, 5.0])
result = rwm(log_likelihood, log_prior, x0, lb, ub, n_iter=10000, burn_in=1000, thin=2, sigma=0.5, seed=42, verbose=True)

print("Acceptance Rate:", result['acceptance_rate'])
print("Mean Posterior:", result['mean_posterior_parameters'])

Illustrative Output:

Iteration 0: Accepted, log_post = -1.234
Iteration 1000: Accept rate = 0.320, Updated sigma = [0.48, 0.48]
...
RWM Summary:
Acceptance rate: 0.312
Mean posterior parameters: [0.021, -0.015]
Std posterior parameters: [0.987, 0.992]
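Beyond the mean and standard deviation that rwm reports, the returned samples array can be summarized with equal-tailed credible intervals via percentiles. A sketch, using synthetic draws as a stand-in for result['samples']:

```python
import numpy as np

# Stand-in for result['samples'] from rwm: shape (n_kept, n_params)
rng = np.random.default_rng(42)
samples = rng.normal(loc=[0.0, 1.0], scale=[1.0, 0.5], size=(5000, 2))

# Equal-tailed 95% credible interval per parameter
lower, upper = np.percentile(samples, [2.5, 97.5], axis=0)
for i, (lo, hi) in enumerate(zip(lower, upper)):
    print(f"param {i}: 95% credible interval [{lo:.2f}, {hi:.2f}]")
```

Unlike frequentist confidence intervals, these intervals can be read directly as "the parameter lies in this range with 95% posterior probability, given the model and prior".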

Notes

  • Designed to work with econometron.filters.kalman_objective for state-space models or other likelihood functions.
  • Uses symmetric normal proposal (Random Walk Metropolis).
  • Adaptive tuning during burn-in improves acceptance rate.
  • Verbose output prints acceptance rates and step sizes.
  • Applicable to any Bayesian model with log-likelihood and log-prior.

Limitations

  • Proposal step size (sigma) significantly affects mixing and convergence.
  • RWM may produce correlated samples; thinning mitigates this but discards draws, increasing the computation needed per effective sample.
  • Random walk explores locally; inefficient for multimodal or high-dimensional posteriors.
  • No built-in convergence diagnostics; users must implement trace plots or statistics.
  • Choice of burn_in affects sample quality.
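Since the sampler ships no diagnostics, users can implement a basic check themselves, for example a crude split-chain Gelman-Rubin statistic (R-hat), where values near 1.0 suggest convergence. The split_rhat helper below is a hypothetical user-side sketch, not part of econometron:

```python
import numpy as np

def split_rhat(chain):
    """Crude split-chain Gelman-Rubin statistic for a 1-D chain."""
    n = len(chain) // 2
    halves = np.array([chain[:n], chain[n:2 * n]])   # split one chain in two
    within = halves.var(axis=1, ddof=1).mean()       # W: mean within-half variance
    between = n * halves.mean(axis=1).var(ddof=1)    # B: between-half variance
    var_hat = (n - 1) / n * within + between / n     # pooled variance estimate
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(1)
well_mixed = rng.normal(size=4000)       # stand-in for a converged chain
print(round(split_rhat(well_mixed), 3))  # close to 1.0
```

A chain that is still trending (e.g., drifting away from a poor starting point) splits into halves with very different means, pushing R-hat well above 1; in practice, running several chains from dispersed starting points gives a stronger version of this check.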

Dependencies

  • numpy (array operations, random numbers)
  • scipy.stats.norm (for log-prior examples)
  • econometron.utils.optimizers.evaluate_func (for log-likelihood evaluation)

Use Case in Context

Ideal for Bayesian estimation of econometric models such as DSGE models, using Kalman filter likelihoods and priors. Can be combined with genetic_algorithm_kalman or simulated_annealing_kalman to obtain posterior distributions. Posterior samples can be summarized with econometron.utils.estimation.results.create_results_table for credible intervals, plotting, and forecasting.