
econometron.utils.data_preparation.process_timeseries

  • TransformTS class from econometron.utils.data_preparation.process_timeseries

Overview

TransformTS is a comprehensive time series transformation and analysis utility. It provides methods for:

  • Transforming series to achieve stationarity via differencing, log-differencing, Box-Cox, or Hodrick-Prescott filtering.
  • Performing stationarity checks using the Augmented Dickey-Fuller (ADF) test.
  • Applying inverse transformations to recover original scale data.
  • Conducting exploratory analysis, including summary statistics, correlation matrices, and optional ACF/PACF plots.

This class is particularly useful for preprocessing time series before fitting models such as ARIMA, VAR, or state-space models, so that the data is suitable for estimation or filtering.

Parameters

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| data | Union[pd.DataFrame, pd.Series] | Input time series data. | |
| columns | Optional[List[str]] | List of columns to transform. If None, all numeric columns are selected. | None |
| method | str | Transformation method: 'diff', 'boxcox', 'log', 'log-diff', 'hp', 'inverse'. | 'diff' |
| demean | bool | If True, remove mean before transformation. | True |
| analysis | bool | If True, perform time series analysis (ADF test, correlation, summary). | True |
| plot | bool | If True, generate diagnostic plots (time series, ACF, PACF). | False |
| lamb | float | Lambda parameter for Hodrick-Prescott filter. | 1600 |
| log_data | bool | If True, apply log transformation for 'log' or 'log-diff' methods when data is not already in log form. | True |
| max_diff | int | Maximum differencing order before switching to log-diff for non-stationary series. | 2 |

Methods

_validate_inputs()

Validates the input data and parameters, checks for numeric columns, ensures method validity, and warns if NaNs are present.
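
The checks described above might look roughly like the following sketch. It is not the library's implementation: the function name `validate_inputs`, the error messages, and the fallback column name `"value"` are assumptions made for illustration; only the list of method names comes from the Parameters table.

```python
import warnings

import pandas as pd

# Method names taken from the Parameters table.
VALID_METHODS = {"diff", "boxcox", "log", "log-diff", "hp", "inverse"}


def validate_inputs(data, columns=None, method="diff"):
    """Hypothetical helper: return the numeric columns selected for transformation."""
    if isinstance(data, pd.Series):
        data = data.to_frame(name=data.name if data.name is not None else "value")
    if not isinstance(data, pd.DataFrame):
        raise TypeError("data must be a pandas DataFrame or Series")
    if method not in VALID_METHODS:
        raise ValueError(f"Unknown method {method!r}; expected one of {sorted(VALID_METHODS)}")

    # Only numeric columns can be transformed.
    numeric = data.select_dtypes(include="number")
    if columns is None:
        columns = list(numeric.columns)
    bad = [c for c in columns if c not in numeric.columns]
    if bad:
        raise ValueError(f"Columns missing or not numeric: {bad}")
    if not columns:
        raise ValueError("No numeric columns to transform")

    selected = data[columns]
    if selected.isna().any().any():
        warnings.warn("Input contains NaN values", UserWarning)
    return selected


frame = pd.DataFrame({"y": [1.0, 2.0, None], "label": ["a", "b", "c"]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    checked = validate_inputs(frame, method="log")
print(list(checked.columns))
```

The non-numeric `label` column is dropped from the selection, and the missing value in `y` triggers a warning rather than an error.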

_check_stationarity(series, col) -> bool

Runs the ADF test on a single series. Returns True if the p-value is below 0.05, i.e. the unit-root null hypothesis is rejected at the 5% level.

_check_stationarity_all()

Checks stationarity of all selected columns and stores results in self.stationary_status.

_check_if_log(series) -> bool

Heuristically determines if a series is likely in log form (positive values and reasonable range).

_make_stationary(series, col) -> pd.Series

Differences the series repeatedly until it tests as stationary. If it is still non-stationary after max_diff differences, switches to log-differencing instead.
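
The control flow can be sketched as below. This is an assumption about the general shape of the logic, not the class's actual code: `make_stationary` takes a hypothetical `is_stationary` callable so the example runs without an ADF dependency, and `halves_match` is a toy stand-in for the ADF test. The sketch also does not retest the log-differenced result.

```python
import numpy as np
import pandas as pd


def make_stationary(series, is_stationary, max_diff=2):
    """Difference up to max_diff times; fall back to log-diff if that is not enough.

    is_stationary is any callable mapping a Series to bool (hypothetical hook).
    Returns the transformed series and a small dict describing what was done.
    """
    clean = series.dropna()
    current = clean
    for order in range(max_diff + 1):
        if is_stationary(current):
            return current, {"method": "diff", "order": order}
        if order < max_diff:
            current = current.diff().dropna()
    # Still non-stationary after max_diff differences: try log-diff when possible.
    if (clean > 0).all():
        return np.log(clean).diff().dropna(), {"method": "log-diff", "order": 1}
    return current, {"method": "diff", "order": max_diff}


def halves_match(s, tol=1e-8):
    """Toy stand-in for the ADF test: both halves of the series share the same mean."""
    half = len(s) // 2
    return abs(s.iloc[:half].mean() - s.iloc[half:].mean()) < tol


# A quadratic trend becomes constant after two differences.
quadratic = pd.Series(np.arange(1.0, 21.0) ** 2)
flat, info = make_stationary(quadratic, halves_match)
print(info)

# Exponential growth is never flattened by differencing, so log-diff is used.
exponential = pd.Series(np.exp(0.1 * np.arange(20)))
rates, info_exp = make_stationary(exponential, halves_match)
print(info_exp)
```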

transform() -> pd.DataFrame

Applies the specified transformation method to all selected columns:

  • 'diff': Differencing until stationarity.
  • 'boxcox': Box-Cox transformation (requires positive values).
  • 'log': Log transformation.
  • 'log-diff': Log followed by differencing.
  • 'hp': Hodrick-Prescott filter (extracts cyclical component).
  • 'inverse': Reverts transformed series to original scale.

Returns the transformed DataFrame.
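
To make the forward transformations concrete, the sketch below implements simplified versions of four of them with NumPy and pandas. It makes several assumptions: `apply_transform` and `hp_cycle` are hypothetical names, 'diff' is shown as a single first difference rather than differencing until stationarity, and the Hodrick-Prescott filter is solved with a dense matrix for readability. Box-Cox and 'inverse' are left out here.

```python
import numpy as np
import pandas as pd


def hp_cycle(series, lamb=1600.0):
    """Cyclical component of the Hodrick-Prescott filter (dense solve, short series only)."""
    y = series.to_numpy(dtype=float)
    n = len(y)
    # Second-difference operator D with shape (n - 2, n).
    d = np.zeros((n - 2, n))
    for i in range(n - 2):
        d[i, i:i + 3] = [1.0, -2.0, 1.0]
    # The trend minimises sum((y - trend)^2) + lamb * sum((D trend)^2),
    # which gives the linear system (I + lamb * D'D) trend = y.
    trend = np.linalg.solve(np.eye(n) + lamb * d.T @ d, y)
    return pd.Series(y - trend, index=series.index)


def apply_transform(series, method="diff", lamb=1600.0):
    """Simplified single-step versions of the transformations listed above."""
    if method == "diff":
        return series.diff()
    if method == "log":
        return np.log(series)
    if method == "log-diff":
        return np.log(series).diff()
    if method == "hp":
        return hp_cycle(series, lamb)
    raise ValueError(f"Unsupported method in this sketch: {method!r}")


# A series growing 2% per period has a constant log-difference of log(1.02).
level = pd.Series(100.0 * 1.02 ** np.arange(40))
growth = apply_transform(level, "log-diff").dropna()
print(growth.head())

# A purely linear series has no cyclical component under the HP filter.
line = pd.Series(np.arange(50.0))
cycle = apply_transform(line, "hp")
print(cycle.abs().max())
```

For long series a sparse solver, or an existing routine such as `statsmodels.tsa.filters.hp_filter.hpfilter`, would be a more practical choice than the dense solve used here.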

_inverse_transform(series, col) -> pd.Series

Applies inverse transformation depending on method:

  • Cumulative sum for differencing.
  • Exponential for log transforms; inverse power formula for Box-Cox.
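
The inverse operations listed above can be sketched as follows. The helper names are hypothetical, and the sketch assumes the first original value (for differencing) and the Box-Cox lambda were stored at transform time; how TransformTS stores them is not shown in this page.

```python
import numpy as np
import pandas as pd


def invert_diff(diffed, first_value):
    """Undo a first difference: cumulative sum anchored at the first original value."""
    return diffed.fillna(0.0).cumsum() + first_value


def invert_log_diff(log_diffed, first_value):
    """Undo log-differencing: cumulate in log space, then exponentiate."""
    return np.exp(log_diffed.fillna(0.0).cumsum() + np.log(first_value))


def boxcox(x, lmbda):
    """Box-Cox transform for positive x."""
    return np.log(x) if lmbda == 0 else (x ** lmbda - 1.0) / lmbda


def invert_boxcox(y, lmbda):
    """Inverse Box-Cox: exp(y) when lmbda is 0, else (lmbda * y + 1)^(1 / lmbda)."""
    return np.exp(y) if lmbda == 0 else (lmbda * y + 1.0) ** (1.0 / lmbda)


original = pd.Series([1.2, 2.3, 3.5, 4.1, 5.2, 6.8])
first = original.iloc[0]

from_diff = invert_diff(original.diff(), first)
from_log_diff = invert_log_diff(np.log(original).diff(), first)
from_boxcox = invert_boxcox(boxcox(original, 0.5), 0.5)

print(pd.DataFrame({"diff": from_diff, "log-diff": from_log_diff, "boxcox": from_boxcox}))
```

Each column of the printed frame should reproduce the original values up to floating-point error.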

analyze()

Performs exploratory time series analysis, including:

  • Stationarity reporting (ADF test results).
  • Summary statistics.
  • NaN counts.
  • Correlation matrix (if multiple columns).
  • Optional plots (time series, ACF, PACF).
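
The non-graphical parts of this report map onto standard pandas calls. The sketch below is an assumption about what such a report could contain, not the output format of analyze(); `quick_analysis` is a hypothetical name, and the ADF reporting and plots are omitted.

```python
import pandas as pd


def quick_analysis(df):
    """Collect summary statistics, NaN counts, and (for 2+ columns) correlations."""
    report = {
        "summary": df.describe(),
        "nan_counts": df.isna().sum(),
    }
    if df.shape[1] > 1:
        report["correlation"] = df.corr()
    return report


data = pd.DataFrame({
    "y": [1.2, 2.3, 3.5, 4.1, 5.2, 6.8],
    "x": [2.1, 2.9, 3.0, 3.5, 4.0, 4.8],
})
report = quick_analysis(data)
for name, table in report.items():
    print(name)
    print(table)
```

For the optional plots, `plot_acf` and `plot_pacf` from `statsmodels.graphics.tsaplots` are one common choice.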

get_transformed_data() -> pd.DataFrame

Returns the transformed data, dropping NaNs.

trns_info() -> dict

Provides detailed transformation and stationarity info for each column, including:

  • Transformation method.
  • Differencing order.
  • Stationarity status and ADF statistics.
  • Log transformation and Box-Cox lambda info.
  • Original series stationarity.
  • Additional notes depending on method applied.

Example Usage

```python
import pandas as pd
from econometron.utils.data_preparation.process_timeseries import TransformTS

# Sample time series (toy data; too short for a meaningful ADF test)
data = pd.DataFrame({
    'y': [1.2, 2.3, 3.5, 4.1, 5.2, 6.8],
    'x': [2.1, 2.9, 3.0, 3.5, 4.0, 4.8]
})

# Create transformer
ts_transformer = TransformTS(data, method='log-diff', analysis=True, plot=True)

# Access transformed data
transformed = ts_transformer.get_transformed_data()

# View transformation info
info = ts_transformer.trns_info()
print(info)
```

Notes

  • Automatically detects non-stationary series and applies appropriate transformations.
  • Handles non-positive values gracefully for log/Box-Cox methods.
  • Useful as a preprocessing step for econometric and machine learning time series models.
  • Stores all relevant information for reporting, diagnostics, and inverse transformations.