bashtage/arch

ARCH models in Python

Quick Overview

ARCH (Autoregressive Conditional Heteroskedasticity) is a Python library for econometric modeling, with a focus on financial econometrics. It provides tools for modeling volatility, conducting hypothesis tests, and performing time series analysis, particularly useful for financial data and risk management.

Pros

  • Comprehensive suite of econometric tools, especially for volatility modeling
  • High-performance implementation using Cython and Numba
  • Extensive documentation and examples
  • Integrates well with the scientific Python ecosystem (NumPy, Pandas, etc.)

Cons

  • Steep learning curve for users not familiar with econometrics
  • No GUI or interactive tools; it is a library used from Python scripts or notebooks
  • May be overkill for simple statistical analyses
  • Requires understanding of underlying statistical concepts for proper use

Code Examples

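  1. Fitting a GARCH(1,1) model:
import arch.data.sp500
from arch import arch_model

# Load the S&P 500 sample data bundled with arch and build percent returns
data = arch.data.sp500.load()
returns = 100 * data["Adj Close"].pct_change().dropna()
model = arch_model(returns, vol="Garch", p=1, q=1)
results = model.fit()
print(results.summary())
  2. Forecasting volatility:
forecasts = results.forecast(horizon=5)
# The last row holds the 1- to 5-step-ahead variance forecasts
print(forecasts.variance.iloc[-1])
  3. Conducting a Ljung-Box test for autocorrelation:
from statsmodels.stats.diagnostic import acorr_ljungbox

# The fitted results have no Ljung-Box method, so apply the statsmodels
# test to the squared standardized residuals (remaining ARCH effects)
lb_test = acorr_ljungbox(results.std_resid.dropna() ** 2, lags=[10])
print(lb_test)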
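
To make the `horizon=5` forecast above less of a black box, the following is a minimal pure-Python sketch of how multi-step variance forecasts behave in a GARCH(1,1) model. It is not arch's implementation: the helper name `garch11_variance_forecast` and all parameter values are assumptions chosen for illustration.

```python
def garch11_variance_forecast(omega, alpha, beta, last_resid, last_var, horizon):
    """Return variance forecasts for steps 1..horizon of a GARCH(1,1).

    Hypothetical helper for illustration only; not part of arch.
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    # The one-step forecast uses the last squared residual and last variance.
    forecasts = [omega + alpha * last_resid**2 + beta * last_var]
    # Beyond one step, the expected squared residual equals the forecast variance.
    for _ in range(horizon - 1):
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


# Illustrative parameters (assumed, not estimated from data).
omega, alpha, beta = 0.02, 0.08, 0.90
path = garch11_variance_forecast(
    omega, alpha, beta, last_resid=2.0, last_var=1.5, horizon=5
)
long_run = omega / (1 - alpha - beta)
print(path)
print(long_run)
```

With these values the forecasts start above the long-run variance and decay toward it geometrically at rate `alpha + beta`, which is why longer horizons in a fitted model tend to flatten out.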

Getting Started

To get started with ARCH:

  1. Install the library:

    pip install arch
    
  2. Import the necessary modules:

    import arch
    from arch import arch_model
    
  3. Load or prepare your time series data (e.g., using Pandas).

  4. Create and fit a model:

    model = arch_model(your_data, vol="Garch", p=1, q=1)
    results = model.fit()
    
  5. Analyze the results and make forecasts as needed.

For more detailed instructions and examples, refer to the official documentation at https://arch.readthedocs.io/.
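
Step 3 leaves data preparation open. The README examples further down build returns as 100 times the percentage change of the adjusted close, and the sketch below does the same transformation with the standard library only. The helper name `percent_returns` and the prices are made up for illustration.

```python
def percent_returns(prices):
    """Convert a sequence of prices to simple returns in percent.

    Mirrors `100 * series.pct_change().dropna()` from pandas.
    Hypothetical helper for illustration; not part of arch.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices to compute a return")
    return [100.0 * (curr / prev - 1.0) for prev, curr in zip(prices, prices[1:])]


# Made-up closing prices, for illustration only.
prices = [100.0, 101.0, 99.99, 102.0]
returns = percent_returns(prices)
print(returns)
```

In a pandas workflow the equivalent series would be what gets passed as `your_data` in step 4.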

Competitor Comparisons

Statsmodels: statistical modeling and econometrics in Python

Pros of statsmodels

  • Broader scope, covering a wide range of statistical models and methods
  • More extensive documentation and examples
  • Larger community and more frequent updates

Cons of statsmodels

  • Can be slower for certain operations due to its comprehensive nature
  • May have a steeper learning curve for beginners

Code Comparison

statsmodels:

import statsmodels.api as sm
model = sm.OLS(y, X).fit()
predictions = model.predict(X_new)

arch:

from arch import arch_model
results = arch_model(returns).fit()  # fit() returns a results object
forecast = results.forecast()

Key Differences

  • statsmodels offers a wider range of statistical models, while arch focuses specifically on time series analysis and volatility modeling
  • arch is optimized for financial econometrics, providing specialized tools for ARCH/GARCH models
  • statsmodels has a more extensive API, which can be both an advantage and a drawback depending on the user's needs

Use Cases

  • Choose statsmodels for general statistical analysis and a variety of modeling techniques
  • Opt for arch when working specifically with financial time series data and volatility modeling

Community and Support

statsmodels has a larger user base and more frequent updates, potentially leading to better long-term support and feature development. However, arch's focused nature may result in more specialized support for its specific use cases.

pandas: Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

Pros of pandas

  • Broader scope and functionality for data manipulation and analysis
  • Larger community and more frequent updates
  • Extensive documentation and widespread adoption in data science

Cons of pandas

  • Steeper learning curve for beginners
  • Higher memory usage for large datasets
  • Can be slower for certain operations compared to specialized libraries

Code Comparison

pandas:

import pandas as pd

df = pd.read_csv('data.csv')
result = df.groupby('category').mean()

arch:

import arch

model = arch.arch_model(returns)
results = model.fit()

Summary

pandas is a versatile data manipulation library with broad functionality, while arch focuses specifically on econometric modeling and time series analysis. pandas offers more general-purpose tools for data handling, while arch provides specialized functions for financial and economic data analysis. The choice between the two depends on the specific requirements of your project and the type of data you're working with.

scikit-learn: machine learning in Python

Pros of scikit-learn

  • Broader scope, covering a wide range of machine learning algorithms and techniques
  • Larger community and more extensive documentation
  • More frequent updates and active development

Cons of scikit-learn

  • Less specialized for time series analysis and econometrics
  • May have a steeper learning curve for users focused on financial data analysis

Code Comparison

scikit-learn (Linear Regression):

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X, y)
predictions = model.predict(X_test)

arch (ARCH Model):

from arch import arch_model
model = arch_model(returns)
results = model.fit()
forecast = results.forecast()

Summary

scikit-learn is a comprehensive machine learning library with a broad range of algorithms and techniques. It has a larger community and more extensive documentation, making it suitable for various data science tasks. However, it may not be as specialized for time series analysis and econometrics as arch.

arch focuses specifically on time series analysis, particularly for financial data. It provides specialized tools for modeling volatility and other econometric concepts. While it has a narrower scope, it may be more suitable for users working primarily with financial time series data.

The choice between the two depends on the specific requirements of the project and the user's familiarity with econometrics and time series analysis.

SciPy library main repository

Pros of SciPy

  • Broader scope, covering a wide range of scientific computing tasks
  • Larger community and more extensive documentation
  • More mature project with longer development history

Cons of SciPy

  • Can be slower for specific econometric tasks
  • Less focused on time series analysis and econometrics
  • May require additional dependencies for certain specialized functions

Code Comparison

SciPy example (linear regression):

from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

ARCH example (GARCH model):

from arch import arch_model
model = arch_model(returns, vol='GARCH', p=1, q=1)
results = model.fit()

Summary

SciPy is a comprehensive scientific computing library with a broad range of applications, while ARCH is specialized for econometrics and time series analysis. SciPy offers more general-purpose tools, whereas ARCH provides focused functionality for financial modeling and volatility analysis. The choice between them depends on the specific requirements of your project and the depth of econometric analysis needed.

Prophet: Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Pros of Prophet

  • User-friendly interface for time series forecasting, requiring minimal data preprocessing
  • Handles seasonality and holidays automatically, making it easier for non-experts
  • Robust to missing data and outliers, reducing manual data cleaning efforts

Cons of Prophet

  • Less flexible for custom model specifications compared to ARCH
  • May not perform as well for financial time series with complex volatility patterns
  • Built around a trend-plus-seasonality decomposition, which may not capture all types of time series behavior

Code Comparison

Prophet:

from prophet import Prophet  # the package was named fbprophet before version 1.0
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)

ARCH:

from arch import arch_model
model = arch_model(returns, vol='GARCH', p=1, q=1)
results = model.fit()
forecast = results.forecast(horizon=5)

Prophet focuses on simplicity and automatic forecasting, while ARCH provides more control over model specification, particularly for volatility modeling in financial time series. Prophet is better suited for general forecasting tasks, while ARCH excels in analyzing and forecasting financial market volatility.

Gensim: Topic Modelling for Humans

Pros of Gensim

  • Broader scope for natural language processing and topic modeling
  • Larger community and more extensive documentation
  • Better support for large-scale text processing and memory-efficient algorithms

Cons of Gensim

  • Steeper learning curve due to its wide range of functionalities
  • Less focused on time series analysis and econometrics
  • May be overkill for projects primarily dealing with financial data

Code Comparison

Gensim (topic modeling):

from gensim import corpora, models
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda_model = models.LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=10)

ARCH (time series modeling):

from arch import arch_model
model = arch_model(returns, vol='Garch', p=1, q=1)
results = model.fit()

Both libraries offer powerful tools for their respective domains. Gensim excels in text processing and topic modeling, while ARCH specializes in time series analysis and econometrics. The choice between them depends on the specific requirements of your project.

README

arch

Autoregressive Conditional Heteroskedasticity (ARCH) and other tools for financial econometrics, written in Python (with Cython and/or Numba used to improve performance)

Python 3

arch is Python 3 only. Version 4.8 is the final version that supported Python 2.7.

Documentation

Documentation from the main branch is hosted on my GitHub Pages.

Released documentation is hosted on Read the Docs.

More about ARCH

More information about ARCH and related models is available in the notes and research available at Kevin Sheppard's site.

Contributing

Contributions are welcome. There are opportunities at many levels to contribute:

  • Implement a new volatility process, e.g., FIGARCH
  • Improve docstrings where unclear or with typos
  • Provide examples, preferably in the form of IPython notebooks

Examples

Volatility Modeling

  • Mean models
    • Constant mean
    • Heterogeneous Autoregression (HAR)
    • Autoregression (AR)
    • Zero mean
    • Models with and without exogenous regressors
  • Volatility models
    • ARCH
    • GARCH
    • TARCH
    • EGARCH
    • EWMA/RiskMetrics
  • Distributions
    • Normal
    • Student's T
    • Generalized Error Distribution

See the univariate volatility example notebook for a more complete overview.

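import datetime as dt
import pandas_datareader.data as web

# Download FTSE 100 prices from Yahoo! Finance (requires network access)
st = dt.datetime(1990, 1, 1)
en = dt.datetime(2014, 1, 1)
data = web.get_data_yahoo('^FTSE', start=st, end=en)
# Percent returns from the adjusted close
returns = 100 * data['Adj Close'].pct_change().dropna()

from arch import arch_model
# Defaults: constant mean, GARCH(1,1) volatility, normal errors
am = arch_model(returns)
res = am.fit()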
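
As a rough illustration of the kind of process a constant-mean GARCH(1,1) specification describes, the sketch below simulates returns and conditional variances with the standard library only. It is not how arch simulates or estimates models; `simulate_garch11` and the parameter values are assumptions for illustration.

```python
import random


def simulate_garch11(n, mu, omega, alpha, beta, seed=0):
    """Simulate returns and conditional variances from a GARCH(1,1).

    r_t = mu + e_t,  e_t = sqrt(sigma2_t) * z_t,  z_t ~ N(0, 1)
    sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1}
    Hypothetical helper for illustration; not part of arch.
    """
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the long-run variance
    resid = 0.0
    returns, variances = [], []
    for t in range(n):
        if t > 0:
            # Update the variance from the previous shock and variance.
            sigma2 = omega + alpha * resid**2 + beta * sigma2
        resid = (sigma2**0.5) * rng.gauss(0.0, 1.0)
        returns.append(mu + resid)
        variances.append(sigma2)
    return returns, variances


# Illustrative parameters (assumed, not estimated from data).
returns, variances = simulate_garch11(
    n=1000, mu=0.05, omega=0.02, alpha=0.08, beta=0.90
)
print(min(variances), max(variances))
```

Large shocks raise the next period's variance through `alpha`, and `beta` makes that increase persist, which produces the volatility clustering these models are meant to capture.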

Unit Root Tests

  • Augmented Dickey-Fuller
  • Dickey-Fuller GLS
  • Phillips-Perron
  • KPSS
  • Zivot-Andrews
  • Variance Ratio tests

See the unit root testing example notebook for examples of testing series for unit roots.
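
To give a feel for what the variance ratio entry in the list above measures, here is a simplified standard-library sketch. It computes only the raw ratio, without the bias corrections or test statistic a full implementation would include, and it does not use arch. The function name and the simulated data are assumptions for illustration.

```python
import random


def variance_ratio(levels, q):
    """Variance of q-period changes divided by q times the 1-period variance.

    Values near 1 are consistent with a random walk; values well below 1
    suggest mean reversion. Hypothetical helper, not part of arch.
    """
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    mu = sum(diffs) / len(diffs)
    var_1 = sum((d - mu) ** 2 for d in diffs) / len(diffs)
    q_diffs = [levels[i + q] - levels[i] for i in range(len(levels) - q)]
    var_q = sum((d - q * mu) ** 2 for d in q_diffs) / len(q_diffs)
    return var_q / (q * var_1)


rng = random.Random(42)
shocks = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Random walk: cumulative sum of the shocks, so the levels have a unit root.
walk = [0.0]
for s in shocks:
    walk.append(walk[-1] + s)

vr_walk = variance_ratio(walk, q=2)     # random walk in levels
vr_noise = variance_ratio(shocks, q=2)  # white noise in levels
print(vr_walk, vr_noise)
```

For the random walk the ratio should land near 1, while for white-noise levels the theoretical value at `q=2` is 0.5, because two-period changes have the same variance as one-period changes.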

Cointegration Testing and Analysis

  • Tests
    • Engle-Granger Test
    • Phillips-Ouliaris Test
  • Cointegration Vector Estimation
    • Canonical Cointegrating Regression
    • Dynamic OLS
    • Fully Modified OLS

See the cointegration testing example notebook for examples of testing series for cointegration.

Bootstrap

  • Bootstraps
    • IID Bootstrap
    • Stationary Bootstrap
    • Circular Block Bootstrap
    • Moving Block Bootstrap
  • Methods
    • Confidence interval construction
    • Covariance estimation
    • Apply method to estimate model across bootstraps
    • Generic Bootstrap iterator

See the bootstrap example notebook for examples of bootstrapping the Sharpe ratio and a Probit model from statsmodels.

# Import data
import datetime as dt
import pandas as pd
import numpy as np
import pandas_datareader.data as web
start = dt.datetime(1951,1,1)
end = dt.datetime(2014,1,1)
sp500 = web.get_data_yahoo('^GSPC', start=start, end=end)
start = sp500.index.min()
end = sp500.index.max()
monthly_dates = pd.date_range(start, end, freq='M')
monthly = sp500.reindex(monthly_dates, method='ffill')
returns = 100 * monthly['Adj Close'].pct_change().dropna()

# Function to compute parameters
def sharpe_ratio(x):
    mu, sigma = 12 * x.mean(), np.sqrt(12 * x.var())
    return np.array([mu, sigma, mu / sigma])

# Bootstrap confidence intervals
from arch.bootstrap import IIDBootstrap
bs = IIDBootstrap(returns)
ci = bs.conf_int(sharpe_ratio, 1000, method='percentile')
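
The percentile method used above can be sketched in a few lines of standard-library Python. This is an illustration of the idea, not arch's implementation: `iid_bootstrap_ci` is a hypothetical helper, the returns are simulated, and `sharpe_ratio` here returns only the ratio rather than the three-element array used in the README example.

```python
import random
import statistics


def sharpe_ratio(x):
    """Annualized Sharpe ratio from monthly returns (risk-free rate ignored)."""
    mu = 12 * statistics.fmean(x)
    sigma = (12 * statistics.variance(x)) ** 0.5
    return mu / sigma


def iid_bootstrap_ci(data, stat, reps=1000, level=0.95, seed=0):
    """Percentile confidence interval from an IID bootstrap.

    Hypothetical helper for illustration; not arch's implementation.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    draws = sorted(stat(rng.choices(data, k=n)) for _ in range(reps))
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * reps)]
    upper = draws[int((1.0 - tail) * reps) - 1]
    return lower, upper


# Simulated monthly returns in percent; made-up data for illustration.
rng = random.Random(1)
monthly = [rng.gauss(0.8, 4.0) for _ in range(240)]
lower, upper = iid_bootstrap_ci(monthly, sharpe_ratio)
print(lower, upper)
```

An IID bootstrap treats observations as independent, so for serially dependent data one of the block bootstraps listed above is usually the more appropriate choice.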

Multiple Comparison Procedures

  • Test of Superior Predictive Ability (SPA), also known as the Reality Check or Bootstrap Data Snooper
  • Stepwise (StepM)
  • Model Confidence Set (MCS)

See the multiple comparison example notebook for examples of the multiple comparison procedures.

Long-run Covariance Estimation

Kernel-based estimators of long-run covariance including the Bartlett kernel which is known as Newey-West in econometrics. Automatic bandwidth selection is available for all of the covariance estimators.

from arch.covariance.kernel import Bartlett
from arch.data import nasdaq
data = nasdaq.load()
returns = data[["Adj Close"]].pct_change().dropna()

cov_est = Bartlett(returns ** 2)
# Get the long-run covariance
cov_est.cov.long_run
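
For a scalar series, a Bartlett-kernel (Newey-West style) long-run variance can be sketched as below. The weighting `1 - k / (bandwidth + 1)` is a common convention and is assumed here; the bandwidth is passed in by hand, and automatic bandwidth selection is not sketched. The function name is hypothetical and the code does not use arch.

```python
def bartlett_long_run_variance(x, bandwidth):
    """Long-run variance of a scalar series using Bartlett kernel weights.

    Computes gamma_0 + 2 * sum_{k=1}^{bandwidth} w_k * gamma_k with
    w_k = 1 - k / (bandwidth + 1). Hypothetical helper, not part of arch.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]

    def autocov(k):
        # Sample autocovariance at lag k, normalized by n.
        return sum(dev[t] * dev[t - k] for t in range(k, n)) / n

    total = autocov(0)
    for k in range(1, bandwidth + 1):
        weight = 1.0 - k / (bandwidth + 1.0)
        total += 2.0 * weight * autocov(k)
    return total


# An alternating series has strong negative first-order autocorrelation,
# so its long-run variance falls well below its ordinary variance.
series = [1.0, -1.0] * 50
lrv_bw0 = bartlett_long_run_variance(series, bandwidth=0)
lrv_bw1 = bartlett_long_run_variance(series, bandwidth=1)
print(lrv_bw0, lrv_bw1)
```

With a bandwidth of zero the estimator reduces to the ordinary variance; adding lags lets the autocovariances pull the estimate up or down depending on their sign.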

Requirements

These requirements reflect the testing environment. It is possible that arch will work with older versions.

  • Python (3.9+)
  • NumPy (1.19+)
  • SciPy (1.5+)
  • Pandas (1.1+)
  • statsmodels (0.12+)
  • matplotlib (3+), optional

Optional Requirements

  • Numba (0.49+) will be used if available and when installed without building the binary modules. In order to ensure that these are not built, you must set the environment variable ARCH_NO_BINARY=1 and install without the wheel.
export ARCH_NO_BINARY=1
python -m pip install arch

or, if using PowerShell on Windows:

$env:ARCH_NO_BINARY=1
python -m pip install arch
  • jupyter and notebook are required to run the notebooks

Installing

Standard installation with a compiler requires Cython. If you do not have a compiler installed, arch should still install. You will see a warning, but it can be ignored. If you don't have a compiler, numba is strongly recommended.

pip

Releases are available on PyPI and can be installed with pip.

pip install arch

Alternatively, you can install the latest version from GitHub:

pip install git+https://github.com/bashtage/arch.git

Setting the environment variable ARCH_NO_BINARY=1 can be used to disable compilation of the extensions.

Anaconda

conda users can install from conda-forge:

conda install arch-py -c conda-forge

Note: The conda-forge name is arch-py.

Windows

Building the extensions using the Community edition of Visual Studio is simple when using Python 3.8 or later. Building is not necessary when numba is installed, since just-in-time compiled code (numba) runs as fast as ahead-of-time compiled extensions.

Developing

The development requirements are:

  • Cython (0.29+, if not using ARCH_NO_BINARY=1, supports 3.0.0b2+)
  • pytest (For tests)
  • sphinx (to build docs)
  • sphinx-immaterial (to build docs)
  • jupyter, notebook and nbsphinx (to build docs)

Installation Notes

  1. If Cython is not installed, the package will be installed as if ARCH_NO_BINARY=1 was set.
  2. Setup does not verify these requirements. Please ensure these are installed.