arch

ARCH models in Python

1,413

265

Top Related Projects

statsmodels

10,845

Statsmodels: statistical modeling and econometrics in Python

pandas

46,175

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

scikit-learn

62,466

scikit-learn: machine learning in Python

prophet

19,467

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Quick Overview

arch is a Python library for financial econometrics, named after the Autoregressive Conditional Heteroskedasticity (ARCH) family of models. It provides volatility models, unit root and cointegration tests, bootstrap methods, multiple comparison procedures, and long-run covariance estimators, which makes it particularly useful for financial data and risk management.

Pros

Comprehensive suite of econometric tools, especially for volatility modeling
High-performance implementation using Cython and Numba
Extensive documentation and examples
Integrates well with the scientific Python ecosystem (NumPy, Pandas, etc.)

Cons

Steep learning curve for users not familiar with econometrics
No GUI or interactive tools; arch is used programmatically from Python scripts or notebooks
May be overkill for simple statistical analyses
Requires understanding of underlying statistical concepts for proper use

Code Examples

Fitting a GARCH(1,1) model:

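import arch.data.sp500
from arch import arch_model

# Load the S&P 500 sample data bundled with arch and compute percent returns
data = arch.data.sp500.load()
returns = 100 * data["Adj Close"].pct_change().dropna()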
model = arch_model(returns, vol="Garch", p=1, q=1)
results = model.fit()
print(results.summary())
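
To make the model behind the fit more concrete, here is a minimal pure-Python sketch of the GARCH(1,1) conditional variance recursion, sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]. It does not use arch; the function name `garch11_variance`, the parameter values, and the choice to start the recursion at the sample variance are assumptions made for illustration, not the library's implementation.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Return the conditional variance path implied by GARCH(1,1) parameters."""
    if not residuals:
        return []
    # Assumption: initialize with the sample variance of the (mean-zero) residuals.
    initial = sum(e * e for e in residuals) / len(residuals)
    sigma2 = [initial]
    for t in range(1, len(residuals)):
        # Today's variance depends on yesterday's squared shock and variance.
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


# Made-up residuals and parameters, purely for illustration.
residuals = [0.5, -1.2, 0.3, 2.0, -0.7]
path = garch11_variance(residuals, omega=0.1, alpha=0.1, beta=0.8)
print(path)
```

Fitting a model amounts to choosing omega, alpha and beta (plus the mean and distribution parameters) so that this variance path best explains the observed returns under the chosen likelihood.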

Forecasting volatility:

forecasts = results.forecast(horizon=5)
print(forecasts.variance.iloc[-1])
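
For a GARCH(1,1) model, multi-step variance forecasts have a well-known closed form: they decay geometrically from the one-step-ahead variance toward the long-run variance omega / (1 - alpha - beta). The sketch below illustrates that formula in pure Python. It is independent of arch, and the function name `garch11_forecast` and all numbers are hypothetical.

```python
def garch11_forecast(next_variance, omega, alpha, beta, horizon):
    """Variance forecasts for steps 1..horizon, given the one-step-ahead variance."""
    persistence = alpha + beta
    if not 0 <= persistence < 1:
        raise ValueError("alpha + beta must lie in [0, 1) for a finite long-run variance")
    long_run = omega / (1.0 - persistence)
    # The gap to the long-run variance shrinks by a factor of (alpha + beta) each step.
    return [
        long_run + persistence ** (h - 1) * (next_variance - long_run)
        for h in range(1, horizon + 1)
    ]


# Made-up inputs: the one-step variance starts above the long-run level of about 1.0.
forecast_path = garch11_forecast(next_variance=2.0, omega=0.1, alpha=0.1, beta=0.8, horizon=5)
print(forecast_path)
```

The closer alpha + beta is to one, the more slowly the forecasts revert to the long-run level.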

Conducting a Ljung-Box test for autocorrelation:

from statsmodels.stats.diagnostic import acorr_ljungbox

# Ljung-Box test on the standardized residuals of the fitted model
lb_test = acorr_ljungbox(results.std_resid.dropna(), lags=[10])
print(lb_test)
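
To show what the Ljung-Box statistic measures, the following pure-Python sketch computes Q = n (n + 2) * sum over k of rho_k**2 / (n - k), where rho_k is the lag-k sample autocorrelation. The helper name `ljung_box_q` and the toy series are assumptions for illustration; in practice a tested implementation from a statistics library should be preferred.

```python
def ljung_box_q(x, lags):
    """Ljung-Box Q statistic for the first `lags` sample autocorrelations of x."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelations are undefined")
    total = 0.0
    for k in range(1, lags + 1):
        # Lag-k sample autocorrelation
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# A strongly alternating toy series has large negative lag-1 autocorrelation.
q_stat = ljung_box_q([1, -1, 1, -1, 1, -1, 1, -1], lags=1)
print(q_stat)
```

Under the null of no autocorrelation, Q is compared against a chi-square distribution with roughly as many degrees of freedom as lags; a large value suggests remaining serial dependence.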

Getting Started

To get started with ARCH:

Install the library:
```
pip install arch
```

Import the necessary modules:

import arch
from arch import arch_model

Load or prepare your time series data (e.g., using Pandas).

Create and fit a model:

model = arch_model(your_data, vol="Garch", p=1, q=1)
results = model.fit()

Analyze the results and make forecasts as needed.
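
For the data-preparation step, the README examples further down this page fit models to percent returns, that is, simple returns multiplied by 100. A minimal sketch of that transformation without pandas follows; the helper name `percent_returns` and the prices are hypothetical.

```python
def percent_returns(prices):
    """Convert a sequence of prices into simple returns expressed in percent."""
    return [100.0 * (curr / prev - 1.0) for prev, curr in zip(prices, prices[1:])]


# Made-up prices: up 10%, then down 10%.
rets = percent_returns([100.0, 110.0, 99.0])
print(rets)
```

With pandas, the equivalent expression used in the README is `100 * prices.pct_change().dropna()`.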

For more detailed instructions and examples, refer to the official documentation at https://arch.readthedocs.io/.

Competitor Comparisons

statsmodels

10,845

Statsmodels: statistical modeling and econometrics in Python

Pros of statsmodels

Broader scope, covering a wide range of statistical models and methods
More extensive documentation and examples
Larger community and more frequent updates

Cons of statsmodels

Can be slower for certain operations due to its comprehensive nature
May have a steeper learning curve for beginners

Code Comparison

statsmodels:

import statsmodels.api as sm
model = sm.OLS(y, X).fit()
predictions = model.predict(X_new)

arch:

from arch import arch_model
results = arch_model(returns).fit()
forecast = results.forecast()

Key Differences

statsmodels offers a wider range of statistical models, while arch focuses specifically on time series analysis and volatility modeling
arch is optimized for financial econometrics, providing specialized tools for ARCH/GARCH models
statsmodels has a more extensive API, which can be both an advantage and a drawback depending on the user's needs

Use Cases

Choose statsmodels for general statistical analysis and a variety of modeling techniques
Opt for arch when working specifically with financial time series data and volatility modeling

Community and Support

statsmodels has a larger user base and more frequent updates, potentially leading to better long-term support and feature development. However, arch's focused nature may result in more specialized support for its specific use cases.

pandas

46,175

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

Pros of pandas

Broader scope and functionality for data manipulation and analysis
Larger community and more frequent updates
Extensive documentation and widespread adoption in data science

Cons of pandas

Steeper learning curve for beginners
Higher memory usage for large datasets
Can be slower for certain operations compared to specialized libraries

Code Comparison

pandas:

import pandas as pd

df = pd.read_csv('data.csv')
result = df.groupby('category').mean()

arch:

import arch

model = arch.arch_model(returns)
results = model.fit()

Summary

pandas is a versatile data manipulation library with broad functionality, while arch focuses specifically on econometric modeling and time series analysis. pandas offers more general-purpose tools for data handling, while arch provides specialized functions for financial and economic data analysis. The choice between the two depends on the specific requirements of your project and the type of data you're working with.

scikit-learn

62,466

scikit-learn: machine learning in Python

Pros of scikit-learn

Broader scope, covering a wide range of machine learning algorithms and techniques
Larger community and more extensive documentation
More frequent updates and active development

Cons of scikit-learn

Less specialized for time series analysis and econometrics
May have a steeper learning curve for users focused on financial data analysis

Code Comparison

scikit-learn (Linear Regression):

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X, y)
predictions = model.predict(X_test)

arch (ARCH Model):

from arch import arch_model
model = arch_model(returns)
results = model.fit()
forecast = results.forecast()

Summary

scikit-learn is a comprehensive machine learning library with a broad range of algorithms and techniques. It has a larger community and more extensive documentation, making it suitable for various data science tasks. However, it may not be as specialized for time series analysis and econometrics as arch.

arch focuses specifically on time series analysis, particularly for financial data. It provides specialized tools for modeling volatility and other econometric concepts. While it has a narrower scope, it may be more suitable for users working primarily with financial time series data.

The choice between the two depends on the specific requirements of the project and the user's familiarity with econometrics and time series analysis.

scipy

13,853

SciPy library main repository

Pros of SciPy

Broader scope, covering a wide range of scientific computing tasks
Larger community and more extensive documentation
More mature project with longer development history

Cons of SciPy

Can be slower for specific econometric tasks
Less focused on time series analysis and econometrics
May require additional dependencies for certain specialized functions

Code Comparison

SciPy example (linear regression):

from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

ARCH example (GARCH model):

from arch import arch_model
model = arch_model(returns, vol='GARCH', p=1, q=1)
results = model.fit()

Summary

SciPy is a comprehensive scientific computing library with a broad range of applications, while ARCH is specialized for econometrics and time series analysis. SciPy offers more general-purpose tools, whereas ARCH provides focused functionality for financial modeling and volatility analysis. The choice between them depends on the specific requirements of your project and the depth of econometric analysis needed.

prophet

19,467

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Pros of Prophet

User-friendly interface for time series forecasting, requiring minimal data preprocessing
Handles seasonality and holidays automatically, making it easier for non-experts
Robust to missing data and outliers, reducing manual data cleaning efforts

Cons of Prophet

Less flexible for custom model specifications compared to ARCH
May not perform as well for financial time series with complex volatility patterns
Built around a decomposable trend-plus-seasonality model (with additive or multiplicative seasonality), which may not capture all types of time series behavior

Code Comparison

Prophet:

from prophet import Prophet
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)

ARCH:

from arch import arch_model
model = arch_model(returns, vol='GARCH', p=1, q=1)
results = model.fit()
forecast = results.forecast(horizon=5)

Prophet focuses on simplicity and automatic forecasting, while ARCH provides more control over model specification, particularly for volatility modeling in financial time series. Prophet is better suited for general forecasting tasks, while ARCH excels in analyzing and forecasting financial market volatility.

gensim

16,122

Topic Modelling for Humans

Pros of Gensim

Broader scope for natural language processing and topic modeling
Larger community and more extensive documentation
Better support for large-scale text processing and memory-efficient algorithms

Cons of Gensim

Steeper learning curve due to its wide range of functionalities
Less focused on time series analysis and econometrics
May be overkill for projects primarily dealing with financial data

Code Comparison

Gensim (topic modeling):

from gensim import corpora, models
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda_model = models.LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=10)

ARCH (time series modeling):

from arch import arch_model
model = arch_model(returns, vol='Garch', p=1, q=1)
results = model.fit()

Both libraries offer powerful tools for their respective domains. Gensim excels in text processing and topic modeling, while ARCH specializes in time series analysis and econometrics. The choice between them depends on the specific requirements of your project.

README

arch

Autoregressive Conditional Heteroskedasticity (ARCH) and other tools for financial econometrics, written in Python (with Cython and/or Numba used to improve performance)

Python 3

arch is Python 3 only. Version 4.8 is the final version that supported Python 2.7.

Documentation

Documentation from the main branch is hosted on my GitHub Pages.

Released documentation is hosted on Read the Docs.

More about ARCH

More information about ARCH and related models is available in the notes and research available at Kevin Sheppard's site.

Contributing

Contributions are welcome. There are opportunities at many levels to contribute:

Implement new volatility processes, e.g., FIGARCH
Improve docstrings that are unclear or contain typos
Provide examples, preferably in the form of IPython notebooks

Examples

Volatility Modeling

Mean models
- Constant mean
- Heterogeneous Autoregression (HAR)
- Autoregression (AR)
- Zero mean
- Models with and without exogenous regressors
Volatility models
- ARCH
- GARCH
- TARCH
- EGARCH
- EWMA/RiskMetrics
Distributions
- Normal
- Student's T
- Generalized Error Distribution

See the univariate volatility example notebook for a more complete overview.

import datetime as dt
import pandas_datareader.data as web
st = dt.datetime(1990,1,1)
en = dt.datetime(2014,1,1)
data = web.get_data_yahoo('^FTSE', start=st, end=en)
returns = 100 * data['Adj Close'].pct_change().dropna()

from arch import arch_model
am = arch_model(returns)
res = am.fit()

Unit Root Tests

Augmented Dickey-Fuller
Dickey-Fuller GLS
Phillips-Perron
KPSS
Zivot-Andrews
Variance Ratio tests

See the unit root testing example notebook for examples of testing series for unit roots.
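
As a rough illustration of the regression behind Dickey-Fuller-type tests, the sketch below computes the t-statistic on gamma in diff(y)[t] = gamma * y[t-1] + e[t]. It is a simplified, assumption-laden version: it has no constant, no trend, and no lagged differences, and the function name `dickey_fuller_t` is hypothetical. It is not how arch implements its tests.

```python
import math


def dickey_fuller_t(y):
    """t-statistic on gamma in the regression diff(y)[t] = gamma * y[t-1] + e[t]."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y, y[1:])]
    sum_sq = sum(v * v for v in lagged)
    # OLS slope for a regression without intercept
    gamma = sum(prev * d for prev, d in zip(lagged, diffs)) / sum_sq
    residuals = [d - gamma * prev for prev, d in zip(lagged, diffs)]
    # One estimated parameter, hence len(diffs) - 1 degrees of freedom
    s2 = sum(r * r for r in residuals) / (len(diffs) - 1)
    return gamma / math.sqrt(s2 / sum_sq)


# Tiny made-up series, far too short for real inference.
t_stat = dickey_fuller_t([1.0, 2.0, 1.0, 2.0, 1.0])
print(t_stat)
```

Under the unit root null this statistic does not follow a Student's t distribution, so it has to be compared against Dickey-Fuller critical values; handling that, along with lag selection and deterministic terms, is what the library's test classes are for.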

Cointegration Testing and Analysis

Tests
- Engle-Granger Test
- Phillips-Ouliaris Test
Cointegration Vector Estimation
- Canonical Cointegrating Regression
- Dynamic OLS
- Fully Modified OLS

See the cointegration testing example notebook for examples of testing series for cointegration.

Bootstrap

Bootstraps
- IID Bootstrap
- Stationary Bootstrap
- Circular Block Bootstrap
- Moving Block Bootstrap
Methods
- Confidence interval construction
- Covariance estimation
- Apply method to estimate model across bootstraps
- Generic Bootstrap iterator

See the bootstrap example notebook for examples of bootstrapping the Sharpe ratio and a Probit model from statsmodels.

# Import data
import datetime as dt
import pandas as pd
import numpy as np
import pandas_datareader.data as web
start = dt.datetime(1951,1,1)
end = dt.datetime(2014,1,1)
sp500 = web.get_data_yahoo('^GSPC', start=start, end=end)
start = sp500.index.min()
end = sp500.index.max()
monthly_dates = pd.date_range(start, end, freq='M')
monthly = sp500.reindex(monthly_dates, method='ffill')
returns = 100 * monthly['Adj Close'].pct_change().dropna()

# Function to compute parameters
def sharpe_ratio(x):
    mu, sigma = 12 * x.mean(), np.sqrt(12 * x.var())
    return np.array([mu, sigma, mu / sigma])

# Bootstrap confidence intervals
from arch.bootstrap import IIDBootstrap
bs = IIDBootstrap(returns)
ci = bs.conf_int(sharpe_ratio, 1000, method='percentile')
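
The idea behind a percentile confidence interval from an IID bootstrap can be sketched in a few lines of standard-library Python: resample the data with replacement, recompute the statistic each time, and read off the empirical quantiles. The function name `iid_bootstrap_ci`, the simple index-based quantile rule, and the simulated data are all assumptions for illustration; unlike the README example, it handles a scalar statistic only.

```python
import random


def iid_bootstrap_ci(data, statistic, reps=1000, level=0.95, seed=0):
    """Percentile confidence interval for a scalar statistic via IID resampling."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    )
    tail = (1.0 - level) / 2.0
    # Simple quantile rule: pick the order statistics nearest the tail cutoffs.
    lower = estimates[int(tail * (reps - 1))]
    upper = estimates[int((1.0 - tail) * (reps - 1))]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# Simulated "returns" stand in for real data.
data_rng = random.Random(42)
sample = [data_rng.gauss(0.5, 2.0) for _ in range(200)]
lower, upper = iid_bootstrap_ci(sample, mean, reps=1000, seed=1)
print(lower, upper)
```

IID resampling ignores serial dependence, which is why the block-based bootstraps listed above exist for time series data.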

Multiple Comparison Procedures

Test of Superior Predictive Ability (SPA), also known as the Reality Check or Bootstrap Data Snooper
Stepwise (StepM)
Model Confidence Set (MCS)

See the multiple comparison example notebook for examples of the multiple comparison procedures.

Long-run Covariance Estimation

Kernel-based estimators of long-run covariance, including the Bartlett kernel, which is known as Newey-West in econometrics. Automatic bandwidth selection is available for all of the covariance estimators.

from arch.covariance.kernel import Bartlett
from arch.data import nasdaq
data = nasdaq.load()
returns = data[["Adj Close"]].pct_change().dropna()

cov_est = Bartlett(returns ** 2)
# Get the long-run covariance
cov_est.cov.long_run
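
For intuition, the textbook Bartlett (Newey-West) long-run variance of a univariate series is the lag-0 autocovariance plus twice the weighted sum of higher-order autocovariances, with weights 1 - k / (bandwidth + 1). The sketch below implements that formula in pure Python. The function name `bartlett_long_run_variance` is hypothetical, the bandwidth is fixed by hand rather than selected automatically, and arch's estimator may differ in details such as demeaning and scaling.

```python
def bartlett_long_run_variance(x, bandwidth):
    """Long-run variance of x using Bartlett kernel weights up to `bandwidth` lags."""
    n = len(x)
    mean = sum(x) / n
    u = [v - mean for v in x]

    def autocov(k):
        # Lag-k sample autocovariance, scaled by n
        return sum(u[t] * u[t - k] for t in range(k, n)) / n

    lrv = autocov(0)
    for k in range(1, bandwidth + 1):
        weight = 1.0 - k / (bandwidth + 1.0)
        lrv += 2.0 * weight * autocov(k)
    return lrv


# Negative autocorrelation pulls the long-run variance below the sample variance.
series = [1, -1, 1, -1, 1, -1, 1, -1]
print(bartlett_long_run_variance(series, bandwidth=0))
print(bartlett_long_run_variance(series, bandwidth=1))
```

With a bandwidth of zero the estimator reduces to the ordinary sample variance; larger bandwidths bring in more of the serial dependence.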

Requirements

These requirements reflect the testing environment. It is possible that arch will work with older versions.

Python (3.9+)
NumPy (1.19+)
SciPy (1.5+)
Pandas (1.1+)
statsmodels (0.12+)
matplotlib (3+), optional

Optional Requirements

Numba (0.49+) will be used if available and when installed without building the binary modules. In order to ensure that these are not built, you must set the environment variable ARCH_NO_BINARY=1 and install without the wheel.

export ARCH_NO_BINARY=1
python -m pip install arch

or, if using PowerShell on Windows:

$env:ARCH_NO_BINARY=1
python -m pip install arch

jupyter and notebook are required to run the notebooks

Installing

Standard installation with a compiler requires Cython. If you do not have a compiler installed, arch should still install. You will see a warning, but this can be ignored. If you don't have a compiler, numba is strongly recommended.

pip

Releases are available on PyPI and can be installed with pip.

pip install arch

You can alternatively install the latest version from GitHub

pip install git+https://github.com/bashtage/arch.git

Setting the environment variable ARCH_NO_BINARY=1 can be used to disable compilation of the extensions.

Anaconda

conda users can install from conda-forge:

conda install arch-py -c conda-forge

Note: The conda-forge name is arch-py.

Windows

Building the extensions using the community edition of Visual Studio is simple when using Python 3.8 or later. Building is not necessary when numba is installed, since just-in-time compiled code (numba) runs as fast as ahead-of-time compiled extensions.

Developing

The development requirements are:

Cython (0.29+, if not using ARCH_NO_BINARY=1, supports 3.0.0b2+)
pytest (For tests)
sphinx (to build docs)
sphinx-immaterial (to build docs)
jupyter, notebook and nbsphinx (to build docs)

Installation Notes

If Cython is not installed, the package will be installed as if ARCH_NO_BINARY=1 was set.
Setup does not verify these requirements. Please ensure these are installed.
