Top Related Projects
Statsmodels: statistical modeling and econometrics in Python
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
scikit-learn: machine learning in Python
SciPy library main repository
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Topic Modelling for Humans
Quick Overview
ARCH (Autoregressive Conditional Heteroskedasticity) is a Python library for econometric modeling, with a focus on financial econometrics. It provides tools for modeling volatility, conducting hypothesis tests, and performing time series analysis, particularly useful for financial data and risk management.
Pros
- Comprehensive suite of econometric tools, especially for volatility modeling
- High-performance implementation using Cython and/or Numba
- Extensive documentation and examples
- Integrates well with the scientific Python ecosystem (NumPy, pandas, etc.)
Cons
- Steep learning curve for users not familiar with econometrics
- No GUI or interactive tools; the library is used from Python code
- May be overkill for simple statistical analyses
- Requires understanding of the underlying statistical concepts for proper use
Code Examples
- Fitting a GARCH(1,1) model:
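from arch import arch_model
from arch.data import sp500
# Percentage returns from the S&P 500 sample data bundled with arch
returns = 100 * sp500.load()["Adj Close"].pct_change().dropna()
model = arch_model(returns, vol="Garch", p=1, q=1)
results = model.fit()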
- Forecasting volatility:
forecasts = results.forecast(horizon=5)
- Conducting a Ljung-Box test for autocorrelation on the standardized residuals:
from statsmodels.stats.diagnostic import acorr_ljungbox
lb_test = acorr_ljungbox(results.std_resid.dropna(), lags=[10])
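To make the fitting and forecasting steps above less of a black box, here is a minimal pure-Python sketch of the GARCH(1,1) variance recursion and the multi-step forecast it implies. It is not how arch implements the model; the function names, the parameter values, and the choice to start the recursion at the sample variance are all assumptions made for illustration, and the returns are treated as zero-mean.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Filter conditional variances with the GARCH(1,1) recursion."""
    # Assumption: start from the sample second moment of zero-mean returns.
    sigma2 = [sum(r * r for r in returns) / len(returns)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


def garch11_forecast(last_return, last_sigma2, omega, alpha, beta, horizon):
    """Return variance forecasts for 1..horizon steps ahead."""
    forecasts = [omega + alpha * last_return ** 2 + beta * last_sigma2]
    for _ in range(horizon - 1):
        # Beyond one step, the expected squared return equals the forecast variance.
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


returns = [0.5, -1.2, 0.3, 2.1, -0.7, 0.1, -1.5, 0.9]  # made-up data
omega, alpha, beta = 0.05, 0.10, 0.85  # hypothetical parameter values
sigma2 = garch11_variance(returns, omega, alpha, beta)
path = garch11_forecast(returns[-1], sigma2[-1], omega, alpha, beta, horizon=5)
long_run = omega / (1 - alpha - beta)  # unconditional variance
```

With alpha + beta below one, each forecast step closes a fixed fraction of the gap to the unconditional variance, which is why long-horizon volatility forecasts flatten out.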
Getting Started
To get started with ARCH:
Install the library:
pip install arch
Import the necessary modules:
import arch
from arch import arch_model
Load or prepare your time series data (e.g., using Pandas).
Create and fit a model:
model = arch_model(your_data, vol="Garch", p=1, q=1)
results = model.fit()
Analyze the results and make forecasts as needed.
For more detailed instructions and examples, refer to the official documentation.
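For the data preparation step, the README examples further down feed the models percentage returns rather than raw prices. A small sketch of that transformation in plain Python, assuming a list of closing prices (the helper name and the prices are invented for illustration):

```python
def percent_returns(prices):
    """Simple returns scaled by 100, one per consecutive pair of prices."""
    return [100.0 * (curr / prev - 1.0) for prev, curr in zip(prices, prices[1:])]


prices = [100.0, 101.0, 99.99, 102.0]  # made-up closing prices
returns = percent_returns(prices)
```

With pandas, the same result comes from 100 * series.pct_change().dropna(), which is the form used in the README examples.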
Competitor Comparisons
Statsmodels: statistical modeling and econometrics in Python
Pros of statsmodels
- Broader scope, covering a wide range of statistical models and methods
- More extensive documentation and examples
- Larger community and more frequent updates
Cons of statsmodels
- Can be slower for certain operations due to its comprehensive nature
- May have a steeper learning curve for beginners
Code Comparison
statsmodels (OLS regression):
import statsmodels.api as sm
model = sm.OLS(y, X).fit()
predictions = model.predict(X_new)
arch (volatility model):
from arch import arch_model
model = arch_model(returns).fit()
forecast = model.forecast()
Key Differences
- statsmodels offers a wider range of statistical models, while arch focuses specifically on time series analysis and volatility modeling
- arch is optimized for financial econometrics, providing specialized tools for ARCH/GARCH models
- statsmodels has a more extensive API, which can be both an advantage and a drawback depending on the user's needs
Use Cases
- Choose statsmodels for general statistical analysis and a variety of modeling techniques
- Opt for arch when working specifically with financial time series data and volatility modeling
Community and Support
statsmodels has a larger user base and more frequent updates, potentially leading to better long-term support and feature development. However, arch's focused nature may result in more specialized support for its specific use cases.
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Pros of pandas
- Broader scope and functionality for data manipulation and analysis
- Larger community and more frequent updates
- Extensive documentation and widespread adoption in data science
Cons of pandas
- Steeper learning curve for beginners
- Higher memory usage for large datasets
- Can be slower for certain operations compared to specialized libraries
Code Comparison
pandas (grouped aggregation):
import pandas as pd
df = pd.read_csv('data.csv')
result = df.groupby('category').mean()
arch (volatility model):
import arch
model = arch.arch_model(returns)
results = model.fit()
pandas is a versatile data manipulation library with broad functionality, while arch focuses specifically on econometric modeling and time series analysis. pandas offers more general-purpose tools for data handling, while arch provides specialized functions for financial and economic data analysis. The choice between the two depends on the specific requirements of your project and the type of data you're working with.
scikit-learn: machine learning in Python
Pros of scikit-learn
- Broader scope, covering a wide range of machine learning algorithms and techniques
- Larger community and more extensive documentation
- More frequent updates and active development
Cons of scikit-learn
- Less specialized for time series analysis and econometrics
- May have a steeper learning curve for users focused on financial data analysis
Code Comparison
scikit-learn (Linear Regression):
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(X, y)
predictions = model.predict(X_test)
arch (ARCH Model):
from arch import arch_model
model = arch_model(returns)
results = model.fit()
forecast = results.forecast()
scikit-learn is a comprehensive machine learning library with a broad range of algorithms and techniques. It has a larger community and more extensive documentation, making it suitable for various data science tasks. However, it may not be as specialized for time series analysis and econometrics as arch.
arch focuses specifically on time series analysis, particularly for financial data. It provides specialized tools for modeling volatility and other econometric concepts. While it has a narrower scope, it may be more suitable for users working primarily with financial time series data.
The choice between the two depends on the specific requirements of the project and the user's familiarity with econometrics and time series analysis.
SciPy library main repository
Pros of SciPy
- Broader scope, covering a wide range of scientific computing tasks
- Larger community and more extensive documentation
- More mature project with longer development history
Cons of SciPy
- Can be slower for specific econometric tasks
- Less focused on time series analysis and econometrics
- May require additional dependencies for certain specialized functions
Code Comparison
SciPy example (linear regression):
from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
ARCH example (GARCH model):
from arch import arch_model
model = arch_model(returns, vol='GARCH', p=1, q=1)
results = model.fit()
SciPy is a comprehensive scientific computing library with a broad range of applications, while ARCH is specialized for econometrics and time series analysis. SciPy offers more general-purpose tools, whereas ARCH provides focused functionality for financial modeling and volatility analysis. The choice between them depends on the specific requirements of your project and the depth of econometric analysis needed.
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Pros of Prophet
- User-friendly interface for time series forecasting, requiring minimal data preprocessing
- Handles seasonality and holidays automatically, making it easier for non-experts
- Robust to missing data and outliers, reducing manual data cleaning efforts
Cons of Prophet
- Less flexible for custom model specifications compared to ARCH
- May not perform as well for financial time series with complex volatility patterns
- Built around a trend-plus-seasonality decomposition (additive by default, with optional multiplicative seasonality), which may not capture all types of time series behavior
Code Comparison
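Prophet (forecasting):
from prophet import Prophet  # the package was named fbprophet before version 1.0
model = Prophet()
model.fit(df)  # df: DataFrame with columns 'ds' (dates) and 'y' (values)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)
ARCH (GARCH model):
from arch import arch_model
model = arch_model(returns, vol='GARCH', p=1, q=1)
results = model.fit()
forecast = results.forecast(horizon=5)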
Prophet focuses on simplicity and automatic forecasting, while ARCH provides more control over model specification, particularly for volatility modeling in financial time series. Prophet is better suited for general forecasting tasks, while ARCH excels in analyzing and forecasting financial market volatility.
Topic Modelling for Humans
Pros of Gensim
- Broader scope for natural language processing and topic modeling
- Larger community and more extensive documentation
- Better support for large-scale text processing and memory-efficient algorithms
Cons of Gensim
- Steeper learning curve due to its wide range of functionalities
- Less focused on time series analysis and econometrics
- May be overkill for projects primarily dealing with financial data
Code Comparison
Gensim (topic modeling):
from gensim import corpora, models
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda_model = models.LdaMulticore(corpus=corpus, num_topics=10)
ARCH (time series modeling):
from arch import arch_model
model = arch_model(returns, vol='Garch', p=1, q=1)
results = model.fit()
Both libraries offer powerful tools for their respective domains. Gensim excels in text processing and topic modeling, while ARCH specializes in time series analysis and econometrics. The choice between them depends on the specific requirements of your project.
README
Autoregressive Conditional Heteroskedasticity (ARCH) and other tools for financial econometrics, written in Python (with Cython and/or Numba used to improve performance)
Module Contents
- Univariate ARCH Models
- Unit Root Tests
- Cointegration Testing and Analysis
- Bootstrapping
- Multiple Comparison Tests
- Long-run Covariance Estimation
Python 3
arch is Python 3 only. Version 4.8 is the final version that supported Python 2.7.
Documentation from the main branch is hosted on my GitHub Pages.
Released documentation is hosted on Read the Docs.
More about ARCH
More information about ARCH and related models is available in the notes and research available at Kevin Sheppard's site.
Contributions are welcome. There are opportunities at many levels to contribute:
- Implement new volatility process, e.g., FIGARCH
- Improve docstrings where unclear or with typos
- Provide examples, preferably in the form of IPython notebooks
Volatility Modeling
- Mean models
  - Constant mean
  - Heterogeneous Autoregression (HAR)
  - Autoregression (AR)
  - Zero mean
  - Models with and without exogenous regressors
- Volatility models
  - EWMA/RiskMetrics
- Distributions
  - Normal
  - Student's T
  - Generalized Error Distribution
See the univariate volatility example notebook for a more complete overview.
import datetime as dt
import pandas_datareader.data as web
st = dt.datetime(1990,1,1)
en = dt.datetime(2014,1,1)
data = web.get_data_yahoo('^FTSE', start=st, end=en)
returns = 100 * data['Adj Close'].pct_change().dropna()
from arch import arch_model
am = arch_model(returns)
res = am.fit()
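Of the volatility models listed above, EWMA/RiskMetrics is the simplest to write down: the variance is an exponentially weighted average of past squared returns. The sketch below is a plain-Python illustration, not arch's implementation. The smoothing value 0.94 is the figure commonly quoted for daily data, and initialising with the first squared return is an assumption made here for brevity.

```python
def ewma_variance(returns, lam=0.94):
    """EWMA recursion: s2[t] = lam * s2[t-1] + (1 - lam) * r[t-1] ** 2."""
    sigma2 = [returns[0] ** 2]  # assumed initial value
    for r in returns[:-1]:
        sigma2.append(lam * sigma2[-1] + (1.0 - lam) * r * r)
    return sigma2


calm = ewma_variance([1.0, -1.0, 1.0, -1.0])
shock = ewma_variance([0.0, 0.0, 0.0, 10.0, 0.0])
```

In the second call, the single large return only shows up in the variance one period later, and then with weight 1 - lam, which is what makes the estimator smooth.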
Unit Root Tests
- Augmented Dickey-Fuller
- Dickey-Fuller GLS
- Phillips-Perron
- Zivot-Andrews
- Variance Ratio tests
See the unit root testing example notebook for examples of testing series for unit roots.
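As a rough illustration of the idea behind these tests, the sketch below computes the plainest Dickey-Fuller statistic: the t-statistic on rho in the regression diff(y)[t] = rho * y[t-1] + e[t], with no constant, trend, or lagged differences. The tests in arch add those terms and supply proper critical values, so treat this as a teaching aid under stated assumptions. The function name and the simulated series are invented for the example.

```python
import math
import random


def dickey_fuller_tstat(y):
    """t-statistic on rho in diff(y)[t] = rho * y[t-1] + e[t]."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y, y[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    resid = [d - rho * x for x, d in zip(lagged, diffs)]
    s2 = sum(e * e for e in resid) / (len(diffs) - 1)  # residual variance
    return rho / math.sqrt(s2 / sxx)


rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]

random_walk = [0.0]
stationary = [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)      # unit root
    stationary.append(0.5 * stationary[-1] + e)  # AR(1), coefficient 0.5

stat_rw = dickey_fuller_tstat(random_walk)
stat_ar = dickey_fuller_tstat(stationary)
```

A strongly negative statistic is evidence against a unit root. The statistic does not follow a normal or Student's t distribution under the null, so it has to be compared with Dickey-Fuller critical values.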
Cointegration Testing and Analysis
- Tests
  - Engle-Granger Test
  - Phillips-Ouliaris Test
- Cointegration Vector Estimation
  - Canonical Cointegrating Regression
  - Dynamic OLS
  - Fully Modified OLS
See the cointegration testing example notebook for examples of testing series for cointegration.
Bootstrapping
- Bootstraps
  - IID Bootstrap
  - Stationary Bootstrap
  - Circular Block Bootstrap
  - Moving Block Bootstrap
- Methods
  - Confidence interval construction
  - Covariance estimation
  - Apply method to estimate model across bootstraps
  - Generic Bootstrap iterator
See the bootstrap example notebook for examples of bootstrapping the Sharpe ratio and a Probit model from statsmodels.
# Import data
import datetime as dt
import pandas as pd
import numpy as np
import pandas_datareader.data as web
start = dt.datetime(1951,1,1)
end = dt.datetime(2014,1,1)
sp500 = web.get_data_yahoo('^GSPC', start=start, end=end)
start = sp500.index.min()
end = sp500.index.max()
monthly_dates = pd.date_range(start, end, freq='M')
monthly = sp500.reindex(monthly_dates, method='ffill')
returns = 100 * monthly['Adj Close'].pct_change().dropna()
# Function to compute parameters
def sharpe_ratio(x):
    mu, sigma = 12 * x.mean(), np.sqrt(12 * x.var())
    return np.array([mu, sigma, mu / sigma])
# Bootstrap confidence intervals
from arch.bootstrap import IIDBootstrap
bs = IIDBootstrap(returns)
ci = bs.conf_int(sharpe_ratio, 1000, method='percentile')
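For readers who want to see what a percentile interval from an IID bootstrap amounts to, here is a standard-library sketch. It only mirrors the idea behind the conf_int call above and is not arch's implementation; the helper name, the synthetic sample, and the simple index-based quantile rule are assumptions for illustration.

```python
import random
import statistics


def iid_bootstrap_ci(data, stat, reps=1000, level=0.95, seed=0):
    """Percentile confidence interval from resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on each resample, then sort the results.
    draws = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(reps)
    )
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * reps)]
    upper = draws[int((1.0 - tail) * reps) - 1]
    return lower, upper


rng = random.Random(42)
sample = [rng.gauss(1.0, 2.0) for _ in range(200)]  # synthetic data
low, high = iid_bootstrap_ci(sample, statistics.fmean)
```

Resampling individual observations assumes they are independent. For serially dependent data, the block-based bootstraps listed above resample runs of consecutive observations instead.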
Multiple Comparison Procedures
- Test of Superior Predictive Ability (SPA), also known as the Reality Check or Bootstrap Data Snooper
- Stepwise (StepM)
- Model Confidence Set (MCS)
See the multiple comparison example notebook for examples of the multiple comparison procedures.
Long-run Covariance Estimation
Kernel-based estimators of long-run covariance including the Bartlett kernel which is known as Newey-West in econometrics. Automatic bandwidth selection is available for all of the covariance estimators.
from arch.covariance.kernel import Bartlett
from arch.data import nasdaq
data = nasdaq.load()
returns = data[["Adj Close"]].pct_change().dropna()
cov_est = Bartlett(returns ** 2)
# Get the long-run covariance
cov_est.cov.long_run
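The Bartlett (Newey-West) estimator mentioned above can be sketched for a single series in a few lines: the sample variance plus twice the sum of sample autocovariances, each weighted by a linearly declining kernel. This uses the textbook weights 1 - k / (bandwidth + 1) with a hand-picked bandwidth. arch's normalisation and automatic bandwidth selection may differ, so the function below is an illustrative assumption rather than the library's code.

```python
def bartlett_long_run_variance(x, bandwidth):
    """Newey-West style long-run variance of a univariate series."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]

    def autocov(k):
        # Sample autocovariance at lag k, normalised by n.
        return sum(d[t] * d[t - k] for t in range(k, n)) / n

    lrv = autocov(0)
    for k in range(1, bandwidth + 1):
        weight = 1.0 - k / (bandwidth + 1.0)  # Bartlett kernel weight
        lrv += 2.0 * weight * autocov(k)
    return lrv


series = [1.0, -1.0] * 50  # perfectly alternating, negatively autocorrelated
variance = bartlett_long_run_variance(series, bandwidth=0)
lrv = bartlett_long_run_variance(series, bandwidth=1)
```

With a bandwidth of zero the estimator reduces to the ordinary variance. For this alternating series the negative first autocovariance pulls the long-run variance far below it.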
These requirements reflect the testing environment. It is possible that arch will work with older versions.
- Python (3.9+)
- NumPy (1.19+)
- SciPy (1.5+)
- Pandas (1.1+)
- statsmodels (0.12+)
- matplotlib (3+), optional
Optional Requirements
- Numba (0.49+) will be used if available and when installed without building the binary modules. In order to ensure that these are not built, you must set the environment variable ARCH_NO_BINARY=1 and install without the wheel.
export ARCH_NO_BINARY=1
python -m pip install arch
or if using PowerShell on Windows
$env:ARCH_NO_BINARY=1
python -m pip install arch
- jupyter and notebook are required to run the notebooks
Standard installation with a compiler requires Cython. If you do not have a compiler installed, arch should still install. You will see a warning but this can be ignored. If you don't have a compiler, numba is strongly recommended.
Releases are available on PyPI and can be installed with pip
pip install arch
You can alternatively install the latest version from GitHub
pip install git+
Setting the environment variable ARCH_NO_BINARY=1 can be used to disable compilation of the extensions.
conda users can install from conda-forge,
conda install arch-py -c conda-forge
Note: The conda-forge name is arch-py
Building the extensions using the community edition of Visual Studio is simple when using Python 3.8 or later. Building is not necessary when numba is installed since just-in-time compiled code (numba) runs as fast as ahead-of-time compiled extensions.
The development requirements are:
- Cython (0.29+, if not using ARCH_NO_BINARY=1, supports 3.0.0b2+)
- pytest (For tests)
- sphinx (to build docs)
- sphinx-immaterial (to build docs)
- jupyter, notebook and nbsphinx (to build docs)
Installation Notes
- If Cython is not installed, the package will be installed as if ARCH_NO_BINARY=1 was set.
- Setup does not verify these requirements. Please ensure these are installed.