Top Related Projects
- seaborn: Statistical data visualization in Python
- plotly.py: The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
- matplotlib: plotting with Python
- Bokeh: Interactive Data Visualization in the browser, from Python
- Altair: Declarative statistical visualization library for Python
- HoloViews: With Holoviews, your data visualizes itself.
Quick Overview
Plotnine is a Python library for creating graphics based on the Grammar of Graphics. It is an implementation of a grammar of graphics in Python, modeled on ggplot2 in R. It renders plots through matplotlib, so its output is primarily static images rather than interactive charts. Plotnine makes it easy to build complex plots from data in pandas DataFrames.
Pros
- Familiar syntax for R users, making it easier to transition from R to Python for data visualization
- Highly customizable and flexible, allowing for creation of complex and publication-quality plots
- Integrates well with pandas DataFrames, making it convenient for data analysis workflows
- Extensive documentation and examples available
Cons
- Performance can be slower compared to matplotlib for large datasets
- Less widespread adoption compared to matplotlib in the Python ecosystem
- Steeper learning curve for users not familiar with the Grammar of Graphics concept
- Limited interactive plotting capabilities compared to some other Python libraries
Code Examples
Creating a basic scatter plot:
from plotnine import ggplot, aes, geom_point
import pandas as pd
df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [2, 4, 6, 8, 10]})
(ggplot(df, aes(x='x', y='y')) + geom_point())
Adding multiple layers and customizing the plot:
from plotnine import ggplot, aes, geom_point, geom_smooth, labs, theme_minimal
(ggplot(df, aes(x='x', y='y'))
+ geom_point()
+ geom_smooth(method='lm')
+ labs(title='Scatter Plot with Trend Line', x='X-axis', y='Y-axis')
+ theme_minimal())
Creating a faceted plot:
from plotnine import ggplot, aes, geom_bar, facet_wrap
df = pd.DataFrame({'category': ['A', 'B', 'C'] * 4, 'group': ['X', 'Y'] * 6, 'value': range(12)})
(ggplot(df, aes(x='category', y='value', fill='group'))
+ geom_bar(stat='identity', position='dodge')
+ facet_wrap('~group'))
Getting Started
To get started with plotnine, first install it using pip:
pip install plotnine
Then, import the necessary components and create a simple plot:
from plotnine import ggplot, aes, geom_point
import pandas as pd
# Create a sample dataframe
df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [2, 4, 6, 8, 10]})
# Create and display a basic scatter plot
(ggplot(df, aes(x='x', y='y')) + geom_point())
This will create a basic scatter plot using the data from the DataFrame. You can then build upon this example by adding more layers, customizing aesthetics, and exploring other geoms and statistical transformations available in plotnine.
Competitor Comparisons
Statistical data visualization in Python
Pros of seaborn
- More mature and widely adopted in the data science community
- Extensive built-in statistical functions and plot types
- Seamless integration with pandas DataFrames
Cons of seaborn
- Less flexible for customizing plot aesthetics
- Steeper learning curve for advanced customization
- Faceting (multi-plot layouts) goes through FacetGrid or figure-level functions such as relplot and catplot rather than a composable layer
Code comparison
seaborn:
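import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Sample data (illustrative values) with the columns used below
df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [2, 4, 6, 8], 'category': ['A', 'B', 'A', 'B']})
sns.scatterplot(x='x', y='y', hue='category', data=df)
plt.title('Scatter Plot')
plt.show()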
plotnine:
from plotnine import ggplot, aes, geom_point, ggtitle
(ggplot(df, aes(x='x', y='y', color='category'))
+ geom_point()
+ ggtitle('Scatter Plot'))
Key differences
- plotnine follows the Grammar of Graphics principles, similar to R's ggplot2
- Both libraries render through matplotlib; seaborn exposes plotting functions on top of it, while plotnine adds a ggplot2-style grammar layer
- plotnine offers a more declarative approach to plotting
- seaborn provides more built-in themes and color palettes
Both libraries have their strengths and are suitable for different use cases. seaborn excels in quick statistical visualizations, while plotnine offers more flexibility for complex, customized plots.
The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
Pros of plotly.py
- Interactive and dynamic visualizations with zooming, panning, and hover tooltips
- Supports both web-based and offline plotting
- Wide range of chart types and customization options
Cons of plotly.py
- Steeper learning curve compared to plotnine's ggplot-like syntax
- Can be slower for large datasets due to JavaScript rendering
- More verbose code for simple plots
Code Comparison
plotly.py:
import plotly.graph_objects as go
fig = go.Figure(data=go.Scatter(x=[1, 2, 3], y=[4, 5, 6]))
fig.show()
plotnine:
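import pandas as pd
from plotnine import ggplot, aes, geom_point
data = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})  # same values as the plotly.py example
ggplot(data, aes(x='x', y='y')) + geom_point()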
plotnine offers a more concise and familiar syntax for those coming from R's ggplot2, while plotly.py provides more interactive features at the cost of verbosity. plotnine is better suited for quick, static visualizations, whereas plotly.py excels in creating interactive, web-ready plots with extensive customization options.
matplotlib: plotting with Python
Pros of matplotlib
- Extensive documentation and large community support
- Highly customizable with fine-grained control over plot elements
- Wide range of plot types and styles available
Cons of matplotlib
- Steeper learning curve for beginners
- Less intuitive syntax compared to ggplot-style grammar
- Requires more code for complex visualizations
Code Comparison
matplotlib:
import matplotlib.pyplot as plt
# Sample data (illustrative values)
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.figure(figsize=(8, 6))
plt.scatter(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.show()
plotnine:
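import pandas as pd
from plotnine import ggplot, aes, geom_point, labs
# Sample data (illustrative values)
data = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [2, 4, 6, 8, 10]})
(ggplot(data, aes(x='x', y='y'))
+ geom_point()
+ labs(x='X-axis', y='Y-axis', title='Scatter Plot')
)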
Summary
matplotlib is a powerful and flexible plotting library with extensive capabilities, but it can be more challenging for beginners. plotnine offers a more intuitive, ggplot-style syntax that may be easier for those familiar with R's ggplot2. While matplotlib provides more fine-grained control, plotnine allows for quicker creation of aesthetically pleasing plots with less code.
Interactive Data Visualization in the browser, from Python
Pros of Bokeh
- Interactive visualizations: Bokeh excels at creating interactive plots and dashboards for web browsers
- Flexibility: Supports a wide range of chart types and customization options
- Large-scale data handling: Better suited for visualizing large datasets
Cons of Bokeh
- Steeper learning curve: More complex API compared to plotnine's ggplot-like syntax
- Less suitable for static plots: While possible, creating static plots is not Bokeh's primary focus
- Requires more code: Often needs more lines of code to create similar visualizations
Code Comparison
Bokeh:
from bokeh.plotting import figure, show
p = figure(title="Simple Line Plot")
p.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5])
show(p)
plotnine:
import pandas as pd
from plotnine import ggplot, aes, geom_line
df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [6, 7, 2, 4, 5]})  # same values as the Bokeh example
ggplot(df, aes(x='x', y='y')) + geom_line()
Both libraries offer powerful data visualization capabilities, but they cater to different use cases. Bokeh is ideal for interactive, web-based visualizations and handling large datasets, while plotnine provides a more familiar syntax for those coming from R's ggplot2 and is better suited for quick, static plots.
Declarative statistical visualization library for Python
Pros of Altair
- Declarative approach allows for more concise and expressive code
- Seamless integration with Jupyter notebooks and web-based environments
- Extensive documentation and examples available
Cons of Altair
- Steeper learning curve for users familiar with matplotlib-style syntax
- Limited customization options for fine-grained control over plot elements
Code Comparison
Altair:
import altair as alt
from vega_datasets import data
chart = alt.Chart(data.cars()).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin'
)
Plotnine:
from plotnine import ggplot, aes, geom_point
from plotnine.data import mtcars
(ggplot(mtcars, aes('hp', 'mpg', color='factor(cyl)'))
+ geom_point())
Both libraries offer powerful data visualization capabilities, with Altair focusing on a declarative approach and web integration, while Plotnine provides a familiar ggplot2-like syntax for Python users. Altair excels in interactive and web-based visualizations, while Plotnine offers more fine-grained control over plot elements. The choice between the two depends on the user's preferred syntax, project requirements, and target environment.
With Holoviews, your data visualizes itself.
Pros of HoloViews
- More flexible and powerful for complex, interactive visualizations
- Better support for large datasets and streaming data
- Seamless integration with other HoloViz tools like Panel and Datashader
Cons of HoloViews
- Steeper learning curve compared to Plotnine's simpler API
- Less adherence to the familiar Grammar of Graphics syntax
- May be overkill for simple, static plots
Code Comparison
Plotnine example:
import pandas as pd
from plotnine import ggplot, aes, geom_point
data = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})  # illustrative values
ggplot(data, aes(x='x', y='y')) + geom_point()
HoloViews example:
import holoviews as hv
# Points is a 2D element, so both columns are key dimensions
hv.Points(data, kdims=['x', 'y'])
Both libraries offer concise ways to create visualizations, but HoloViews uses a different paradigm that may be less familiar to those coming from ggplot2 or Plotnine. HoloViews focuses on declarative data structures, while Plotnine follows the layered Grammar of Graphics approach.
README
plotnine
plotnine is an implementation of a grammar of graphics in Python based on ggplot2. The grammar allows you to compose plots by explicitly mapping variables in a dataframe to the visual characteristics (position, color, size etc.) of objects that make up the plot.
Plotting with a grammar of graphics is powerful. Custom (and otherwise complex) plots are easy to think about and build incrementally, while the simple plots remain simple to create.
To learn more about how to use plotnine, check out the documentation. Since plotnine has an API similar to ggplot2, where it lacks in coverage the ggplot2 documentation may be helpful.
Example
from plotnine import *
from plotnine.data import mtcars
Building a complex plot piece by piece.
- Scatter plot
( ggplot(mtcars, aes("wt", "mpg")) + geom_point() )
- Scatter plot colored according to some variable
( ggplot(mtcars, aes("wt", "mpg", color="factor(gear)")) + geom_point() )
- Scatter plot colored according to some variable and smoothed with a linear model with confidence intervals.
( ggplot(mtcars, aes("wt", "mpg", color="factor(gear)")) + geom_point() + stat_smooth(method="lm") )
- Scatter plot colored according to some variable, smoothed with a linear model with confidence intervals and plotted on separate panels.
( ggplot(mtcars, aes("wt", "mpg", color="factor(gear)")) + geom_point() + stat_smooth(method="lm") + facet_wrap("gear") )
- Adjust the themes
I) Make it playful
( ggplot(mtcars, aes("wt", "mpg", color="factor(gear)")) + geom_point() + stat_smooth(method="lm") + facet_wrap("gear") + theme_xkcd() )
II) Or professional
( ggplot(mtcars, aes("wt", "mpg", color="factor(gear)")) + geom_point() + stat_smooth(method="lm") + facet_wrap("gear") + theme_tufte() )
Installation
Official release
# Using pip
$ pip install plotnine # 1. should be sufficient for most
$ pip install 'plotnine[extra]' # 2. includes extra/optional packages
$ pip install 'plotnine[test]' # 3. testing
$ pip install 'plotnine[doc]' # 4. generating docs
$ pip install 'plotnine[dev]' # 5. development (making releases)
$ pip install 'plotnine[all]' # 6. everything
# Or using conda
$ conda install -c conda-forge plotnine
Development version
$ pip install git+https://github.com/has2k1/plotnine.git
Contributing
Our documentation could use some examples, but we are looking for something a little bit special. We have two criteria:
- Simple looking plots that otherwise require a trick or two.
- Plots that are part of a data analytic narrative. That is, they provide some form of clarity showing off the geom, stat, ... at their differential best.
If you come up with something that meets those criteria, we would love to see it. See plotnine-examples.
If you discover a bug, check out the issues; if it has not been reported yet, please file an issue.
And if you can fix a bug, your contribution is welcome.
Testing
Plotnine has tests that generate images which are compared to baseline images known to be correct. To generate images that are consistent across all systems you have to install matplotlib from source. You can do that with pip using the command:
$ pip install matplotlib --no-binary matplotlib
Otherwise there may be small differences in the text rendering that throw off the image comparisons.