querybook
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
Top Related Projects
Apache Superset is a Data Visualization and Data Exploration Platform
The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Free, open-source SQL client for Windows and Mac 🦅
Quick Overview
Querybook is an open-source big data IDE developed by Pinterest. It provides a web-based interface for writing, running, and sharing SQL queries across multiple data engines. Querybook aims to improve data discovery, collaboration, and analysis within organizations.
Pros
- Supports multiple data engines (e.g., Presto, Hive, Snowflake)
- Offers collaborative features like query sharing and version control
- Provides a user-friendly interface with syntax highlighting and auto-completion
- Includes data lineage and metadata exploration capabilities
Cons
- Requires significant setup and configuration for enterprise use
- May have a learning curve for users new to big data querying
- Limited customization options compared to some proprietary solutions
- Dependency on specific backend technologies may limit flexibility
Getting Started
To set up Querybook locally:
- Clone the repository:
    git clone https://github.com/pinterest/querybook.git
    cd querybook
- Install dependencies:
    pip install -r requirements.txt
    npm install
- Set up the database:
    python querybook/scripts/init_db.py
- Start the development server:
    python querybook/scripts/runserver.py
- Access Querybook at http://localhost:10001 in your web browser.
For detailed installation and configuration instructions, refer to the project's documentation.
Competitor Comparisons
Apache Superset is a Data Visualization and Data Exploration Platform
Pros of Superset
- More mature and widely adopted project with a larger community and ecosystem
- Offers a broader range of visualization types and chart options
- Provides more advanced features for data exploration and dashboard creation
Cons of Superset
- Steeper learning curve and more complex setup process
- Requires more system resources and may be overkill for simpler use cases
- Less focused on collaborative query editing and sharing compared to Querybook
Code Comparison
Superset (Python):
from superset import db
from superset.models.slice import Slice

slice = Slice(
    slice_name="My Chart",
    datasource_type="table",
    datasource_id=1,
    viz_type="bar",
    params="{}"
)
db.session.add(slice)
db.session.commit()
Querybook (TypeScript):
import { QueryExecutionAPI } from 'lib/api/QueryExecutionAPI';

const executeQuery = async (queryId: number) => {
    const result = await QueryExecutionAPI.executeQuery(queryId);
    return result.data;
};
The code snippets demonstrate different aspects of each project. Superset's example shows creating a chart using its data model, while Querybook's example focuses on executing a query through its API.
The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
Pros of Metabase
- More mature and widely adopted project with a larger community
- Offers a user-friendly interface for non-technical users to create visualizations
- Supports a wider range of databases and data sources out-of-the-box
Cons of Metabase
- Less focused on collaborative query editing and version control
- May be more resource-intensive for large-scale deployments
- Limited customization options for advanced users compared to Querybook
Code Comparison
Metabase (Clojure):
(defn run-query
  [query]
  (let [database (db/select-one Database :id (:database query))
        driver (driver/database-id->driver (:id database))]
    (driver/execute-query driver query)))
Querybook (Python):
def run_query(query, engine):
    with engine.connect() as connection:
        result = connection.execute(query)
        return result.fetchall()
Both projects handle query execution, but Metabase's implementation is more abstracted and supports multiple database drivers, while Querybook's approach is more straightforward and relies on SQLAlchemy for database connections.
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Pros of Redash
- More mature project with a larger community and wider adoption
- Supports a broader range of data sources out-of-the-box
- Offers a more polished and user-friendly interface for non-technical users
Cons of Redash
- Less focus on collaboration features compared to Querybook
- May require more setup and configuration for complex environments
- Limited built-in version control for queries
Code Comparison
Redash query execution:
query_runner = get_query_runner(data_source.type, data_source.options)
data, error = query_runner.run_query(query, user)
Querybook query execution:
engine = get_query_engine(data_source)
result = engine.execute_query(query, user)
Both projects use similar approaches for query execution, with slight differences in naming conventions and method signatures. Redash's implementation appears more straightforward, while Querybook's may offer more flexibility in terms of engine customization.
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Pros of Zeppelin
- More mature project with a larger community and ecosystem
- Supports a wider range of programming languages and data processing frameworks
- Offers built-in visualization tools for data exploration
Cons of Zeppelin
- Heavier resource requirements and more complex setup process
- Steeper learning curve for new users
- Less focus on collaborative features compared to Querybook
Code Comparison
Zeppelin notebook cell (Python):
%python
import pandas as pd
df = pd.read_csv('data.csv')
df.head()
Querybook query cell (SQL):
SELECT *
FROM my_table
LIMIT 5;
Zeppelin uses a % prefix (for example, %python) to select the interpreter for a cell, so a single notebook can mix several languages and visualization blocks. Querybook is primarily focused on SQL queries and is more streamlined for SQL-based data exploration and collaboration within organizations.
Free, open-source SQL client for Windows and Mac 🦅
Pros of Falcon
- Built with React and TypeScript, offering a modern and type-safe frontend development experience
- Focuses on interactive data visualization and dashboarding capabilities
- Provides a more customizable and extensible architecture for building data applications
Cons of Falcon
- Less emphasis on collaborative features and team-oriented workflows
- May require more setup and configuration compared to Querybook's out-of-the-box solution
- Smaller community and fewer resources available for support and troubleshooting
Code Comparison
Querybook (Python):
from querybook.models import DataDoc

data_doc = DataDoc.create(
    title="My Data Document",
    owner_uid=user.id,
    environment_id=environment.id
)
Falcon (TypeScript):
import { Dashboard } from '@plotly/falcon';

const dashboard = new Dashboard({
    title: 'My Dashboard',
    layout: 'grid',
    items: [
        { type: 'chart', query: 'SELECT * FROM users' }
    ]
});
Both repositories offer solutions for data exploration and analysis, but they cater to different use cases. Querybook is more focused on collaborative SQL editing and execution, while Falcon emphasizes interactive data visualization and dashboard creation. The choice between the two depends on specific project requirements and team preferences.
README
Querybook
Querybook is a Big Data IDE that allows you to discover, create, and share data analyses, queries, and tables. Check out the full documentation & feature highlights here.
Features
- Organize analyses with rich text, queries, and charts
- Compose queries with autocompletion and hovering tooltip
- Use scheduling + charting in DataDocs to build dashboards
- Live query collaborations with others
- Add additional documentation to your tables
- Get lineage, sample queries, frequent user, search ranking based on past query runs
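As a rough illustration of how frequent users and sample queries could be derived from past query runs, here is a minimal sketch. It is not Querybook's implementation: the `past_runs` records, the `summarize_runs` function, and the shape of its output are all hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict

# Hypothetical log of past query runs: (user, table queried, query text).
past_runs = [
    ("alice", "events", "SELECT COUNT(*) FROM events"),
    ("bob", "events", "SELECT * FROM events LIMIT 5"),
    ("alice", "events", "SELECT ts FROM events"),
    ("carol", "users", "SELECT name FROM users"),
]


def summarize_runs(runs, top_n=2):
    """Group past runs by table and keep the top users and a few sample queries."""
    users_by_table = defaultdict(Counter)
    samples_by_table = defaultdict(list)
    for user, table, query in runs:
        users_by_table[table][user] += 1
        # Keep only the first `top_n` queries seen for each table as samples.
        if len(samples_by_table[table]) < top_n:
            samples_by_table[table].append(query)
    return {
        table: {
            "frequent_users": [u for u, _ in counts.most_common(top_n)],
            "sample_queries": samples_by_table[table],
        }
        for table, counts in users_by_table.items()
    }


summary = summarize_runs(past_runs)
for table, info in summary.items():
    print(table, info)
```

A real deployment would read these runs from stored query executions rather than an in-memory list.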
Getting started
Prerequisite
Please install Docker before trying out Querybook.
Quick setup
Pull this repo and run make. Visit http://localhost:10001 when the build completes.
For more details on installation, click here
Configuration
For infrastructure configuration, click here
For general configuration, click here
Supported Integrations
Query Engines
- Presto
- Hive
- Druid
- Snowflake
- BigQuery
- MySQL
- SQLite
- PostgreSQL
- and many more...
Authentication
- User/Password
- OAuth
- Google Cloud OAuth
- Okta OAuth
- GitHub OAuth
- Auth0 OAuth
- LDAP
Metastore
Can be used to fetch schema and table information for metadata enrichment.
- Hive Metastore
- SQLAlchemy Inspect
- AWS Glue Data Catalog
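To make "fetch schema and table information" concrete, the sketch below lists tables and their columns from a SQLite database using only Python's standard `sqlite3` module. It is a stand-alone illustration, not one of Querybook's metastore loaders; the `load_table_metadata` function and the sample tables are assumptions made for this example.

```python
import sqlite3


def load_table_metadata(conn: sqlite3.Connection) -> dict[str, list[tuple[str, str]]]:
    """Return {table_name: [(column_name, declared_type), ...]} for user tables."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    metadata = {}
    for (table_name,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        columns = conn.execute(f'PRAGMA table_info("{table_name}")').fetchall()
        metadata[table_name] = [(col[1], col[2]) for col in columns]
    return metadata


# Build a small in-memory database to inspect.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE events (user_id INTEGER, ts TEXT)")

table_metadata = load_table_metadata(conn)
for table_name, columns in table_metadata.items():
    print(table_name, columns)
```

The resulting mapping is the kind of information a metastore integration would hand to the UI for table search and documentation.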
Result Storage
Use one of the following to store query results.
- Database (MySQL, Postgres, etc)
- S3
- Google Cloud Storage
- Local file
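For the local file option, the idea can be sketched as writing a result set to a CSV file and reading it back later. This is only an assumption about what such storage might look like, not Querybook's result store; `store_result`, `read_result`, and the file name are hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def store_result(cursor: sqlite3.Cursor, path: Path) -> int:
    """Write the cursor's column names and rows to a CSV file; return the row count."""
    row_count = 0
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # cursor.description holds one tuple per column; the first item is the name.
        writer.writerow([col[0] for col in cursor.description])
        for row in cursor:
            writer.writerow(row)
            row_count += 1
    return row_count


def read_result(path: Path) -> list[list[str]]:
    """Read a stored result back as a list of rows (header first, all values as text)."""
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

result_path = Path(tempfile.mkdtemp()) / "query_result.csv"
stored_rows = store_result(conn.execute("SELECT id, name FROM t ORDER BY id"), result_path)
loaded = read_result(result_path)
print(stored_rows, loaded)
```

Streaming rows from the cursor keeps memory use flat for large results; the same writer could target S3 or Google Cloud Storage by swapping the file object.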
Result Export
Upload query results from Querybook to other tools for further analyses.
- Google Sheets Export
- Python export
Notification
Get notified upon completion of queries and DataDoc invitations via IM or email.
- Slack
User Interface
Query Editor
Charting
Scheduling
Lineage & Analytics
Contributing Back
See CONTRIBUTING.