pinterest / querybook

Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.

Top Related Projects

  • Superset (62,268): Apache Superset is a Data Visualization and Data Exploration Platform
  • Metabase (38,417): The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
  • Redash (26,178): Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
  • Zeppelin: Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
  • Falcon (5,132): Free, open-source SQL client for Windows and Mac 🦅

Quick Overview

Querybook is an open-source big data IDE developed by Pinterest. It provides a web-based interface for writing, running, and sharing SQL queries across multiple data engines. Querybook aims to improve data discovery, collaboration, and analysis within organizations.

Pros

  • Supports multiple data engines (e.g., Presto, Hive, Snowflake)
  • Offers collaborative features like query sharing and version control
  • Provides a user-friendly interface with syntax highlighting and auto-completion
  • Includes data lineage and metadata exploration capabilities

Cons

  • Requires significant setup and configuration for enterprise use
  • May have a learning curve for users new to big data querying
  • Limited customization options compared to some proprietary solutions
  • Dependency on specific backend technologies may limit flexibility

Getting Started

To set up Querybook locally (the README lists Docker as the only prerequisite):

  1. Clone the repository:

    git clone https://github.com/pinterest/querybook.git
    cd querybook
    
  2. Build and start the services:

    make
    
  3. When the build completes, access Querybook at http://localhost:10001 in your web browser.

For detailed installation and configuration instructions, refer to the project's documentation.
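Once the services are starting up, a small script can confirm that something is answering on the local port. The following is a minimal sketch using only the Python standard library; the helper name `is_reachable` is hypothetical and not part of Querybook, and the URL is the one given in the steps above.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP request to `url` gets any response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    # Adjust the URL if the port was changed in your setup.
    print(is_reachable("http://localhost:10001"))
```

The check only reports that a web server responded; it does not verify that Querybook itself finished initializing.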

Competitor Comparisons

62,268

Apache Superset is a Data Visualization and Data Exploration Platform

Pros of Superset

  • More mature and widely adopted project with a larger community and ecosystem
  • Offers a broader range of visualization types and chart options
  • Provides more advanced features for data exploration and dashboard creation

Cons of Superset

  • Steeper learning curve and more complex setup process
  • Requires more system resources and may be overkill for simpler use cases
  • Less focused on collaborative query editing and sharing compared to Querybook

Code Comparison

Superset (Python):

from superset import db
from superset.models.slice import Slice

# Named `chart` rather than `slice` to avoid shadowing Python's built-in.
chart = Slice(
    slice_name="My Chart",
    datasource_type="table",
    datasource_id=1,
    viz_type="bar",
    params="{}",  # JSON-encoded chart parameters
)
db.session.add(chart)
db.session.commit()

Querybook (TypeScript):

import { QueryExecutionAPI } from 'lib/api/QueryExecutionAPI';

const executeQuery = async (queryId: number) => {
  const result = await QueryExecutionAPI.executeQuery(queryId);
  return result.data;
};

The code snippets demonstrate different aspects of each project. Superset's example shows creating a chart using its data model, while Querybook's example focuses on executing a query through its API.

38,417

The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:

Pros of Metabase

  • More mature and widely adopted project with a larger community
  • Offers a user-friendly interface for non-technical users to create visualizations
  • Supports a wider range of databases and data sources out-of-the-box

Cons of Metabase

  • Less focused on collaborative query editing and version control
  • May be more resource-intensive for large-scale deployments
  • Limited customization options for advanced users compared to Querybook

Code Comparison

Metabase (Clojure):

(defn run-query
  [query]
  (let [database (db/select-one Database :id (:database query))
        driver   (driver/database-id->driver (:id database))]
    (driver/execute-query driver query)))

Querybook (Python):

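from sqlalchemy import text

def run_query(query, engine):
    # Wrap the raw SQL string in text(); SQLAlchemy 2.x rejects plain strings.
    with engine.connect() as connection:
        result = connection.execute(text(query))
        return result.fetchall()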

Both projects handle query execution, but Metabase's implementation is more abstracted and supports multiple database drivers, while Querybook's approach is more straightforward and relies on SQLAlchemy for database connections.
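The execute-and-fetch pattern described above can be tried without any external service. The sketch below swaps in Python's built-in `sqlite3` module as a stand-in for a real query engine; the table and data are made up for illustration, and this is not code from either project.

```python
import sqlite3


def run_query(query: str, connection: sqlite3.Connection) -> list:
    """Execute one SQL statement and return every row it produces."""
    cursor = connection.execute(query)
    return cursor.fetchall()


# In-memory database used purely as a stand-in for a real query engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "lin")])

rows = run_query("SELECT id, name FROM users ORDER BY id", conn)
print(rows)  # [(1, 'ada'), (2, 'lin')]
```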

26,178

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Pros of Redash

  • More mature project with a larger community and wider adoption
  • Supports a broader range of data sources out-of-the-box
  • Offers a more polished and user-friendly interface for non-technical users

Cons of Redash

  • Less focus on collaboration features compared to Querybook
  • May require more setup and configuration for complex environments
  • Limited built-in version control for queries

Code Comparison

Redash query execution:

query_runner = get_query_runner(data_source.type, data_source.options)
data, error = query_runner.run_query(query, user)

Querybook query execution:

engine = get_query_engine(data_source)
result = engine.execute_query(query, user)

Both projects use similar approaches for query execution, with slight differences in naming conventions and method signatures. Redash's implementation appears more straightforward, while Querybook's may offer more flexibility in terms of engine customization.
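Both snippets follow the same shape: look up a runner for a data source, then hand it the query. A minimal sketch of that lookup-then-run pattern is shown below. Every name in it (`register_runner`, the `_RUNNERS` registry, the toy `echo` runner) is hypothetical and chosen for illustration; it does not reproduce the internals of Redash or Querybook.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]
Runner = Callable[[str], Rows]

# Hypothetical registry mapping a data source type to a runner function.
_RUNNERS: Dict[str, Runner] = {}


def register_runner(source_type: str) -> Callable[[Runner], Runner]:
    """Decorator that records a runner under the given source type."""
    def decorator(func: Runner) -> Runner:
        _RUNNERS[source_type] = func
        return func
    return decorator


def get_query_runner(source_type: str) -> Runner:
    """Look up the runner for a source type, failing loudly if unknown."""
    try:
        return _RUNNERS[source_type]
    except KeyError:
        raise ValueError(f"no runner registered for {source_type!r}") from None


@register_runner("echo")
def run_echo(query: str) -> Rows:
    # Toy runner: returns the query text as a single one-column row.
    return [(query,)]


runner = get_query_runner("echo")
print(runner("SELECT 1"))  # [('SELECT 1',)]
```

Keeping the registry separate from the runners means a new engine can be added by registering one more function, without touching the dispatch code.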

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Pros of Zeppelin

  • More mature project with a larger community and ecosystem
  • Supports a wider range of programming languages and data processing frameworks
  • Offers built-in visualization tools for data exploration

Cons of Zeppelin

  • Heavier resource requirements and more complex setup process
  • Steeper learning curve for new users
  • Less focus on collaborative features compared to Querybook

Code Comparison

Zeppelin notebook cell (Python):

%python
import pandas as pd
df = pd.read_csv('data.csv')
df.head()

Querybook query cell (SQL):

SELECT *
FROM my_table
LIMIT 5;

Both tools support more than one language, but in different ways: Zeppelin selects an interpreter per cell with a % prefix (for example, %python), so a single notebook can mix languages and built-in visualization blocks, whereas Querybook is primarily focused on SQL. Zeppelin is therefore the more comprehensive data science platform, while Querybook offers a more streamlined experience for SQL-based data exploration and collaboration within organizations.
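To make the % prefix concrete, here is a simplified sketch of how a notebook might split a cell into an interpreter name and a body. It assumes the directive sits alone on the first line and falls back to SQL otherwise; the function `split_interpreter` is hypothetical and is not Zeppelin's actual parser.

```python
from typing import Tuple


def split_interpreter(cell: str, default: str = "sql") -> Tuple[str, str]:
    """Split a notebook cell into (interpreter, body).

    A first line of the form "%name" selects the interpreter; otherwise
    the whole cell is treated as `default`.
    """
    first_line, _, rest = cell.lstrip().partition("\n")
    if first_line.startswith("%") and len(first_line.strip()) > 1:
        return first_line.strip()[1:], rest
    return default, cell


print(split_interpreter("%python\nimport pandas as pd"))
# ('python', 'import pandas as pd')
print(split_interpreter("SELECT * FROM my_table LIMIT 5;"))
# ('sql', 'SELECT * FROM my_table LIMIT 5;')
```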

5,132

Free, open-source SQL client for Windows and Mac 🦅

Pros of Falcon

  • Built with React and TypeScript, offering a modern and type-safe frontend development experience
  • Focuses on interactive data visualization and dashboarding capabilities
  • Provides a more customizable and extensible architecture for building data applications

Cons of Falcon

  • Less emphasis on collaborative features and team-oriented workflows
  • May require more setup and configuration compared to Querybook's out-of-the-box solution
  • Smaller community and fewer resources available for support and troubleshooting

Code Comparison

Querybook (Python):

from querybook.models import DataDoc

data_doc = DataDoc.create(
    title="My Data Document",
    owner_uid=user.id,
    environment_id=environment.id
)

Falcon (TypeScript):

import { Dashboard } from '@plotly/falcon';

const dashboard = new Dashboard({
  title: 'My Dashboard',
  layout: 'grid',
  items: [
    { type: 'chart', query: 'SELECT * FROM users' }
  ]
});

Both repositories offer solutions for data exploration and analysis, but they cater to different use cases. Querybook is more focused on collaborative SQL editing and execution, while Falcon emphasizes interactive data visualization and dashboard creation. The choice between the two depends on specific project requirements and team preferences.

README

Querybook

Querybook is a Big Data IDE that allows you to discover, create, and share data analyses, queries, and tables. Check out the full documentation & feature highlights here.

Features

  • 📚 Organize analyses with rich text, queries, and charts
  • ✏️ Compose queries with autocompletion and hover tooltips
  • 📈 Use scheduling + charting in DataDocs to build dashboards
  • 🙌 Collaborate on queries live with others
  • 📝 Add documentation to your tables
  • 🧮 Get lineage, sample queries, frequent users, and search ranking based on past query runs
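As an illustration of how statistics such as frequent users could be derived from past query runs, the sketch below counts users per table from a made-up query log. The function name, the log format, and the data are all assumptions for this example; the source does not describe how Querybook computes these statistics.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def frequent_users(
    runs: Iterable[Tuple[str, str]], top_n: int = 2
) -> Dict[str, List[Tuple[str, int]]]:
    """Map each table to its `top_n` most frequent users.

    `runs` is an iterable of (user, table) pairs, one per past query run.
    """
    per_table: Dict[str, Counter] = defaultdict(Counter)
    for user, table in runs:
        per_table[table][user] += 1
    return {t: c.most_common(top_n) for t, c in per_table.items()}


# Made-up query log for illustration.
log = [
    ("ana", "events"),
    ("ben", "events"),
    ("ana", "events"),
    ("cy", "users"),
]
print(frequent_users(log))
# {'events': [('ana', 2), ('ben', 1)], 'users': [('cy', 1)]}
```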

Getting started

Prerequisite

Please install Docker before trying out Querybook.

Quick setup

Pull this repo and run make. Visit http://localhost:10001 when the build completes.

For more details on installation, click here

Configuration

For infrastructure configuration, click here.

For general configuration, click here.

Supported Integrations

Query Engines

  • Presto
  • Hive
  • Druid
  • Snowflake
  • BigQuery
  • MySQL
  • SQLite
  • PostgreSQL
  • and many more...

Authentication

  • User/Password
  • OAuth
    • Google Cloud OAuth
    • Okta OAuth
    • GitHub OAuth
    • Auth0 OAuth
  • LDAP

Metastore

Can be used to fetch schema and table information for metadata enrichment.

  • Hive Metastore
  • SQLAlchemy Inspect
  • AWS Glue Data Catalog

Result Storage

Use one of the following to store query results.

  • Database (MySQL, Postgres, etc)
  • S3
  • Google Cloud Storage
  • Local file

Result Export

Upload query results from Querybook to other tools for further analyses.

  • Google Sheets Export
  • Python export

Notification

Get notified upon completion of queries and DataDoc invitations via IM or email.

  • Email
  • Slack

User Interface

  • Query Editor
  • Charting
  • Scheduling
  • Lineage & Analytics

Contributing Back

See CONTRIBUTING.