airbnb / airpal

Web UI for PrestoDB.


Top Related Projects

16,120

The official home of the Presto distributed SQL query engine for big data

5,524

Apache Hive

40,184

Apache Spark - A unified analytics engine for large-scale data processing

1,938

Apache Drill is a distributed MPP query layer for self describing data

1,134

Apache Impala

Quick Overview

Airpal is a web-based query execution tool built on top of Facebook's PrestoDB. It provides a user-friendly interface for data analysts and scientists to run SQL queries, explore data, and collaborate on data analysis tasks. Airpal aims to make PrestoDB more accessible and easier to use for non-technical users.

Pros

  • User-friendly web interface for running SQL queries
  • Built-in collaboration features for sharing queries and results
  • Supports saved queries and a searchable history of past queries
  • Integrates well with existing PrestoDB deployments

Cons

  • Limited to PrestoDB as the underlying query engine
  • May require additional setup and maintenance compared to using PrestoDB directly
  • Less flexible than raw SQL for advanced users
  • Deprecated: per the README, most functionality and feature work has moved to SQL Lab in Apache Superset

Getting Started

To get started with Airpal, follow these steps:

  1. Clone the repository:

    git clone https://github.com/airbnb/airpal.git
    
  2. Build the project:

    cd airpal
    ./gradlew clean shadowJar
    
  3. Configure Airpal by creating a reference.yml file based on the provided example:

    cp reference.example.yml reference.yml
    
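  4. Edit reference.yml to set your MySQL credentials and other settings (plus S3 credentials if you store results in S3). Create the MySQL database first, then run the db migrate command shown under "Steps to launch" in the README before the first start.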

  5. Run Airpal:

    java -server -Duser.timezone=UTC -cp build/libs/airpal-*-all.jar com.airbnb.airpal.AirpalApplication server reference.yml
    
  6. Access the Airpal web interface at http://localhost:8081 (or the configured port).

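Note: Before starting, ensure you have Java 7 or higher, Gradle 2.2 or higher, a MySQL database, and a Presto cluster running version 0.77 or higher, as listed under Requirements in the README.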

Competitor Comparisons

16,120

The official home of the Presto distributed SQL query engine for big data

Pros of Presto

  • More powerful and flexible SQL engine for big data analytics
  • Supports a wider range of data sources and connectors
  • Active development with frequent updates and improvements

Cons of Presto

  • Steeper learning curve and more complex setup
  • Requires more resources to run effectively
  • Less user-friendly interface for non-technical users

Code Comparison

Airpal (JavaScript):

var AirpalApp = React.createClass({
  render: function() {
    return (
      <div className="airpal-app">
        <Header />
        <ExecutionController />
      </div>
    );
  }
});

Presto (Java):

public class PrestoServer
        implements Runnable
{
    public static void main(String[] args)
            throws Exception
    {
        new PrestoServer().run();
    }
}

Presto offers a more robust SQL engine with broader data source support, making it suitable for complex big data analytics. However, it requires more technical expertise and resources to set up and maintain. Airpal provides a more user-friendly interface for running Presto queries but with limited functionality compared to Presto's full capabilities. The code comparison shows Airpal's focus on the frontend user interface, while Presto's core is implemented in Java for performance and scalability.

5,524

Apache Hive

Pros of Hive

  • More comprehensive data warehousing solution with broader functionality
  • Supports multiple data formats and storage systems
  • Larger community and ecosystem with extensive documentation

Cons of Hive

  • Steeper learning curve and more complex setup
  • Slower query performance for small to medium-sized datasets
  • Requires more resources and maintenance

Code Comparison

Hive SQL query:

SELECT customer_id, SUM(order_total) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(order_total) > 1000;

Airpal query (using Presto SQL):

SELECT customer_id, SUM(order_total) AS total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(order_total) > 1000;

Key Differences

  • Hive is a full-fledged data warehousing solution, while Airpal is a web-based query interface for Presto
  • Hive supports multiple storage systems, whereas Airpal can query only what the connected Presto cluster exposes
  • Hive has a more complex architecture, while Airpal focuses on simplifying the query process for end-users
  • Hive offers more advanced features like indexing and partitioning, while Airpal emphasizes ease of use and collaboration

Both tools have their strengths, with Hive being more suitable for large-scale data processing and Airpal excelling in user-friendly ad-hoc querying.

40,184

Apache Spark - A unified analytics engine for large-scale data processing

Pros of Spark

  • Powerful distributed computing framework for big data processing
  • Supports multiple programming languages (Scala, Java, Python, R)
  • Extensive ecosystem with libraries for machine learning, graph processing, and streaming

Cons of Spark

  • Steeper learning curve and more complex setup compared to Airpal
  • Requires more resources and infrastructure to run effectively
  • May be overkill for simpler data analysis tasks

Code Comparison

Airpal (SQL query):

SELECT user_id, COUNT(*) as booking_count
FROM bookings
WHERE booking_date >= '2023-01-01'
GROUP BY user_id
HAVING COUNT(*) > 5

Spark (PySpark):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("BookingAnalysis").getOrCreate()
df = spark.read.table("bookings")
result = df.filter(df.booking_date >= '2023-01-01') \
           .groupBy("user_id") \
           .count() \
           .filter("count > 5")

While Airpal focuses on SQL queries for data analysis, Spark offers a more programmatic approach with support for complex data processing pipelines and advanced analytics. Airpal is more user-friendly for SQL-based querying, while Spark provides greater flexibility and scalability for large-scale data processing tasks.

1,938

Apache Drill is a distributed MPP query layer for self describing data

Pros of Drill

  • Supports a wider range of data sources, including Hadoop, NoSQL, and cloud storage
  • Offers more advanced query capabilities with SQL-like syntax
  • Provides better scalability for large-scale data processing

Cons of Drill

  • Steeper learning curve and more complex setup compared to Airpal
  • Requires more system resources and infrastructure
  • Less user-friendly interface for non-technical users

Code Comparison

Drill query example:

SELECT * FROM dfs.`/path/to/data/file.json`
WHERE age > 30
LIMIT 10;

Airpal query example:

SELECT * FROM hive.default.users
WHERE age > 30
LIMIT 10;

Key Differences

  1. Query Language: Drill uses a SQL-like syntax with extensions for nested data, while Airpal primarily uses Presto SQL.
  2. Data Sources: Drill supports a broader range of data sources, including schema-less data, while Airpal focuses on Presto-compatible sources.
  3. User Interface: Airpal provides a more user-friendly web interface, whereas Drill is primarily command-line driven with an optional web console.
  4. Performance: Drill is a query engine designed for high-performance querying of large-scale datasets, while Airpal is only a front end, so its query performance is that of the underlying Presto cluster.
  5. Community and Support: Drill is an Apache project with a larger community and more extensive documentation, while Airpal is deprecated, with most feature work moved to SQL Lab in Apache Superset.

1,134

Apache Impala

Pros of Impala

  • Highly scalable and performant SQL query engine for Hadoop
  • Supports a wide range of data formats and storage systems
  • Offers low-latency queries on large datasets

Cons of Impala

  • Requires more complex setup and maintenance
  • Limited to the Hadoop ecosystem
  • May have higher resource requirements

Code Comparison

Airpal (JavaScript):

var AirpalApp = React.createClass({
  render: function() {
    return (
      <div className="airpal-app">
        <Header />
        <ExecutionBar />
        <TabArea />
      </div>
    );
  }
});

Impala (C++):

Status ImpalaServer::ExecutePlannedStmt(
    const TQueryCtx& query_ctx,
    shared_ptr<SessionState> session_state,
    const TExecRequest& exec_request,
    TExecResult* exec_result) {
  // Implementation details
}

Airpal is a web-based query interface for PrestoDB, while Impala is a distributed SQL query engine. Airpal focuses on user-friendly query authoring and result retrieval, whereas Impala emphasizes high-performance analytics on large-scale data. Airpal pairs a Java (Dropwizard) back end with a JavaScript front end, while Impala is implemented in C++ for performance. The choice between them depends on specific use cases, existing infrastructure, and performance requirements.

README

DEPRECATED - Airpal

Airpal is deprecated, and most functionality and feature work has been moved to SQL Lab within Apache Superset.


Airpal is a web-based query execution tool that uses Facebook's PrestoDB to make authoring queries and retrieving results simple for users. Airpal lets users find tables, see metadata, browse sample rows, write and edit queries, and submit them, all in a web interface. Once queries are running, users can track their progress and, when they finish, get the results back through the browser as a CSV (download it or share it with friends). The results of a query can be used to generate a new Hive table for subsequent analysis, and Airpal maintains a searchable history of all queries run within the tool.

[Screenshot: Airpal UI]

Features

  • Optional Access Control
  • Syntax highlighting
  • Results exported to a CSV for download or a Hive table
  • Query history for self and others
  • Saved queries
  • Table finder to search for appropriate tables
  • Table explorer to visualize schema of table and first 1000 rows
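
The "first 1000 rows" preview in the table explorer amounts to a bounded SELECT against a fully qualified table. The following is a minimal Python sketch of how such a preview query could be built. The function name and the quoting helper are illustrative assumptions, not Airpal's actual implementation.

```python
def preview_query(catalog: str, schema: str, table: str, limit: int = 1000) -> str:
    """Build a bounded preview query for a fully qualified Presto table.

    Hypothetical helper for illustration; Airpal's own code may differ.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")

    def quote(identifier: str) -> str:
        # Presto delimits identifiers with double quotes;
        # an embedded double quote is escaped by doubling it.
        return '"' + identifier.replace('"', '""') + '"'

    qualified = ".".join(quote(part) for part in (catalog, schema, table))
    return f"SELECT * FROM {qualified} LIMIT {limit}"


print(preview_query("hive", "default", "users"))
```

Quoting each part of the name keeps table names with unusual characters from breaking the statement, and the explicit LIMIT keeps the preview cheap even on very large tables.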

Requirements

  • Java 7 or higher
  • MySQL database
  • Presto 0.77 or higher
  • S3 bucket (to store CSVs)
  • Gradle 2.2 or higher

Steps to launch

  1. Build Airpal

    We'll be using Gradle to build the back-end Java code and a Node.js-based build pipeline (Browserify and Gulp) to build the front-end JavaScript code.

    If you have node and npm installed locally, and wish to use them, simply run:

    ./gradlew clean shadowJar -Dairpal.useLocalNode
    

    Otherwise, node and npm will be automatically downloaded for you by running:

    ./gradlew clean shadowJar
    

    To build against a specific Presto version, pass -Dairpal.prestoVersion:

    ./gradlew -Dairpal.prestoVersion=0.145 clean shadowJar
    
  2. Create a MySQL database for Airpal. We recommend you call it airpal and will assume that for future steps.

  3. Create a reference.yml file to store your configuration options.

    Start by copying over the example configuration, reference.example.yml.

    cp reference.example.yml reference.yml
    

    Then edit it to specify your MySQL credentials, and your S3 credentials if using S3 as a storage layer (Airpal defaults to local file storage, for demonstration purposes).

  4. Migrate your database.

    java -Duser.timezone=UTC \
         -cp build/libs/airpal-*-all.jar com.airbnb.airpal.AirpalApplication db migrate reference.yml
    
  5. Run Airpal.

    java -server \
         -Duser.timezone=UTC \
         -cp build/libs/airpal-*-all.jar com.airbnb.airpal.AirpalApplication server reference.yml
    
  6. Visit Airpal. Assuming you used the default settings in reference.yml, you can now open http://localhost:8081 to use Airpal. Note that you might have to change the host, depending on where you deployed it.
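
Steps 4 and 5 invoke the same main class and differ only in the Dropwizard subcommand (db migrate versus server) and the -server JVM flag. As a rough sketch, a small Python helper could assemble both commands in the right order. The helper names below are hypothetical; the jar pattern, main class, and flags are copied from the commands above.

```python
import glob
import shlex
from typing import List, Optional

MAIN_CLASS = "com.airbnb.airpal.AirpalApplication"  # main class from the README commands


def find_shadow_jar(pattern: str = "build/libs/airpal-*-all.jar") -> Optional[str]:
    """Resolve the jar wildcard ourselves, since no shell expands it for us."""
    matches = sorted(glob.glob(pattern))
    return matches[-1] if matches else None


def airpal_command(jar: str, subcommand: List[str], config: str = "reference.yml",
                   server_jvm: bool = False) -> List[str]:
    """Build one java invocation as an argument list."""
    command = ["java"]
    if server_jvm:
        command.append("-server")
    command += ["-Duser.timezone=UTC", "-cp", jar, MAIN_CLASS, *subcommand, config]
    return command


def launch_sequence(jar: str) -> List[List[str]]:
    """Migration first, then the server, mirroring steps 4 and 5."""
    return [
        airpal_command(jar, ["db", "migrate"]),
        airpal_command(jar, ["server"], server_jvm=True),
    ]


if __name__ == "__main__":
    # Placeholder jar name if nothing has been built yet.
    jar = find_shadow_jar() or "airpal-all.jar"
    for command in launch_sequence(jar):
        print(shlex.join(command))
```

To actually launch, each list could be passed to subprocess.run(command, check=True), so that a failed migration stops the sequence before the server starts.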

Note: To override the configuration specified in reference.yml, you may specify certain settings on the command line in the traditional Dropwizard fashion, like so:

java -Ddw.prestoCoordinator=http://presto-coordinator-url.com \
     -Ddw.s3AccessKey=$ACCESS_KEY \
     -Ddw.s3SecretKey=$SECRET_KEY \
     -Ddw.s3Bucket=airpal \
     -Ddw.dataSourceFactory.url=jdbc:mysql://127.0.0.1:3306/airpal \
     -Ddw.dataSourceFactory.user=airpal \
     -Ddw.dataSourceFactory.password=$YOUR_PASSWORD \
     -Duser.timezone=UTC \
     -cp build/libs/airpal-*-all.jar com.airbnb.airpal.AirpalApplication db migrate reference.yml
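
In Dropwizard, a system property of the form -Ddw.<path>=<value> overrides the configuration key at that dotted path, so nested settings such as dataSourceFactory.url map to nested keys in the YAML file. The Python sketch below flattens a nested dictionary into such flags. The function names are hypothetical, and the nesting shown is inferred from the flag names above rather than from the actual layout of reference.yml.

```python
from typing import Any, Dict, List


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested settings into dotted paths, e.g. dataSourceFactory.url."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def dropwizard_overrides(config: Dict[str, Any]) -> List[str]:
    """Turn settings into -Ddw.<path>=<value> JVM flags."""
    return [f"-Ddw.{path}={value}" for path, value in flatten(config).items()]


# Assumed structure, inferred from the override flags in the README.
overrides = dropwizard_overrides({
    "s3Bucket": "airpal",
    "dataSourceFactory": {
        "url": "jdbc:mysql://127.0.0.1:3306/airpal",
        "user": "airpal",
    },
})
for flag in overrides:
    print(flag)
```

Secrets such as the S3 keys or the database password are better read from environment variables at launch time, as the README command does, than written into a script.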

Compatibility Chart

Airpal Version    Presto Versions Tested
0.1               0.77, 0.87, 0.145
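
Presto version strings such as 0.77 and 0.145 do not compare correctly as text or as decimals (0.145 is the newer release), so a compatibility check needs to compare the numeric components. A minimal Python sketch follows, assuming the minimum of 0.77 from the requirements and the tested versions 0.77, 0.87, and 0.145 from the chart; the function names and status labels are illustrative.

```python
from typing import Tuple

MINIMUM_PRESTO = "0.77"                      # from the Requirements list
TESTED_PRESTO = ("0.77", "0.87", "0.145")    # from the compatibility chart


def parse_version(version: str) -> Tuple[int, ...]:
    """Split '0.145' into (0, 145) so components compare numerically."""
    return tuple(int(part) for part in version.split("."))


def check_presto(version: str) -> str:
    """Classify a Presto version as 'unsupported', 'tested', or 'untested'."""
    if parse_version(version) < parse_version(MINIMUM_PRESTO):
        return "unsupported"
    if version in TESTED_PRESTO:
        return "tested"
    return "untested"


for candidate in ("0.76", "0.87", "0.100", "0.145"):
    print(candidate, check_presto(candidate))
```

A plain string comparison would rank "0.145" below "0.77" and wrongly reject a tested release, which is why the sketch parses each component as an integer first.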

In the Wild

Organizations and projects using airpal can list themselves here.

Contributors