EvalAI

Evaluating state of the art in AI

Quick Overview

EvalAI is an open-source platform for hosting and participating in AI challenges. It provides a scalable and flexible framework for organizing competitions, evaluating submissions, and managing leaderboards. EvalAI aims to streamline the process of benchmarking AI algorithms across various domains.
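As a rough illustration of what "managing leaderboards" involves, here is a minimal Python sketch that keeps each participant's best score and ranks participants by it. The `Submission` class, the `build_leaderboard` function, and the "higher is better" rule are all assumptions made for this example; they are not taken from EvalAI's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    participant: str
    score: float  # assumption: higher is better in this sketch


def build_leaderboard(submissions):
    """Return (rank, participant, best_score) rows, best score first."""
    best = {}
    for sub in submissions:
        # Keep only each participant's best score seen so far
        if sub.participant not in best or sub.score > best[sub.participant]:
            best[sub.participant] = sub.score
    # Sort by score descending; break ties alphabetically for a stable order
    ordered = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, name, score) for rank, (name, score) in enumerate(ordered, start=1)]


submissions = [
    Submission("team_a", 0.71),
    Submission("team_b", 0.83),
    Submission("team_a", 0.79),  # team_a improves on its earlier score
]
leaderboard = build_leaderboard(submissions)
```

A real platform would also need to handle metrics where lower is better, several metrics per phase, and visibility rules, none of which this sketch attempts.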

Pros

  • Highly customizable and extensible platform for hosting AI challenges
  • Supports both public and private challenges with fine-grained access control
  • Offers real-time evaluation and dynamic leaderboards
  • Integrates with popular cloud services for scalable infrastructure

Cons

  • Steep learning curve for setting up and configuring challenges
  • Limited documentation for advanced features and customizations
  • Requires significant server resources for large-scale competitions
  • May need additional development for specific use cases or integrations

Getting Started

To set up EvalAI locally for development:

  1. Clone the repository:

    git clone https://github.com/Cloud-CV/EvalAI.git
    cd EvalAI
    
  2. Set up the backend:

    cd backend
    pip install -r requirements/dev.txt
    python manage.py migrate
    python manage.py runserver
    
  3. Set up the frontend:

    cd frontend
    npm install
    npm start
    
  4. Access the application at http://localhost:3000

For detailed instructions on hosting challenges and participating in competitions, refer to the official documentation at https://evalai.readthedocs.io/.

Competitor Comparisons

Official Kaggle API

Pros of kaggle-api

  • Extensive API functionality for interacting with Kaggle's platform
  • Well-documented and actively maintained by Kaggle
  • Supports a wide range of Kaggle features, including dataset and kernel management

Cons of kaggle-api

  • Limited to Kaggle's ecosystem and not suitable for custom AI evaluation platforms
  • Requires Kaggle account and API credentials for usage
  • Less flexible for hosting private competitions or custom evaluation metrics

Code Comparison

kaggle-api:

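from kaggle.api.kaggle_api_extended import KaggleApi

# Authenticate with the credentials from ~/.kaggle/kaggle.json or environment variables
api = KaggleApi()
api.authenticate()
# 'dataset_name' is a placeholder; Kaggle expects the form 'owner/dataset-slug'
api.dataset_download_files('dataset_name')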

EvalAI:

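import requests

# The endpoint path is illustrative; check the EvalAI API documentation for the exact route
url = 'https://evalai.cloudcv.org/api/challenges/1/submission/'
headers = {'Authorization': 'Bearer <your_token>'}
# Open the file in a context manager so the handle is closed after the upload
with open('submission.json', 'rb') as f:
    response = requests.post(url, headers=headers, files={'input_file': f})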

Key Differences

  • EvalAI is an open-source platform for custom AI challenges, while kaggle-api is specific to Kaggle's platform
  • EvalAI offers more flexibility in creating and managing custom evaluation metrics and leaderboards
  • kaggle-api provides deeper integration with Kaggle's features, such as dataset and kernel management
  • EvalAI is better suited for organizations wanting to host their own AI challenges, while kaggle-api is ideal for Kaggle users and competitions
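To make the point about custom evaluation metrics more concrete, the sketch below shows what a host-supplied scoring function might look like. The function name `evaluate`, its parameters, the JSON file format, and the shape of the returned dictionary are assumptions for illustration only; consult the EvalAI documentation for the interface the platform actually expects.

```python
import json


def evaluate(test_annotation_file, user_submission_file, phase_codename, **kwargs):
    """Score a submission against ground truth and return metrics for one phase."""
    with open(test_annotation_file) as f:
        truth = json.load(f)        # assumed format: {"item_id": label, ...}
    with open(user_submission_file) as f:
        predicted = json.load(f)    # assumed to use the same format

    # Custom metric: fraction of ground-truth items labelled correctly;
    # items missing from the submission count as wrong
    correct = sum(1 for key, label in truth.items() if predicted.get(key) == label)
    accuracy = correct / len(truth) if truth else 0.0

    # Assumed return shape: one entry per dataset split, mapping metric names to values
    return {"result": [{f"{phase_codename}_split": {"Accuracy": accuracy}}]}
```

Because the metric is ordinary Python, a host could swap in any scoring rule (F1, BLEU, a task-specific measure) without changing how submissions are collected.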

A toolkit for developing and comparing reinforcement learning algorithms.

Pros of gym

  • Widely adopted and well-established in the reinforcement learning community
  • Extensive documentation and tutorials available
  • Supports a broad range of environments, from classic control to robotics

Cons of gym

  • Primarily focused on reinforcement learning, limiting its use in other AI domains
  • Less emphasis on collaborative evaluation and benchmarking
  • Requires more setup and configuration for custom environments

Code Comparison

gym:

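import gym

env = gym.make('CartPole-v1')
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()  # random policy
    observation, reward, terminated, truncated, info = env.step(action)
    # Start a new episode once the current one ends
    if terminated or truncated:
        observation, info = env.reset()
env.close()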

EvalAI:

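# Illustrative sketch: `evalai_interface` and `make_submission` are assumed names,
# not a documented EvalAI API
from evalai_interface import EvalAI
evalai = EvalAI(auth_token='your_auth_token')
challenge_id = 1
phase_id = 1
submission_file = 'path/to/submission.json'
evalai.make_submission(challenge_id, phase_id, submission_file)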

EvalAI focuses on providing a platform for AI challenges and evaluations, while gym is primarily a toolkit for developing and comparing reinforcement learning algorithms. EvalAI offers a more collaborative and competition-oriented approach, whereas gym provides a standardized environment for individual experimentation and algorithm development.

Open source platform for the machine learning lifecycle

Pros of MLflow

  • Broader scope: Covers the entire ML lifecycle including experimentation, reproducibility, deployment, and model registry
  • More mature project with a larger community and extensive documentation
  • Integrates well with popular ML frameworks and cloud platforms

Cons of MLflow

  • Steeper learning curve due to its comprehensive feature set
  • May be overkill for simple ML projects or those focused solely on evaluation
  • Less specialized for AI competition hosting compared to EvalAI

Code Comparison

MLflow tracking example:

import mlflow
import mlflow.sklearn

# `model` is assumed to be a fitted scikit-learn estimator
with mlflow.start_run():
    mlflow.log_param("param1", 5)
    mlflow.log_metric("accuracy", 0.85)
    mlflow.sklearn.log_model(model, "model")

EvalAI submission example:

import requests

# The endpoint path and field names are illustrative
url = "https://evalai.cloudcv.org/api/submissions/"
headers = {"Authorization": "Bearer <your_token>"}
data = {"challenge_phase": "<phase_id>"}
# Send the file only through `files`; a file object placed in `data` is not uploaded correctly
with open("predictions.json", "rb") as f:
    response = requests.post(url, headers=headers, files={"input_file": f}, data=data)

The two repositories serve different primary purposes: MLflow covers the broader ML lifecycle, while EvalAI specializes in hosting and evaluating AI competitions. The code examples reflect this, with MLflow emphasizing experiment tracking and model logging and EvalAI emphasizing submission handling.

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Pros of wandb

  • More comprehensive experiment tracking and visualization tools
  • Broader integration with popular ML frameworks and libraries
  • Active community and frequent updates

Cons of wandb

  • Steeper learning curve for beginners
  • Requires more setup and configuration for advanced features

Code Comparison

wandb:

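import wandb

# Pass hyperparameters through `config` so they are recorded with the run
wandb.init(project="my-project", config={"learning_rate": 0.01, "epochs": 100})
# `model`, `X`, `y`, `accuracy`, and `loss` are assumed to come from your training code
model.fit(X, y)
wandb.log({"accuracy": accuracy, "loss": loss})
wandb.finish()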

EvalAI:

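# Illustrative sketch: this import path and the `submit` signature are assumed,
# not a documented evalai-cli API
from evalai_cli.utils.submission import submit

submit(
    challenge_id=1234,
    phase_id=5678,
    file_path="submission.json"
)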

Summary

wandb offers more comprehensive experiment tracking and visualization tools, with broader integration across ML frameworks. It has an active community and frequent updates. However, it may have a steeper learning curve and require more setup for advanced features.

EvalAI focuses on hosting and managing AI challenges, providing a simpler interface for submissions and evaluations. It's more straightforward for beginners but may have limited features compared to wandb's extensive tracking capabilities.

The code examples show wandb's focus on experiment tracking and logging, while EvalAI's code is centered around challenge submissions.

README

Join the chat at https://gitter.im/Cloud-CV/EvalAI

EvalAI is an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale.

In recent years, it has become increasingly difficult to compare an algorithm solving a given task with other existing approaches. These comparisons suffer from minor differences in algorithm implementation, use of non-standard dataset splits, and different evaluation metrics. By providing a central leaderboard and submission interface, together with swift and robust backends based on map-reduce frameworks that speed up evaluation on the fly, EvalAI aims to make it easier for researchers to reproduce results from technical papers and perform reliable and accurate quantitative analyses.

Features

  • Custom evaluation protocols and phases: We allow the creation of an arbitrary number of evaluation phases and dataset splits, compatibility with any programming language, and organization of results in both public and private leaderboards.

  • Remote evaluation: Certain large-scale challenges need special compute capabilities for evaluation. If the challenge needs extra computational power, challenge organizers can easily add their own cluster of worker nodes to process participant submissions while we take care of hosting the challenge, handling user submissions, and maintaining the leaderboard.

  • Evaluation inside environments: EvalAI lets participants submit code for their agent in the form of docker images which are evaluated against test environments on the evaluation server. During evaluation, the worker fetches the image, test environment, and the model snapshot and spins up a new container to perform evaluation.

  • CLI support: evalai-cli is designed to extend the functionality of the EvalAI web application to your command line to make the platform more accessible and terminal-friendly.

  • Portability: EvalAI has been designed with scalability and portability in mind from the very inception of the idea. Most of the components rely heavily on open-source technologies: Docker, Django, Node.js, and PostgreSQL.

  • Faster evaluation: We warm up the worker nodes at start-up by importing the challenge code and pre-loading the dataset in memory. We also split the dataset into small chunks that are simultaneously evaluated on multiple cores. These simple tricks result in faster evaluation and reduce the evaluation time by an order of magnitude in some cases.
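The chunking idea in the "Faster evaluation" item can be sketched in a few lines of Python. This is only an illustration of the general technique (split, score chunks in parallel, merge the partial counts), not EvalAI's worker code; the function names, the accuracy metric, and the pluggable `executor_cls` parameter are assumptions made for the example.

```python
from concurrent.futures import ProcessPoolExecutor


def count_correct(chunk):
    """Count matching (prediction, label) pairs in one chunk."""
    return sum(1 for prediction, label in chunk if prediction == label)


def split_into_chunks(pairs, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]


def chunked_accuracy(predictions, labels, chunk_size=1000,
                     executor_cls=ProcessPoolExecutor):
    """Compute accuracy by scoring chunks in parallel and merging the counts."""
    pairs = list(zip(predictions, labels))
    if not pairs:
        return 0.0
    chunks = split_into_chunks(pairs, chunk_size)
    # Each chunk is scored independently, so chunks can run on separate cores
    with executor_cls() as pool:
        correct = sum(pool.map(count_correct, chunks))
    return correct / len(pairs)
```

When using the default process pool, call `chunked_accuracy` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Passing `ThreadPoolExecutor` as `executor_cls` is handy for quick local tests, although threads will not spread CPU-bound scoring across cores.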

Goal

Our ultimate goal is to build a centralized platform to host, participate in, and collaborate on AI challenges organized around the globe, and we hope to help in benchmarking progress in AI.

Installation instructions

Setting up EvalAI on your local machine is really easy. You can set up EvalAI using Docker. The steps are:

  1. Install docker and docker-compose on your machine.

  2. Get the source code onto your machine via git.

    git clone https://github.com/Cloud-CV/EvalAI.git evalai && cd evalai
    
  3. Build and run the Docker containers. This might take a while.

    docker-compose up --build
    
  4. That's it. Open a web browser and go to http://127.0.0.1:8888. Three users are created by default:

    SUPERUSER- username: admin password: password
    HOST USER- username: host password: password
    PARTICIPANT USER- username: participant password: password

If you face any issues during installation, please see our common errors during installation page.

Citing EvalAI

If you are using EvalAI for hosting challenges, please cite the following technical report:

@article{EvalAI,
    title   =  {EvalAI: Towards Better Evaluation Systems for AI Agents},
    author  =  {Deshraj Yadav and Rishabh Jain and Harsh Agrawal and Prithvijit
                Chattopadhyay and Taranjeet Singh and Akash Jain and Shiv Baran
                Singh and Stefan Lee and Dhruv Batra},
    year    =  {2019},
    volume  =  {arXiv:1902.03570}
}

Team

EvalAI is currently maintained by Rishabh Jain and Gunjan Chhablani. A non-exhaustive list of other major contributors includes: Deshraj Yadav, Ram Ramrakhya, Akash Jain, Taranjeet Singh, Shiv Baran Singh, Harsh Agrawal, Prithvijit Chattopadhyay, Devi Parikh and Dhruv Batra.

Contribution guidelines

If you are interested in contributing to EvalAI, follow our contribution guidelines.