EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Top Related Projects
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
Segment Anything in High Quality [NeurIPS 2023]
Effortless AI-assisted data labeling with AI support from YOLO, Segment Anything (SAM+SAM2), MobileSAM!!
Quick Overview
EfficientSAM is an optimized implementation of the Segment Anything Model (SAM) designed for efficient segmentation tasks. It aims to reduce the computational requirements of the original SAM while maintaining high-quality segmentation results, making it more accessible for various applications and hardware configurations.
Pros
- Significantly reduced model size and computational requirements compared to the original SAM
- Maintains high-quality segmentation results despite optimizations
- Faster inference times, enabling real-time or near-real-time applications
- Improved accessibility for deployment on resource-constrained devices
Cons
- May have slightly reduced accuracy compared to the full SAM model in some complex scenarios
- Limited documentation and examples compared to the original SAM implementation
- Potential compatibility issues with existing SAM-based workflows due to architectural changes
- Relatively new project, which may lead to fewer community resources and support
Code Examples
- Loading the EfficientSAM model (the builder functions below are the ones the repository documents; checkpoints are bundled under its weights folder):
from efficient_sam.build_efficient_sam import build_efficient_sam_vits

model = build_efficient_sam_vits()  # small (ViT-S) variant
model.eval()
- Performing point-prompted segmentation (a sketch; the prompt tensor shapes follow EfficientSAM_example.py, and the image path and point coordinates are placeholders):
import torch
from PIL import Image
from torchvision.transforms import ToTensor
image = Image.open('image.jpg').convert('RGB')
img_tensor = ToTensor()(image)[None]      # (1, 3, H, W)
pts = torch.tensor([[[[200.0, 300.0]]]])  # (batch, queries, points, 2)
lbls = torch.tensor([[[1]]])              # 1 = foreground point
with torch.no_grad():
    logits, iou = model(img_tensor, pts, lbls)
masks = (logits[0, 0] > 0).numpy()        # threshold logits to binary masks
- Visualizing segmentation results:
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
plt.imshow(image)
for mask in masks:
    plt.imshow(mask, alpha=0.5)  # overlay each candidate mask
plt.axis('off')
plt.show()
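The model typically returns several candidate masks per prompt along with predicted IoU scores, so in practice you would often keep only the highest-scoring candidate (for example, masks[iou[0, 0].argmax()]).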
Getting Started
To get started with EfficientSAM, follow these steps:
1. Clone the repository:
git clone https://github.com/yformer/EfficientSAM.git
cd EfficientSAM
2. Install the required dependencies:
pip install -r requirements.txt
3. Download the pre-trained model:
wget https://github.com/yformer/EfficientSAM/releases/download/v1.0/efficient_sam_s.pt
4. Use the model in your Python script, as shown in the code examples above and in the smoke test below.
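After these steps, a quick smoke test can confirm everything is wired up. This is a minimal sketch, assuming the builder functions documented in the repository (see efficient_sam/build_efficient_sam.py and EfficientSAM_example.py); the dummy image and point are placeholders:
import torch
from efficient_sam.build_efficient_sam import build_efficient_sam_vits

model = build_efficient_sam_vits().eval()
dummy_img = torch.zeros(1, 3, 1024, 1024)  # placeholder RGB batch
pts = torch.tensor([[[[512.0, 512.0]]]])   # one point prompt (x, y)
lbls = torch.tensor([[[1]]])               # 1 = foreground
with torch.no_grad():
    logits, iou = model(dummy_img, pts, lbls)
print(logits.shape, iou.shape)             # candidate mask logits and IoU scores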
Competitor Comparisons
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Pros of segment-anything
- Highly versatile and can handle a wide range of segmentation tasks
- Backed by Meta AI, with extensive documentation and community support
- Offers pre-trained models for immediate use
Cons of segment-anything
- Computationally intensive, requiring significant resources for real-time applications
- Large model size, which can be challenging for deployment on edge devices
- May be overkill for simpler segmentation tasks
Code Comparison
segment-anything:
from segment_anything import sam_model_registry, SamPredictor
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
masks, _, _ = predictor.predict(point_coords=input_point, point_labels=input_label)
EfficientSAM:
from efficient_sam.build_efficient_sam import build_efficient_sam_vits

# EfficientSAM ships builder functions rather than a registry/predictor pair;
# point prompts are passed directly to the model's forward call
# (see EfficientSAM_example.py in the repository for the exact interface).
model = build_efficient_sam_vits()
logits, iou = model(image_tensor, input_points, input_labels)
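The practical difference is mostly in construction: segment-anything picks an architecture out of a registry and wraps it in a predictor, whereas EfficientSAM exposes one builder function per variant and takes prompts directly in the forward call.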
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Pros of Grounded-Segment-Anything
- Integrates grounding DINO for more accurate object detection and segmentation
- Supports text-to-mask generation, enhancing versatility in various applications
- Offers a wider range of pre-trained models and datasets
Cons of Grounded-Segment-Anything
- Higher computational requirements due to additional components
- More complex setup and installation process
- Potentially slower inference time compared to EfficientSAM
Code Comparison
EfficientSAM:
from efficient_sam.build_efficient_sam import build_efficient_sam_vits

# EfficientSAM uses per-variant builder functions rather than a model
# registry; prompts are passed directly to the model's forward call.
model = build_efficient_sam_vits()
logits, iou = model(image_tensor, input_points, input_labels)
Grounded-Segment-Anything:
from segment_anything import sam_model_registry, SamPredictor
from groundingdino.util.inference import load_model, load_image, predict

sam = sam_model_registry["vit_h"](checkpoint="path/to/sam_checkpoint")
# load_model takes a model config path as well as a checkpoint path
grounding_dino_model = load_model("path/to/groundingdino_config.py", "path/to/groundingdino_checkpoint")
predictor = SamPredictor(sam)
image_source, image = load_image("path/to/image.jpg")
# Detect boxes for the text prompt, then prompt SAM with a detected box
# (in practice the normalized boxes must first be rescaled to pixel xyxy).
boxes, logits, phrases = predict(grounding_dino_model, image, text_prompt, box_threshold=0.35, text_threshold=0.25)
predictor.set_image(image_source)
masks, _, _ = predictor.predict(point_coords=None, point_labels=None, box=boxes[0])
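In short, this is a two-stage pipeline: Grounding DINO turns the text prompt into candidate boxes, and SAM turns those boxes into masks, which is where the extra setup and compute cost comes from.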
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
Pros of MobileSAM
- Significantly faster inference speed, making it more suitable for mobile and real-time applications
- Smaller model size, requiring less storage and memory
- Maintains comparable performance to the original SAM model
Cons of MobileSAM
- Slightly lower accuracy compared to EfficientSAM in some scenarios
- Less flexibility in terms of input image sizes and resolutions
Code Comparison
MobileSAM:
from mobile_sam import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor
sam = sam_model_registry["vit_t"](checkpoint="mobile_sam.pt")
mask_generator = SamAutomaticMaskGenerator(sam)
masks = mask_generator.generate(image)
EfficientSAM:
from efficient_sam.build_efficient_sam import build_efficient_sam_vitt

# No automatic mask generator wrapper here; the tiny (ViT-T) variant is
# built directly and queried with point prompts in the forward call.
model = build_efficient_sam_vitt()
logits, iou = model(image_tensor, input_points, input_labels)
Both repositories aim to improve the efficiency of the Segment Anything Model (SAM). MobileSAM focuses on speed and mobile deployment while keeping the original SAM interface (registry, predictor, mask generator); EfficientSAM aims for a balance between efficiency and accuracy and exposes its own builder functions, so initialization and imports differ.
Segment Anything in High Quality [NeurIPS 2023]
Pros of sam-hq
- Higher quality segmentation masks with more precise boundaries
- Better performance on high-resolution images
- Improved handling of complex scenes with multiple objects
Cons of sam-hq
- Potentially slower inference time due to higher complexity
- May require more computational resources
- Limited to specific use cases where high-quality segmentation is critical
Code Comparison
sam-hq:
predictor = SamPredictor(sam)
predictor.set_image(image)
masks, _, _ = predictor.predict(
point_coords=input_point,
point_labels=input_label,
multimask_output=True,
)
EfficientSAM:
from efficient_sam.build_efficient_sam import build_efficient_sam_vitt

# EfficientSAM's tiny (ViT-T) variant; prompts go to the forward call,
# and there is no multimask_output flag as in SAM-style predictors.
model = build_efficient_sam_vitt()
logits, iou = model(image_tensor, input_points, input_labels)
The main difference is that EfficientSAM uses lightweight ViT-T/ViT-S backbones and does not expose a multimask_output parameter, which tends to make predictions faster but less detailed than sam-hq's.
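For context, multimask_output=True asks SAM-style predictors to return several candidate masks at different granularities for a single ambiguous prompt, leaving it to the caller to pick one by score.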
Effortless AI-assisted data labeling with AI support from YOLO, Segment Anything (SAM+SAM2), MobileSAM!!
Pros of Anylabeling
- More comprehensive labeling tool with support for various annotation types (bounding boxes, polygons, keypoints, etc.)
- User-friendly GUI interface for easier annotation and project management
- Supports multiple AI models and integrates with popular frameworks like YOLO and Segment Anything
Cons of Anylabeling
- Larger codebase and potentially more complex setup compared to EfficientSAM
- May have higher system requirements due to its extensive features
- Less focused on a specific task, which could impact performance in specialized use cases
Code Comparison
EfficientSAM:
from efficient_sam.build_efficient_sam import build_efficient_sam_vits

model = build_efficient_sam_vits()
logits, iou = model(image_tensor, input_points, input_labels)
Anylabeling:
self.sam_controller = SamController()
self.sam_controller.load_sam_model(model_path, model_type, device)
masks, scores, logits = self.sam_controller.predict(image, input_point, input_label)
Both repositories use similar approaches for loading and predicting with SAM models, but Anylabeling wraps the functionality in a controller class for better integration with its GUI-based system.
README
Update -- Efficient Track Anything
Our new release on Efficient Track Anything.
- Efficient Track Anything code: https://github.com/yformer/EfficientTAM
- Efficient Track Anything project (with gradio demo): https://yformer.github.io/efficient-track-anything/
- Efficient Track Anything paper: https://arxiv.org/pdf/2411.18933
Efficient Track Anything is an efficient foundation model for promptable unified image and video segmentation.
🤗 Efficient Track Anything for video segmentation
🤗 Efficient Track Anything for image segment everything
🤗 Efficient Track Anything checkpoints
EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
News
[Jan.12 2024] An ONNX version of EfficientSAM, including separate encoder and decoder exports, is available on the Hugging Face Space (thanks to @wkentaro Kentaro Wada for implementing ONNX export).
[Dec.31 2023] EfficientSAM is integrated into the annotation tool Labelme (huge thanks to the Labelme team and @wkentaro Kentaro Wada).
[Dec.11 2023] The EfficientSAM model code with checkpoints is fully available in this repository. The example script shows how to instantiate the model from a checkpoint and query it with points on an image.
[Dec.10 2023] A Grounded EfficientSAM demo is available on Grounded-Efficient-Segment-Anything (huge thanks to the IDEA-Research team and @rentainhe for supporting the grounded-efficient-sam demo under Grounded-Segment-Anything).
[Dec.6 2023] An EfficientSAM demo is available on the Hugging Face Space (huge thanks to the HF team for their support).
[Dec.5 2023] We release the TorchScript version of EfficientSAM and share a Colab notebook.
Online Demo & Examples
Online demo and examples can be found in the project page.
EfficientSAM Instance Segmentation Examples
(Image examples on the project page: point-prompt, box-prompt, segment everything, and saliency.)
Model
EfficientSAM checkpoints are available under the weights folder of this GitHub repository. Example instantiation and usage of the models can be found in EfficientSAM_example.py.
| EfficientSAM-S | EfficientSAM-Ti |
| --- | --- |
| Download | Download |
You can directly use EfficientSAM with its checkpoints:
from efficient_sam.build_efficient_sam import build_efficient_sam_vitt, build_efficient_sam_vits
efficientsam = build_efficient_sam_vitt()
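For a point-prompted prediction, a minimal sketch looks like the following, assuming the tensor shapes used in EfficientSAM_example.py (the image path and point coordinates here are placeholders):
import torch
from PIL import Image
from torchvision.transforms import ToTensor

# Load an image and define a single foreground point prompt.
image = ToTensor()(Image.open('image.jpg').convert('RGB'))[None]  # (1, 3, H, W)
points = torch.tensor([[[[200.0, 300.0]]]])  # (batch, queries, points, 2)
labels = torch.tensor([[[1]]])               # 1 = foreground

with torch.no_grad():
    logits, iou = efficientsam(image, points, labels)

# Keep the candidate mask with the highest predicted IoU.
best = iou[0, 0].argmax()
mask = (logits[0, 0, best] > 0).numpy()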
Jupyter Notebook Example
The notebook is shared here
Citing EfficientSAM
If you're using EfficientSAM in your research or applications, please cite using this BibTeX:
@article{xiong2023efficientsam,
title={EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything},
author={Yunyang Xiong and Bala Varadarajan and Lemeng Wu and Xiaoyu Xiang and Fanyi Xiao and Chenchen Zhu and Xiaoliang Dai and Dilin Wang and Fei Sun and Forrest Iandola and Raghuraman Krishnamoorthi and Vikas Chandra},
journal={arXiv:2312.00863},
year={2023}
}