jenssegers/imagehash

🌄 Perceptual image hashing for PHP

Top Related Projects

15,631

A very compact representation of a placeholder for an image.

A Python Perceptual Image Hashing Module

😎 Finding duplicate images made easy!

Python library for accurately querying username and email usage on online platforms

Quick Overview

jenssegers/imagehash is a PHP library for generating perceptual hashes of images. It allows for comparing images based on their content rather than byte-by-byte comparison, making it useful for finding similar or duplicate images even if they have been resized, compressed, or slightly modified.

Pros

  • Supports multiple hashing algorithms (Average, Difference, Block, and Perceptual)
  • Can compare images across different formats and sizes
  • Provides a simple API for easy integration into existing projects
  • Includes methods for calculating the Hamming distance between hashes, which can be used to find similar images

Cons

  • Limited to PHP, which may not be suitable for all projects
  • Requires GD or Imagick extension, which might not be available on all servers
  • May have performance limitations for large-scale image processing tasks
  • Documentation could be more comprehensive for advanced use cases

Code Examples

  1. Generating an image hash:
use Jenssegers\ImageHash\ImageHash;
use Jenssegers\ImageHash\Implementations\DifferenceHash;

$hasher = new ImageHash(new DifferenceHash());
$hash = $hasher->hash('path/to/image.jpg');
echo $hash;
  2. Comparing two images:
$hasher = new ImageHash(new DifferenceHash());
$hash1 = $hasher->hash('path/to/image1.jpg');
$hash2 = $hasher->hash('path/to/image2.jpg');

$distance = $hash1->distance($hash2);
echo "Distance between images: $distance";
  3. Finding similar images in a directory:
$hasher = new ImageHash(new DifferenceHash());
$targetHash = $hasher->hash('path/to/target.jpg');

$similarImages = [];
foreach (glob('path/to/images/*.jpg') as $image) {
    $hash = $hasher->hash($image);
    if ($hash->distance($targetHash) <= 5) {
        $similarImages[] = $image;
    }
}

print_r($similarImages);

Getting Started

  1. Install the library using Composer:
composer require jenssegers/imagehash
  2. Use the library in your PHP code:
use Jenssegers\ImageHash\ImageHash;
use Jenssegers\ImageHash\Implementations\AverageHash;

$hasher = new ImageHash(new AverageHash());
$hash = $hasher->hash('path/to/image.jpg');
echo $hash;

Make sure you have the GD or Imagick extension installed and enabled in your PHP environment.

Competitor Comparisons

A very compact representation of a placeholder for an image.

Pros of Blurhash

  • Generates visually appealing placeholder images
  • Compact representation, suitable for quick loading in mobile apps
  • Supports both encoding and decoding of images

Cons of Blurhash

  • Limited to creating blurred placeholders, not suitable for image comparison
  • Requires more processing power for encoding/decoding
  • Less versatile for general image hashing purposes

Code Comparison

Blurhash encoding:

import blurhash
hash = blurhash.encode("image.jpg", x_components=4, y_components=3)

ImageHash perceptual hashing (PHP):

use Jenssegers\ImageHash\ImageHash;
use Jenssegers\ImageHash\Implementations\AverageHash;

$hasher = new ImageHash(new AverageHash());
$hash = $hasher->hash('image.jpg');

Key Differences

  • Blurhash focuses on creating visually pleasing placeholders, while Imagehash is designed for image comparison and similarity detection.
  • Blurhash produces a string representation that can be decoded into a blurred image, whereas Imagehash generates a hash for numerical comparison.
  • Imagehash offers multiple hashing algorithms (average, perceptual, difference, etc.), while Blurhash uses a single algorithm for its blur effect.

Use Cases

  • Blurhash: Ideal for creating loading placeholders in mobile apps or websites.
  • Imagehash: Better suited for tasks like duplicate image detection, reverse image search, or copyright infringement detection.

Community and Maintenance

Both projects are actively maintained, with Blurhash having a more focused scope and Imagehash offering a broader range of image hashing techniques.

A Python Perceptual Image Hashing Module

Pros of imagehash (JohannesBuchner)

  • More comprehensive set of hashing algorithms, including wavelet hashing
  • Better documentation and examples
  • Actively maintained with regular updates

Cons of imagehash (JohannesBuchner)

  • Slightly more complex API
  • Requires additional dependencies (e.g., PyWavelets)

Code Comparison

imagehash (jenssegers, PHP):

use Jenssegers\ImageHash\ImageHash;
use Jenssegers\ImageHash\Implementations\AverageHash;

$hasher = new ImageHash(new AverageHash());
$hash = $hasher->hash('image.jpg');

imagehash (JohannesBuchner, Python):

from PIL import Image
import imagehash
hash = imagehash.average_hash(Image.open('image.jpg'))

The two libraries target different languages, so the calls differ in form but not in intent. jenssegers' version wraps a chosen implementation in an ImageHash object and hashes a file path, while JohannesBuchner's version exposes module-level functions that take a Pillow image.

Both libraries provide similar core functionality for image hashing, but JohannesBuchner's imagehash offers a more extensive set of algorithms. The choice between the two depends mainly on whether your project is written in PHP or Python.

😎 Finding duplicate images made easy!

Pros of imagededup

  • Offers multiple hashing algorithms (perceptual, difference, wavelet, average) as well as a CNN-based method
  • Includes image deduplication functionality out-of-the-box
  • Provides both CLI and Python API for ease of use

Cons of imagededup

  • Larger codebase, potentially more complex to understand and modify
  • Slower performance for large-scale image processing tasks
  • Requires more dependencies, which may increase setup time

Code Comparison

imagededup:

from imagededup.methods import PHash
phasher = PHash()
encodings = phasher.encode_images(image_dir='path/to/images/')
duplicates = phasher.find_duplicates(encoding_map=encodings)

imagehash (PHP):

use Jenssegers\ImageHash\ImageHash;
use Jenssegers\ImageHash\Implementations\AverageHash;

$hasher = new ImageHash(new AverageHash());
$hash = $hasher->hash('path/to/image.jpg');

Summary

imagededup offers more comprehensive functionality for image deduplication tasks, including multiple hashing algorithms and a built-in duplicate search. It may, however, be heavier than imagehash for simpler use cases. imagehash is a more lightweight choice for basic image hashing in PHP, but it leaves the duplicate search itself to the caller.

Python library for accurately querying username and email usage on online platforms

Pros of socialscan

  • Focuses on username and email availability checking across multiple social media platforms
  • Provides asynchronous functionality for faster results
  • Offers a command-line interface for easy use

Cons of socialscan

  • Limited to social media account checking, not image-related functionality
  • May require frequent updates to maintain accuracy as social media platforms change
  • Potentially less versatile for general-purpose use compared to image hashing

Code Comparison

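socialscan:

from socialscan.util import Platforms, sync_execute_queries

# Check a username and an email address on two platforms.
queries = ["username1", "email2@gmail.com"]
platforms = [Platforms.GITHUB, Platforms.LASTFM]
results = sync_execute_queries(queries, platforms)
for result in results:
    print(f"{result.query} on {result.platform}: available={result.available}")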

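imagehash (average hash, simplified Python sketch):

from PIL import Image

def average_hash(image, hash_size=8):
    # Shrink to a hash_size x hash_size grayscale image, then threshold each pixel at the mean.
    image = image.convert("L").resize((hash_size, hash_size), Image.LANCZOS)
    pixels = list(image.getdata())
    avg = sum(pixels) / len(pixels)
    return sum((1 if pixel > avg else 0) << i for i, pixel in enumerate(pixels))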

Summary

socialscan is specialized for checking username and email availability across social media platforms, offering asynchronous functionality and a CLI. imagehash, on the other hand, focuses on image hashing techniques for similarity comparison. While socialscan is more targeted towards social media account verification, imagehash provides broader utility for image-related tasks.

README

ImageHash

A perceptual hash is a fingerprint of a multimedia file derived from various features from its content. Unlike cryptographic hash functions which rely on the avalanche effect of small changes in input leading to drastic changes in the output, perceptual hashes are "close" to one another if the features are similar.

Perceptual hashes are a different concept compared to cryptographic hash functions like MD5 and SHA1. With cryptographic hashes, the hash values are random. The data used to generate the hash acts like a random seed, so the same data will generate the same result, but different data will create different results. Comparing two SHA1 hash values really only tells you two things. If the hashes are different, then the data is different. And if the hashes are the same, then the data is likely the same. In contrast, perceptual hashes can be compared -- giving you a sense of similarity between the two data sets.
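
As a rough illustration of the avalanche effect described above, the following sketch hashes two strings that differ by a single character with PHP's built-in sha1() and counts how many hex positions differ. The helper countDifferingChars is a hypothetical name introduced here for illustration, and the input strings are arbitrary.

```php
<?php
// Hypothetical helper: counts positions at which two strings differ.
function countDifferingChars(string $x, string $y): int
{
    $diff = 0;
    $len = min(strlen($x), strlen($y));
    for ($i = 0; $i < $len; $i++) {
        if ($x[$i] !== $y[$i]) {
            $diff++;
        }
    }
    return $diff;
}

// Two inputs that differ only in their last character.
$a = sha1('perceptual hashing demo 1');
$b = sha1('perceptual hashing demo 2');

echo $a, PHP_EOL, $b, PHP_EOL;
echo 'Differing hex characters: ', countDifferingChars($a, $b), ' of ', strlen($a), PHP_EOL;
```

With a cryptographic hash, the count says nothing about how close the inputs were. With a perceptual hash, a small count is exactly the signal you are looking for.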

This code was inspired/based on:

Requirements

  • PHP 8.1 or higher
  • The gd or imagick extension
  • Optionally, install the GMP extension for faster fingerprint comparisons
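
A minimal sketch for checking these requirements at runtime, using only PHP built-ins (PHP_VERSION_ID and extension_loaded). The function name checkImageHashRequirements is hypothetical and not part of the library.

```php
<?php
// Hypothetical helper (not part of the library): reports which requirements are met.
function checkImageHashRequirements(): array
{
    return [
        'php_8_1_or_higher' => PHP_VERSION_ID >= 80100,
        'gd_or_imagick'     => extension_loaded('gd') || extension_loaded('imagick'),
        'gmp_optional'      => extension_loaded('gmp'),
    ];
}

foreach (checkImageHashRequirements() as $name => $ok) {
    echo str_pad($name, 20), $ok ? 'yes' : 'no', PHP_EOL;
}
```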

Installation

This package has not reached a stable version yet, so backwards compatibility may be broken between 0.x releases. Make sure to lock your version if you intend to use this in production!

Install using composer:

composer require jenssegers/imagehash

Usage

The library comes with 4 built-in hashing implementations:

  • Jenssegers\ImageHash\Implementations\AverageHash - Hash based on the average image color
  • Jenssegers\ImageHash\Implementations\DifferenceHash - Hash based on the previous pixel
  • Jenssegers\ImageHash\Implementations\BlockHash - Hash based on blockhash.io (still under development)
  • Jenssegers\ImageHash\Implementations\PerceptualHash - The original pHash (still under development)
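
To give a feel for what the first two implementations compute, here is a simplified sketch of the average-hash and difference-hash ideas. It operates on a plain 2D array of grayscale values rather than a real image, and it is not the library's actual code: the function names, the bit order, and the comparison direction are assumptions made for illustration.

```php
<?php
/**
 * Average hash idea: one bit per pixel, set when the pixel is brighter than the mean.
 * $pixels is a 2D array of grayscale values (0-255).
 */
function averageHashBits(array $pixels): string
{
    $flat = array_merge(...$pixels);
    $mean = array_sum($flat) / count($flat);
    $bits = '';
    foreach ($flat as $value) {
        $bits .= $value > $mean ? '1' : '0';
    }
    return $bits;
}

/**
 * Difference hash idea: one bit per adjacent pair in a row, set when a
 * pixel is brighter than the previous pixel in the same row.
 */
function differenceHashBits(array $pixels): string
{
    $bits = '';
    foreach ($pixels as $row) {
        for ($x = 1; $x < count($row); $x++) {
            $bits .= $row[$x] > $row[$x - 1] ? '1' : '0';
        }
    }
    return $bits;
}

// A tiny 2x3 "image" used purely for illustration.
$image = [
    [10, 200, 30],
    [40, 50, 250],
];

echo averageHashBits($image), PHP_EOL;    // 010001
echo differenceHashBits($image), PHP_EOL; // 1011
```

In practice the image is first scaled down to a small fixed size and converted to grayscale, so that the resulting bit string has a fixed length regardless of the original dimensions.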

Choose one of these implementations. If you don't know which one to use, try the DifferenceHash implementation. Some implementations can be configured, so be sure to check the constructor.

use Jenssegers\ImageHash\ImageHash;
use Jenssegers\ImageHash\Implementations\DifferenceHash;

$hasher = new ImageHash(new DifferenceHash());
$hash = $hasher->hash('path/to/image.jpg');

echo $hash;
// or
echo $hash->toHex();

The resulting Hash object is a hexadecimal image fingerprint that can be stored in your database once calculated. The Hamming distance is used to compare two image fingerprints for similarity. Low distance values indicate that the images are similar or the same; high distance values indicate that the images are different. Use the following method to detect whether images are similar:

$distance = $hasher->distance($hash1, $hash2);
// or
$distance = $hash1->distance($hash2);

Equal images will not always have a distance of 0, so you will need to decide at which distance you will evaluate images as equal. For the image set that I tested, a max distance of 5 was acceptable, but this will depend on the implementation, the images and the number of images. For example, when comparing a small set of images, a more lenient maximum distance should be acceptable, as the chances of false positives are quite low. If, however, you are comparing a large number of images, 5 might already be too much.
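
If the fingerprints are stored as hex strings, the distance and threshold check can also be sketched without the library. The functions hexHammingDistance and looksSimilar below are hypothetical helpers written for illustration, not part of the library's API; the default threshold of 5 is simply the value mentioned above, and the sample fingerprints are the two "similar" hashes from the Demo section.

```php
<?php
/**
 * Hamming distance between two equal-length hexadecimal fingerprints.
 * Hypothetical helper, not part of the library.
 */
function hexHammingDistance(string $hexA, string $hexB): int
{
    if (strlen($hexA) !== strlen($hexB)) {
        throw new InvalidArgumentException('Fingerprints must have the same length.');
    }
    $distance = 0;
    for ($i = 0; $i < strlen($hexA); $i++) {
        // XOR the two hex digits and count the set bits in the result.
        $xor = hexdec($hexA[$i]) ^ hexdec($hexB[$i]);
        $distance += substr_count(decbin($xor), '1');
    }
    return $distance;
}

// Treat two fingerprints as similar when their distance is within the threshold.
function looksSimilar(string $hexA, string $hexB, int $maxDistance = 5): bool
{
    return hexHammingDistance($hexA, $hexB) <= $maxDistance;
}

$distance = hexHammingDistance('3c3e0e1a3a1e1e1e', '3c3e0e3e3e1e1e1e');
echo $distance, PHP_EOL; // 3
var_dump(looksSimilar('3c3e0e1a3a1e1e1e', '3c3e0e3e3e1e1e1e')); // bool(true)
```

Keeping the threshold as a parameter makes it easy to tighten it as the number of compared images grows.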

The Hash object can return the internal binary hash in a couple of different formats:

echo $hash->toHex(); // 7878787c7c707c3c
echo $hash->toBits(); // 0111100001111000011110000111110001111100011100000111110000111100
echo $hash->toInt(); // 8680820757815655484
echo $hash->toBytes(); // "\x0F\x07ƒƒ\x03\x0F\x07\x00"

Choose your preferred format for storing hashes in your database. If you want to reconstruct a Hash object from a previously calculated value, use:

use Jenssegers\ImageHash\Hash;

$hash = Hash::fromHex('7878787c7c707c3c');
$hash = Hash::fromBin('0111100001111000011110000111110001111100011100000111110000111100');
$hash = Hash::fromInt('8680820757815655484');
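
The hex and bit-string formats carry the same information, which the following standalone sketch makes explicit. The helpers hexToBits and bitsToHex are hypothetical names used for illustration and do not rely on the library.

```php
<?php
// Expand each hex digit into its 4-bit binary form.
function hexToBits(string $hex): string
{
    $bits = '';
    foreach (str_split($hex) as $digit) {
        $bits .= str_pad(decbin(hexdec($digit)), 4, '0', STR_PAD_LEFT);
    }
    return $bits;
}

// Collapse each group of 4 bits back into one hex digit.
function bitsToHex(string $bits): string
{
    $hex = '';
    foreach (str_split($bits, 4) as $nibble) {
        $hex .= dechex(bindec($nibble));
    }
    return $hex;
}

$bits = hexToBits('7878787c7c707c3c');
echo $bits, PHP_EOL;            // 0111100001111000011110000111110001111100011100000111110000111100
echo bitsToHex($bits), PHP_EOL; // 7878787c7c707c3c
```

Hex is four times shorter than the bit string, which is why it is usually the more convenient format for a database column.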

Demo

These images are similar:

Image 1 hash: 3c3e0e1a3a1e1e1e (0011110000111110000011100001101000111010000111100001111000011110)
Image 2 hash: 3c3e0e3e3e1e1e1e (0011110000111110000011100011111000111110000111100001111000011110)
Hamming distance: 3

These images are different:

Image 1 hash: 69684858535b7575 (0010100010101000101010001010100010101011001010110101011100110111)
Image 2 hash: e1e1e2a7bbaf6faf (0111000011110000111100101101001101011011011101010011010101001111)
Hamming distance: 32

Security contact information

To report a security vulnerability, follow these steps.