JayBizzle / Crawler-Detect

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

Top Related Projects

crawler-user-agents: Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request welcome

device-detector: The Universal Device Detection library will parse any User Agent and detect the browser, operating system, device used (desktop, tablet, mobile, tv, cars, console, etc.), brand and model.

Quick Overview

Crawler-Detect is a PHP library designed to detect bots, crawlers, and spiders accessing your website. It uses a comprehensive list of user agents and known crawler patterns to identify automated visitors, helping website owners distinguish between human and non-human traffic.
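
To make the idea of pattern-based detection concrete, here is a minimal sketch in plain PHP. It is not CrawlerDetect's actual implementation: the function name `looksLikeCrawler` and the four-entry pattern list are hypothetical stand-ins for the library's much larger list.

```php
<?php
// Hypothetical, tiny pattern list; the real library ships a far larger one.
$patterns = ['Googlebot', 'bingbot', 'Sosospider', 'crawler'];

/**
 * Return the first substring of $userAgent that matches any pattern,
 * or null when nothing matches.
 */
function looksLikeCrawler(string $userAgent, array $patterns): ?string
{
    // Join the patterns into one case-insensitive alternation.
    $regex = '/(' . implode('|', $patterns) . ')/i';

    return preg_match($regex, $userAgent, $m) === 1 ? $m[0] : null;
}

// matched: "Googlebot"
var_dump(looksLikeCrawler('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $patterns));
// no match: NULL
var_dump(looksLikeCrawler('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0', $patterns));
```

Joining the patterns into a single expression means one `preg_match` call per request instead of one per pattern, which is one plausible way to keep the matching cost down.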

Pros

  • Large database of known crawler patterns and user agents
  • Regular updates to keep up with new bots and crawlers
  • Easy integration into existing PHP projects
  • Checks either the current request or a user agent string passed in explicitly

Cons

  • Limited to PHP environments
  • May require frequent updates to maintain accuracy
  • Potential for false positives with less common or custom user agents
  • Performance impact on high-traffic websites due to pattern matching

Code Examples

  1. Basic usage:
use Jaybizzle\CrawlerDetect\CrawlerDetect;

$CrawlerDetect = new CrawlerDetect;

if($CrawlerDetect->isCrawler()) {
    // Handle crawler
} else {
    // Handle human visitor
}
  2. Checking a specific user agent:
$userAgent = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
$CrawlerDetect = new CrawlerDetect;

if($CrawlerDetect->isCrawler($userAgent)) {
    echo "This user agent is a crawler";
}
  3. Getting the matched crawler name:
$CrawlerDetect = new CrawlerDetect;

if($CrawlerDetect->isCrawler()) {
    echo "Crawler detected: " . $CrawlerDetect->getMatches();
}

Getting Started

  1. Install via Composer:
composer require jaybizzle/crawler-detect
  2. Include in your PHP file:
require_once 'vendor/autoload.php';
use Jaybizzle\CrawlerDetect\CrawlerDetect;

$CrawlerDetect = new CrawlerDetect;

if($CrawlerDetect->isCrawler()) {
    // Crawler detected
} else {
    // Human visitor
}

Competitor Comparisons

crawler-user-agents

Pros of crawler-user-agents

  • Lightweight and simple JSON-based approach
  • Community-driven with frequent updates
  • Easy integration into various programming languages

Cons of crawler-user-agents

  • Limited to user agent strings only
  • Lacks advanced detection methods
  • May require additional processing for complex scenarios

Code Comparison

Crawler-Detect:

$CrawlerDetect = new Jaybizzle\CrawlerDetect\CrawlerDetect;
if($CrawlerDetect->isCrawler()) {
    // Handle crawler
}

crawler-user-agents:

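import json
import re

with open('crawler-user-agents.json') as f:
    crawlers = json.load(f)
# Each "pattern" is a regular expression, so use re.search, not substring matching.
# user_agent is assumed to hold the request's User-Agent string.
if any(re.search(crawler['pattern'], user_agent) for crawler in crawlers):
    pass  # Handle crawler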

Summary

Crawler-Detect offers a more complete solution: a ready-to-use PHP library that matches the request's user agent against its bundled regular-expression patterns, so there is no detection logic left for you to write.

crawler-user-agents, on the other hand, is a simpler, data-driven approach that focuses on maintaining an up-to-date list of crawler user agent strings. It's more flexible in terms of language integration but requires additional implementation for detection logic.

Choose Crawler-Detect for a robust, out-of-the-box PHP solution, or opt for crawler-user-agents if you need a lightweight, customizable approach across different programming languages.
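
As a rough sketch of the "additional implementation" that crawler-user-agents leaves to you, the PHP snippet below treats each `pattern` entry as a regular expression. The inline JSON is a two-entry stand-in for `crawler-user-agents.json`, and the function name `matchCrawlerPattern` is hypothetical.

```php
<?php
// Stand-in for the contents of crawler-user-agents.json (assumed shape:
// a list of objects, each with a "pattern" key holding a regex).
$json = <<<'JSON'
[
    {"pattern": "Googlebot"},
    {"pattern": "bingbot"}
]
JSON;

/**
 * Return the first pattern that matches $userAgent, or null if none does.
 */
function matchCrawlerPattern(string $userAgent, array $crawlers): ?string
{
    foreach ($crawlers as $crawler) {
        // Assumes no pattern contains the "~" delimiter used here.
        if (preg_match('~' . $crawler['pattern'] . '~', $userAgent) === 1) {
            return $crawler['pattern'];
        }
    }

    return null;
}

$crawlers = json_decode($json, true);

var_dump(matchCrawlerPattern('Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)', $crawlers));
```

In a real project the JSON would be read from the file rather than inlined, and the decoded list cached between requests.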

device-detector

Pros of device-detector

  • More comprehensive detection capabilities, including devices, operating systems, and browsers
  • Regularly updated with a larger database of user agents
  • Supports multiple programming languages through ports

Cons of device-detector

  • Larger codebase and potentially higher resource usage
  • More complex setup and integration process
  • May be overkill for simple bot detection use cases

Code Comparison

Crawler-Detect:

$CrawlerDetect = new Jaybizzle\CrawlerDetect\CrawlerDetect;
if($CrawlerDetect->isCrawler()) {
    // Handle crawler
}

device-detector:

use DeviceDetector\DeviceDetector;

$dd = new DeviceDetector($userAgent); // $userAgent: the User-Agent string to analyze
$dd->parse();
if ($dd->isBot()) {
    // Handle bot
}

Both libraries offer straightforward usage, but device-detector provides more detailed information about the detected user agent. Crawler-Detect focuses solely on identifying crawlers, while device-detector offers broader device and browser detection capabilities.

device-detector's more extensive feature set comes at the cost of increased complexity and resource usage. For projects requiring only crawler detection, Crawler-Detect may be a more lightweight and focused solution. However, for applications needing comprehensive user agent analysis, device-detector's additional capabilities make it a more versatile choice.

README



crawlerdetect.io


About CrawlerDetect

CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent and the HTTP From header (http_from). It can currently detect thousands of bots/spiders/crawlers.
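
The snippet below sketches why looking at more than the user agent can help. It is an illustration only, not the library's code: the function name `headersToInspect` is hypothetical, and reading exactly these two `$_SERVER` keys is an assumption of the sketch.

```php
<?php
/**
 * Build the string to inspect from a $_SERVER-style array.
 * Reading only these two headers is an assumption of this sketch.
 */
function headersToInspect(array $server): string
{
    $parts = [];
    foreach (['HTTP_USER_AGENT', 'HTTP_FROM'] as $key) {
        if (!empty($server[$key])) {
            $parts[] = $server[$key];
        }
    }

    return implode(' ', $parts);
}

// A request whose User-Agent looks like a browser, but whose From header
// names a bot operator.
$server = [
    'HTTP_USER_AGENT' => 'Mozilla/5.0 (X11; Linux x86_64)',
    'HTTP_FROM'       => 'googlebot(at)googlebot.com',
];

echo headersToInspect($server), PHP_EOL;
```

A pattern such as `googlebot` would match the combined string even though the user agent alone gives nothing away.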

Installation

composer require jaybizzle/crawler-detect

Usage

use Jaybizzle\CrawlerDetect\CrawlerDetect;

$CrawlerDetect = new CrawlerDetect;

// Check the user agent of the current 'visitor'
if($CrawlerDetect->isCrawler()) {
    // true if crawler user agent detected
}

// Pass a user agent as a string
if($CrawlerDetect->isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')) {
    // true if crawler user agent detected
}

// Output the name of the bot that matched (if any)
echo $CrawlerDetect->getMatches();

Contributing

If you find a bot/spider/crawler user agent that CrawlerDetect fails to detect, please submit a pull request with the regex pattern added to the $data array in Fixtures/Crawlers.php and add the failing user agent to tests/crawlers.txt.

Failing that, just create an issue with the user agent you have found, and we'll take it from there :)
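
Before opening a pull request, it may help to check a candidate pattern against the user agents it is supposed to catch. The following is a minimal sketch under stated assumptions: `ExampleBot`, the sample user agents, and the function name `unmatchedAgents` are all hypothetical, and the list is assumed to hold one user agent per line.

```php
<?php
// Hypothetical candidate regex (written without delimiters).
$candidate = 'ExampleBot';

// Stand-in for a user-agent list, assumed to hold one user agent per line.
$failingAgents = "ExampleBot/1.0 (+http://example.com/bot)\nMozilla/5.0 (compatible; ExampleBot/2.3)";

/**
 * Return the user agents that the pattern still fails to match.
 */
function unmatchedAgents(string $pattern, string $agentList): array
{
    // Split into lines, trim whitespace, and drop empty lines.
    $agents = array_filter(array_map('trim', explode("\n", $agentList)));

    return array_values(array_filter(
        $agents,
        fn (string $ua): bool => preg_match('/' . $pattern . '/i', $ua) !== 1
    ));
}

// An empty array means every listed user agent is matched.
var_dump(unmatchedAgents($candidate, $failingAgents));
```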

Laravel Package

If you would like to use this with Laravel, please see Laravel-Crawler-Detect

Symfony Bundle

To use this library with Symfony 2/3/4, check out the CrawlerDetectBundle.

Yii2 Extension

To use this library with the Yii2 framework, check out yii2-crawler-detect.

ES6 Library

To use this library with Node.js or any ES6-based application, check out es6-crawler-detect.

Python Library

To use this library in a Python project, check out crawlerdetect.

JVM Library (written in Java)

To use this library in a JVM project (including Java, Scala, Kotlin, etc.), check out CrawlerDetect.

.NET Library

To use this library in a .NET Standard (including .NET Core) based project, check out NetCrawlerDetect.

Ruby Gem

To use this library with Ruby on Rails or any Ruby-based application, check out the crawler_detect gem.

Go Module

To use this library with Go, check out the crawlerdetect module.

Parts of this class are based on the brilliant MobileDetect.
