s0md3v/Photon

Incredibly fast crawler designed for OSINT.

Top Related Projects

  • theHarvester: E-mails, subdomains and names Harvester - OSINT
  • SpiderFoot: SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
  • recon-ng: Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.
  • Sublist3r: Fast subdomains enumeration tool for penetration testers
  • gau: Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl.
  • assetfinder: Find domains and subdomains related to a given domain

Quick Overview

Photon is an open-source intelligence (OSINT) automation tool designed for fast and comprehensive web reconnaissance. It crawls websites to gather information such as URLs, emails, social media accounts, and more, making it a valuable asset for security researchers and penetration testers.

Pros

  • Fast and efficient crawling with multi-threading support
  • Extensive data extraction capabilities, including URLs, emails, social media accounts, and more
  • Customizable output formats (JSON, CSV, TXT) for easy integration with other tools
  • Active development and community support

Cons

  • May trigger website security measures if not used carefully
  • Requires Python knowledge for advanced customization
  • Limited documentation for some advanced features
  • Potential for misuse if not used responsibly

Code Examples

  1. Basic usage to crawl a website:
from photon import Photon

url = "https://example.com"
photon = Photon(url)
photon.crawl()
  2. Extracting specific data types:
from photon import Photon

url = "https://example.com"
photon = Photon(url)
photon.crawl(extract=['urls', 'emails', 'social'])
  3. Customizing output format:
from photon import Photon

url = "https://example.com"
photon = Photon(url, output_file="results.json")
photon.crawl(output_format="json")

Getting Started

To get started with Photon, follow these steps:

  1. Install Photon:
pip install photon-crawler
  2. Import and use Photon in your Python script:
from photon import Photon

url = "https://example.com"
photon = Photon(url)
photon.crawl()
  3. Run your script to start crawling and gathering information.

For more advanced usage and configuration options, refer to the official documentation on the GitHub repository.

Competitor Comparisons

E-mails, subdomains and names Harvester - OSINT

Pros of theHarvester

  • Broader scope of information gathering, including email addresses, subdomains, and more
  • Supports multiple search engines and data sources
  • Actively maintained with regular updates

Cons of theHarvester

  • Slower execution compared to Photon
  • Less focused on web crawling and content extraction
  • May require additional dependencies for full functionality

Code Comparison

Photon:

def extract_links(self, soup, name):
    links = soup.find_all('a')
    for link in links:
        href = link.get('href')
        if href:
            self.links.add(href)

theHarvester:

def get_emails(self):
    rawres = myparser.Parser(self.totalresults, self.word)
    return rawres.emails()

The code snippets show that Photon focuses on extracting links from web pages, while theHarvester emphasizes parsing and extracting specific information like email addresses from search results.

Both tools serve different purposes within the realm of information gathering and reconnaissance. Photon excels at web crawling and content extraction, while theHarvester offers a broader range of information gathering capabilities across multiple sources.

SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.

Pros of SpiderFoot

  • More comprehensive OSINT tool with a wider range of modules and data sources
  • Provides a web-based GUI for easier interaction and visualization of results
  • Supports automation and integration with other tools through its API

Cons of SpiderFoot

  • Steeper learning curve due to its extensive features and configuration options
  • Requires more system resources and setup time compared to Photon

Code Comparison

Photon (Python):

def photon(url, level, threadCount):
    processed = set()
    storage = set()
    forms = set()
    processed.add(url)
    # ... (rest of the function)

SpiderFoot (Python):

class SpiderFootPlugin(object):
    def __init__(self, options):
        self.sf = SpiderFoot(options)
        self.results = dict()
        self.errorState = False
    # ... (rest of the class)

Both projects are written in Python, but SpiderFoot is organized around a plugin system that makes it easier to extend, whereas Photon's code is more straightforward and focused on specific crawling and information-gathering tasks.

Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.

Pros of recon-ng

  • More comprehensive and modular framework for reconnaissance
  • Supports a wide range of modules for various recon tasks
  • Integrates with multiple external APIs and services

Cons of recon-ng

  • Steeper learning curve due to its complexity
  • Requires more setup and configuration
  • May be overkill for simpler web reconnaissance tasks

Code Comparison

Photon example:

from photon import Photon

photon = Photon(url='https://example.com')
photon.crawl()

recon-ng example:

from recon.core.recon import Recon

recon = Recon()
recon.do_load('recon/domains-hosts/google_site_web')
recon.do_run()

Summary

Photon is a lightweight, easy-to-use web crawler and information gathering tool, while recon-ng is a more comprehensive reconnaissance framework. Photon is better suited for quick web scraping and basic information gathering, whereas recon-ng offers a broader range of capabilities for in-depth reconnaissance tasks. The choice between the two depends on the specific requirements of the project and the user's expertise level.

Fast subdomains enumeration tool for penetration testers

Pros of Sublist3r

  • Specialized in subdomain enumeration, providing more focused results
  • Utilizes multiple search engines and sources for comprehensive subdomain discovery
  • Supports multithreading for faster scanning

Cons of Sublist3r

  • Limited to subdomain enumeration, lacking broader web crawling capabilities
  • Less actively maintained, with fewer recent updates compared to Photon
  • Doesn't offer features like data extraction or JavaScript analysis

Code Comparison

Sublist3r (subdomain enumeration):

def main(domain, threads, savefile, ports, silent, verbose, enable_bruteforce, engines):
    bruteforce_list = []
    subdomains = []
    search_list = []
    
    # ... (subdomain enumeration logic)

Photon (web crawling and information gathering):

def photon(seedUrl, headers, depth, threadCount, timeout, delay, cookie):
    requests.packages.urllib3.disable_warnings()
    dataset = set()
    processed = set()
    
    # ... (web crawling and data extraction logic)

The code snippets highlight the different focus areas of each tool. Sublist3r concentrates on subdomain enumeration, while Photon offers broader web crawling and information gathering capabilities.

Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl.

Pros of gau

  • Faster execution due to its focus on URL discovery
  • Supports multiple input formats (stdin, file, URL)
  • Can output results in JSON format for easier parsing

Cons of gau

  • Limited functionality compared to Photon's broader feature set
  • Lacks built-in crawling capabilities
  • Does not perform content analysis or extraction

Code Comparison

Photon:

photon = Photon(url, options)
photon.crawl()
photon.extract_info()
photon.store_results()

gau:

urls := gau.GetURLs(domain)
for url := range urls {
    fmt.Println(url)
}

Summary

Photon is a more comprehensive web reconnaissance tool that offers crawling, content analysis, and information extraction. It's suitable for in-depth analysis of a target website.

gau focuses specifically on URL discovery, leveraging various sources to find URLs associated with a domain. It's faster and more specialized but lacks the broader feature set of Photon.

Choose Photon for thorough website analysis and information gathering, or gau for rapid URL discovery and enumeration. The selection depends on the specific requirements of your project or security assessment.

Find domains and subdomains related to a given domain

Pros of assetfinder

  • Lightweight and fast, focusing solely on subdomain discovery
  • Written in Go, making it easy to compile and distribute as a single binary
  • Utilizes multiple data sources for comprehensive subdomain enumeration

Cons of assetfinder

  • Limited functionality compared to Photon's broader web reconnaissance capabilities
  • Lacks the ability to extract additional information like emails, social media accounts, etc.
  • Does not perform crawling or content analysis

Code Comparison

assetfinder:

func main() {
    domain := flag.String("domain", "", "The domain to find assets for")
    flag.Parse()
    for result := range assetfinder.Run(*domain) {
        fmt.Println(result)
    }
}

Photon:

def main():
    args = parser.parse_args()
    target = args.url
    crawl(target, args)
    if args.dns:
        dnsdumpster(target)
    if args.export:
        exporter(args.export)

Summary

assetfinder is a focused tool for subdomain discovery, offering speed and simplicity. It's ideal for quick reconnaissance but lacks the comprehensive features of Photon. Photon, on the other hand, provides a more extensive set of web reconnaissance capabilities, including crawling, content analysis, and information extraction. The choice between the two depends on the specific needs of the user and the depth of information required for the task at hand.

README


Photon

Incredibly fast crawler designed for OSINT.

Photon Wiki • How To Use • Compatibility • Photon Library • Contribution • Roadmap

Key Features

Data Extraction

Photon can extract the following data while crawling:

  • URLs (in-scope & out-of-scope)
  • URLs with parameters (example.com/gallery.php?id=2)
  • Intel (emails, social media accounts, amazon buckets etc.)
  • Files (pdf, png, xml etc.)
  • Secret keys (auth/API keys & hashes)
  • JavaScript files & Endpoints present in them
  • Strings matching custom regex pattern
  • Subdomains & DNS related data

The extracted information is saved in an organized manner and can also be exported as JSON.
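
As a rough illustration of the categories listed above, the sketch below sorts the links and emails found in a single HTML page into buckets, using only the Python standard library. It is not Photon's implementation: the function name `extract_intel`, the bucket names, the regular expressions and the file-extension list are all assumptions made for this example.

```python
import json
import re
from urllib.parse import urljoin, urlparse

# Deliberately simple patterns for illustration; real pages need more care.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')
FILE_EXTENSIONS = ('.pdf', '.png', '.xml')  # assumed subset of file types


def extract_intel(base_url, html):
    """Classify the links and emails found in one page of HTML."""
    host = urlparse(base_url).netloc
    result = {
        'internal': set(),   # in-scope URLs
        'external': set(),   # out-of-scope URLs
        'fuzzable': set(),   # in-scope URLs that carry query parameters
        'files': set(),      # in-scope links to documents and images
        'emails': set(EMAIL_RE.findall(html)),
    }
    for href in HREF_RE.findall(html):
        url = urljoin(base_url, href)  # resolve relative links
        parsed = urlparse(url)
        if parsed.scheme not in ('http', 'https'):
            continue  # skip mailto:, javascript: and similar schemes
        if parsed.netloc != host:
            result['external'].add(url)
        elif parsed.path.lower().endswith(FILE_EXTENSIONS):
            result['files'].add(url)
        elif parsed.query:
            result['fuzzable'].add(url)
        else:
            result['internal'].add(url)
    return result


page = '''
<a href="/gallery.php?id=2">Gallery</a>
<a href="/docs/report.pdf">Report</a>
<a href="/about">About</a>
<a href="https://other.example.org/">Elsewhere</a>
<a href="mailto:admin@example.com">Contact</a>
'''
intel = extract_intel('https://example.com/', page)
# Sets are not JSON serializable, so convert them to sorted lists for export.
print(json.dumps({key: sorted(value) for key, value in intel.items()}, indent=2))
```

Keeping each category in its own set removes duplicates for free and makes the final JSON export a one-line conversion.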

Flexible

Control timeout, delay, add seeds, exclude URLs matching a regex pattern and other cool stuff. The extensive range of options provided by Photon lets you crawl the web exactly the way you want.
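
To make these knobs concrete, here is a minimal sketch of how a crawler might apply a timeout, a delay, extra seeds and an exclusion pattern. The `CrawlOptions` class, `build_queue`, `crawl` and the default values are hypothetical names and numbers chosen for this example; they are not taken from Photon's code.

```python
import re
import time
from dataclasses import dataclass, field


@dataclass
class CrawlOptions:
    """Hypothetical container for the options described above."""
    timeout: float = 6.0                        # seconds to wait per request
    delay: float = 0.0                          # pause between requests
    seeds: list = field(default_factory=list)   # additional start URLs
    exclude: str = ''                           # regex; matching URLs are skipped


def build_queue(root_url, options):
    """Return the initial queue: root plus seeds, minus excluded URLs."""
    pattern = re.compile(options.exclude) if options.exclude else None
    queue = []
    for url in [root_url, *options.seeds]:
        if pattern and pattern.search(url):
            continue  # URL matches the exclusion pattern
        if url not in queue:
            queue.append(url)
    return queue


def crawl(queue, options, fetch):
    """Visit each queued URL with the given fetch callable, honouring the delay."""
    pages = {}
    for url in queue:
        pages[url] = fetch(url, timeout=options.timeout)
        time.sleep(options.delay)
    return pages


options = CrawlOptions(
    seeds=['https://example.com/blog', 'https://example.com/logout'],
    exclude=r'/logout',
)
queue = build_queue('https://example.com', options)
# A stand-in fetcher keeps the example offline; swap in a real HTTP client to crawl.
pages = crawl(queue, options, fetch=lambda url, timeout: f'<html>{url}</html>')
print(queue)
```

Passing the fetcher in as a callable keeps the filtering logic testable without touching the network.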

Genius

Photon's smart thread management & refined logic give you top-notch performance.

Still, crawling can be resource intensive, but Photon has some tricks up its sleeve. You can fetch URLs archived by archive.org and use them as seeds with the --wayback option.
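
The idea of seeding a crawl from archive.org can be sketched as follows. This example assumes the Wayback Machine's public CDX endpoint and its `url`, `output`, `fl`, `collapse` and `limit` parameters; Photon's own wayback plugin may query archive.org differently, and the helper names here are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine CDX search API.
CDX_ENDPOINT = 'https://web.archive.org/cdx/search/cdx'


def cdx_query_url(domain, limit=50):
    """Build a CDX query asking for the original URLs archived under a domain."""
    params = {
        'url': f'{domain}/*',   # everything below the domain
        'output': 'json',       # rows as JSON arrays, header row first
        'fl': 'original',       # only return the original URL field
        'collapse': 'urlkey',   # one row per distinct URL
        'limit': limit,
    }
    return f'{CDX_ENDPOINT}?{urlencode(params)}'


def parse_cdx_rows(body):
    """Turn a CDX JSON response into a sorted list of unique seed URLs."""
    rows = json.loads(body) if body.strip() else []
    return sorted({row[0] for row in rows[1:]})  # rows[0] is the header


def wayback_seeds(domain, limit=50, timeout=10):
    """Fetch archived URLs for a domain (performs a network request)."""
    with urlopen(cdx_query_url(domain, limit), timeout=timeout) as response:
        return parse_cdx_rows(response.read().decode('utf-8'))


# Offline demonstration with a canned response instead of a live request.
sample = ('[["original"], ["http://example.com/"], '
          '["http://example.com/about"], ["http://example.com/"]]')
print(parse_cdx_rows(sample))
```

The returned list can be appended to the crawler's seed URLs, which saves the crawler from having to discover those pages itself.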

Plugins

Docker

Photon can be launched using a lightweight Python-Alpine (103 MB) Docker image.

$ git clone https://github.com/s0md3v/Photon.git
$ cd Photon
$ docker build -t photon .
$ docker run -it --name photon photon:latest -u google.com

To view the results, either head over to the local Docker volume, which you can find by running docker inspect photon, or mount the target loot folder:

$ docker run -it --name photon -v "$PWD:/Photon/google.com" photon:latest -u google.com

Frequent & Seamless Updates

Photon is under heavy development, and updates that fix bugs, optimize performance & add new features are rolled out regularly.

If you would like to see the features and issues that are being worked on, you can do that on the Development project board.

Updates can be checked for & installed with the --update option. Photon has seamless update capabilities, which means you can update Photon without losing any of your saved data.

Contribution & License

You can contribute in the following ways:

  • Report bugs
  • Develop plugins
  • Add more "APIs" for ninja mode
  • Give suggestions to make it better
  • Fix issues & submit a pull request

Please read the guidelines before submitting a pull request or issue.

Do you want to have a conversation in private? Hit me up on Twitter, my inbox is open :)

Photon is licensed under the GPL v3.0 license.