laramies / theHarvester

E-mails, subdomains and names Harvester - OSINT

11,764
2,055
11,764
28

Top Related Projects

11,191

Photon: Incredibly fast crawler designed for OSINT.

Sublist3r: Fast subdomains enumeration tool for penetration testers

recon-ng: Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.

SpiderFoot: SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.

59,570

Sherlock: Hunt down social media accounts by username across social networks

10,098

Subfinder: Fast passive subdomain enumeration tool.

Quick Overview

theHarvester is an open-source tool designed for gathering open source intelligence (OSINT) during the early stages of a penetration test or red team engagement. It collects emails, names, subdomains, IPs, and URLs using multiple public data sources.

Pros

  • Comprehensive data collection from various sources
  • Easy-to-use command-line interface
  • Actively maintained and regularly updated
  • Supports both passive and active information gathering techniques

Cons

  • Some data sources require API keys or subscriptions
  • Results may vary depending on the target and available public information
  • Can be noisy and potentially detectable by target organizations
  • May require additional tools for result analysis and visualization

Getting Started

  1. Install theHarvester:
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
python3 -m pip install -r requirements/base.txt
  2. Basic usage:
python3 theHarvester.py -d example.com -b all

This command searches for information related to "example.com" using all available data sources.

  3. Specify data sources and limit results:
python3 theHarvester.py -d example.com -b google,bing,dnsdumpster -l 100

This command uses Google, Bing, and DNSDumpster as sources and caps the number of search results at 100. The set of supported sources changes between releases, so check the tool's help output (-h) for the names your version accepts.

  4. Save results to files:
python3 theHarvester.py -d example.com -b all -f output_file

This command saves the results to files with the prefix "output_file" in several formats (HTML, XML, and JSON, depending on the version).
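
The invocations above differ only in their flags, so they can be assembled programmatically when scripting repeated runs. The following is a minimal sketch that uses only the -d, -b, -l, and -f flags shown in this section; the function name `build_harvester_command` and its parameters are hypothetical helpers, not part of theHarvester.

```python
from typing import List, Optional


def build_harvester_command(
    domain: str,
    sources: Optional[List[str]] = None,
    limit: Optional[int] = None,
    output_prefix: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list from the -d, -b, -l and -f flags shown above."""
    # With no explicit sources, fall back to "all" as in the basic-usage example.
    source_arg = ",".join(sources) if sources else "all"
    command = ["python3", "theHarvester.py", "-d", domain, "-b", source_arg]
    if limit is not None:
        command += ["-l", str(limit)]
    if output_prefix:
        command += ["-f", output_prefix]
    return command


if __name__ == "__main__":
    cmd = build_harvester_command("example.com", ["bing", "dnsdumpster"], limit=100)
    print(" ".join(cmd))
    # To actually run it from the cloned repository directory:
    # import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess.run` without shell quoting.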

Competitor Comparisons

11,191

Incredibly fast crawler designed for OSINT.

Pros of Photon

  • More versatile, capable of crawling websites and extracting various types of data
  • Faster execution due to multi-threading and asynchronous requests
  • User-friendly command-line interface with customizable options

Cons of Photon

  • Limited to web-based information gathering
  • May require more setup and dependencies
  • Less focused on specific OSINT tasks compared to theHarvester

Code Comparison

Photon:

def photon(url, level, threadCount, delay, timeout, headers):
    # Simplified outline: initialization and crawling logic
    urls = [url]  # start from the seed URL
    for current in urls:
        # Extract information from each URL
        ...

theHarvester:

def start(self):
    # Simplified outline: initialization
    for source in self.sources:
        # Gather information from each source
        ...

Summary

Photon is a versatile web crawler and information gathering tool, while theHarvester focuses on collecting email addresses, subdomains, and other specific OSINT data. Photon offers faster execution and more customization options but may require additional setup. theHarvester provides a more targeted approach to OSINT tasks and is easier to use out of the box. The choice between the two depends on the specific requirements of the information gathering task at hand.

Fast subdomains enumeration tool for penetration testers

Pros of Sublist3r

  • Faster subdomain enumeration due to multi-threading
  • Supports more search engines and sources for subdomain discovery
  • Provides a clean, easy-to-read output format

Cons of Sublist3r

  • Less actively maintained compared to theHarvester
  • Fewer overall features and data sources for information gathering
  • Limited to subdomain enumeration, while theHarvester offers broader OSINT capabilities

Code Comparison

Sublist3r:

def main(domain, threads, savefile, ports, silent, verbose, enable_bruteforce, engines):
    bruteforce_list = []
    subdomains = []
    search_list = []
    
    # Rest of the code...

theHarvester:

async def start(self):
    self.domain = self.domain.strip()
    self.emails = []
    self.hosts = []
    self.results = []
    
    # Rest of the code...

Both projects use Python and have similar main function structures. However, Sublist3r focuses on subdomain enumeration with multi-threading, while theHarvester has a broader scope for information gathering.

Sublist3r is more specialized for subdomain discovery, making it potentially more efficient for that specific task. theHarvester, on the other hand, offers a wider range of OSINT capabilities, making it more versatile for general reconnaissance.
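
The two snippets above hint at different concurrency styles: threads in Sublist3r and `async def` in theHarvester. As a rough illustration of the asynchronous style, the sketch below queries several sources at once and merges the results. The source functions (`source_a`, `source_b`, `failing_source`) are hypothetical stubs that simulate network calls; they are not taken from either project.

```python
import asyncio
from typing import Awaitable, Callable, List, Set

Source = Callable[[str], Awaitable[List[str]]]


# Hypothetical stand-ins for real data sources.
async def source_a(domain: str) -> List[str]:
    await asyncio.sleep(0.01)  # simulates network latency
    return [f"www.{domain}", f"mail.{domain}"]


async def source_b(domain: str) -> List[str]:
    await asyncio.sleep(0.01)
    return [f"mail.{domain}", f"vpn.{domain}"]


async def failing_source(domain: str) -> List[str]:
    raise RuntimeError("source unavailable")


async def enumerate_hosts(domain: str, sources: List[Source]) -> List[str]:
    """Query every source concurrently and return a sorted, de-duplicated list."""
    # return_exceptions=True keeps one failing source from cancelling the rest.
    results = await asyncio.gather(
        *(source(domain) for source in sources), return_exceptions=True
    )
    hosts: Set[str] = set()
    for result in results:
        if isinstance(result, Exception):
            continue  # skip sources that failed
        hosts.update(result)
    return sorted(hosts)


if __name__ == "__main__":
    found = asyncio.run(
        enumerate_hosts("example.com", [source_a, source_b, failing_source])
    )
    print(found)
```

Because the sources wait on the network most of the time, running them concurrently takes roughly as long as the slowest source instead of the sum of all of them.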

Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.

Pros of recon-ng

  • More comprehensive and modular framework for reconnaissance
  • Supports a wider range of data sources and modules
  • Offers a command-line interface with interactive shell capabilities

Cons of recon-ng

  • Steeper learning curve due to its more complex structure
  • Requires more setup and configuration compared to theHarvester

Code Comparison

theHarvester:

from theHarvester.discovery import *
from theHarvester.discovery.constants import *
search = googlesearch.search_google(word, limit, start)

recon-ng:

from recon.core.module import BaseModule
class Module(BaseModule):
    def module_run(self):
        self.query('SELECT * FROM domains WHERE domain LIKE ?', ('%{}%'.format(self.options['domain']),))

Both tools are written in Python, but recon-ng has a more structured approach with modules and a core framework. theHarvester focuses on simpler, direct searches using various discovery methods. recon-ng offers a more extensible and customizable platform for reconnaissance tasks, while theHarvester provides a straightforward tool for gathering open-source intelligence.
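
To make the "modules plus core framework" idea concrete, here is a minimal sketch of a module registry in plain Python. It only illustrates the general pattern; `BaseModule`, `REGISTRY`, `register`, and `StaticHosts` are hypothetical names and do not reproduce recon-ng's actual classes or API.

```python
from typing import Dict, List, Type


class BaseModule:
    """Hypothetical base class that every module extends."""

    name = "base"

    def run(self, domain: str) -> List[str]:
        raise NotImplementedError


# The "core" keeps a registry so modules can be looked up by name.
REGISTRY: Dict[str, Type[BaseModule]] = {}


def register(cls: Type[BaseModule]) -> Type[BaseModule]:
    """Class decorator that adds a module to the registry."""
    REGISTRY[cls.name] = cls
    return cls


@register
class StaticHosts(BaseModule):
    """Toy module that returns a fixed guess instead of querying anything."""

    name = "static_hosts"

    def run(self, domain: str) -> List[str]:
        return [f"www.{domain}"]


def run_module(name: str, domain: str) -> List[str]:
    """Instantiate the named module and run it against a domain."""
    return REGISTRY[name]().run(domain)


if __name__ == "__main__":
    print(run_module("static_hosts", "example.com"))
```

The trade-off matches the comparison above: a registry makes new capabilities easy to add, at the cost of more structure to learn than a single script with direct search calls.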

SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.

Pros of SpiderFoot

  • More comprehensive and feature-rich OSINT platform
  • User-friendly web interface for easier operation
  • Supports a wider range of data sources and modules

Cons of SpiderFoot

  • Steeper learning curve due to its complexity
  • Requires more system resources to run effectively

Code Comparison

SpiderFoot:

class SpiderFootPlugin(object):
    def __init__(self, options):
        self._opts = options

    def setup(self):
        pass

    def enrichTarget(self, target):
        pass

theHarvester:

class Plugin:
    def __init__(self, word):
        self.word = word
        self.results = []
        self.totalresults = []

    def do_search(self):
        pass

Key Differences

  • SpiderFoot offers a more modular and extensible architecture
  • theHarvester is more focused on email and domain harvesting
  • SpiderFoot provides a broader range of OSINT capabilities
  • theHarvester is generally easier to use for beginners
  • SpiderFoot has a more active development community

Both tools are valuable for OSINT, but SpiderFoot is more suitable for comprehensive investigations, while theHarvester excels in quick email and domain reconnaissance.

59,570

Hunt down social media accounts by username across social networks

Pros of Sherlock

  • Focuses specifically on finding usernames across multiple social networks and websites
  • Supports a larger number of sites (350+) compared to theHarvester
  • Provides a more user-friendly command-line interface with colorful output

Cons of Sherlock

  • Limited to username searches, while theHarvester offers broader information gathering capabilities
  • May produce more false positives due to its wide-ranging search across numerous platforms
  • Lacks some of the advanced features found in theHarvester, such as DNS brute forcing and Shodan search

Code Comparison

Sherlock:

def sherlock(username, site_data, timeout=60):
    results = {}
    for social_network, net_info in site_data.items():
        results[social_network] = {"url_main": net_info.get("urlMain")}
        url = net_info["url"].format(username)
        results[social_network]["url"] = url
        results[social_network]["exists"] = "yes"

theHarvester:

import aiohttp

async def search(self, domain: str) -> None:
    self.domain = domain
    url = f'https://api.github.com/search/code?q="{domain}"'
    async with aiohttp.ClientSession(headers=self.headers) as session:
        async with session.get(url) as resp:
            self.results = await resp.json()

Both tools are useful for OSINT purposes, but Sherlock is more specialized for username searches across social media platforms, while theHarvester offers a broader range of information gathering capabilities for domains and organizations.
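
The username-centred approach in the Sherlock snippet comes down to filling a per-site URL template with the username. The sketch below shows just that step. The site table `SITE_DATA` and the function `candidate_urls` are hypothetical, modelled on the `urlMain` and `url` keys in the snippet above, and no HTTP requests are made.

```python
from typing import Dict

# Hypothetical site table in the same shape as the snippet's site_data.
SITE_DATA: Dict[str, Dict[str, str]] = {
    "ExampleSocial": {
        "urlMain": "https://social.example/",
        "url": "https://social.example/{}",
    },
    "ExampleForum": {
        "urlMain": "https://forum.example/",
        "url": "https://forum.example/u/{}",
    },
}


def candidate_urls(username: str, site_data: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Build the profile URL to probe on each site for the given username."""
    return {site: info["url"].format(username) for site, info in site_data.items()}


if __name__ == "__main__":
    for site, url in candidate_urls("alice", SITE_DATA).items():
        print(f"{site}: {url}")
```

Deciding whether an account actually exists still requires requesting each URL and inspecting the response, which is where false positives can arise.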

10,098

Fast passive subdomain enumeration tool.

Pros of Subfinder

  • Faster subdomain enumeration with concurrent processing
  • More extensive list of supported sources for subdomain discovery
  • Better integration with other tools in the ProjectDiscovery ecosystem

Cons of Subfinder

  • More focused on subdomain enumeration, less versatile for general OSINT
  • Steeper learning curve for advanced features and configuration
  • May require additional tools for comprehensive information gathering

Code Comparison

theHarvester:

from theHarvester.discovery import *
from theHarvester.discovery.constants import *
search = googlesearch.search_google(word, limit, start)

Subfinder:

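package main

import (
    "github.com/projectdiscovery/subfinder/v2/pkg/runner"
)

func main() {
    // Illustrative options only; in Go a := statement must sit inside a function
    options := &runner.Options{
        Threads: 10,
        Timeout: 30,
        Sources: []string{"alienvault", "bufferover", "crtsh"},
    }
    _ = options // placeholder so the unused variable does not stop compilation
}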

theHarvester is written in Python and offers a more modular approach for various search engines and data sources. Subfinder, written in Go, focuses on efficient subdomain enumeration with concurrent processing.

Both tools are valuable for reconnaissance, but Subfinder excels in rapid subdomain discovery, while theHarvester provides a broader range of OSINT capabilities. The choice between them depends on the specific requirements of your information gathering tasks.

README

theHarvester

What is this?

theHarvester is a simple to use, yet powerful tool designed to be used during the reconnaissance stage of a red
team assessment or penetration test. It performs open source intelligence (OSINT) gathering to help determine
a domain's external threat landscape. The tool gathers names, emails, IPs, subdomains, and URLs by using
multiple public resources that include:

Passive modules:

Active modules:

  • DNS brute force: dictionary brute force enumeration
  • Screenshots: Take screenshots of subdomains that were found

Modules that require an API key:

Documentation on setting up API keys can be found at https://github.com/laramies/theHarvester/wiki/Installation#api-keys

  • bevigil - Free up to 50 queries. Pricing can be found here: https://bevigil.com/pricing/osint
  • binaryedge - $10/month
  • bing
  • bufferoverun - uses the free API
  • censys - API keys are required and can be retrieved from your Censys account.
  • criminalip
  • fullhunt
  • github
  • hunter - limited to 10 results on the free plan, so you will need to pass the -l 10 switch
  • hunterhow
  • intelx
  • netlas - $
  • onyphe - $
  • pentestTools - $
  • projectDiscovery - invite only for now
  • rocketreach - $
  • securityTrails
  • shodan - $
  • tomba - Free up to 50 searches.
  • zoomeye
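
Since the modules above only work once a key is configured, it can help to check a planned source list before starting a run. The sketch below assumes the keys have already been loaded into a plain dict that maps module name to key. That layout, the `KEYED_MODULES` subset, and the function `split_by_key_availability` are all assumptions made for illustration; the actual configuration format is described in the wiki page linked above.

```python
from typing import Dict, Iterable, List, Tuple

# A subset of the key-requiring modules listed above, for illustration.
KEYED_MODULES = {"bevigil", "binaryedge", "censys", "hunter", "shodan", "tomba"}


def split_by_key_availability(
    requested: Iterable[str], api_keys: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split requested sources into (usable, missing_key) lists, keeping order."""
    usable: List[str] = []
    missing: List[str] = []
    for source in requested:
        needs_key = source in KEYED_MODULES
        # An absent or empty key counts as missing.
        if needs_key and not api_keys.get(source):
            missing.append(source)
        else:
            usable.append(source)
    return usable, missing


if __name__ == "__main__":
    # dnsdumpster is treated as keyless here only because it is not in KEYED_MODULES.
    usable, missing = split_by_key_availability(
        ["dnsdumpster", "hunter", "shodan"], {"hunter": "abc123", "shodan": ""}
    )
    print("usable:", usable)
    print("missing keys:", missing)
```

The usable list can then be joined with commas and passed to the -b option.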

Install and dependencies:

Comments, bugs, and requests:

Main contributors:

  • Matthew Brown @NotoriousRebel1
  • Jay "L1ghtn1ng" Townsend @jay_townsend1
  • Lee Baird @discoverscripts

Thanks:

  • John Matherly - Shodan project
  • Ahmed Aboul Ela - subdomain names dictionaries (big and small)