Top Related Projects
- Photon: Incredibly fast crawler designed for OSINT.
- Sublist3r: Fast subdomains enumeration tool for penetration testers
- recon-ng: Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.
- SpiderFoot: SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
- Sherlock: Hunt down social media accounts by username across social networks
- Subfinder: Fast passive subdomain enumeration tool.
Quick Overview
theHarvester is an open-source tool designed for gathering open source intelligence (OSINT) during the early stages of a penetration test or red team engagement. It collects emails, names, subdomains, IPs, and URLs using multiple public data sources.
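To make the kind of data described above concrete, here is a minimal sketch of pulling email addresses and subdomains for one domain out of raw text, such as a search result page. It is an illustration only: the function name `extract_emails_and_hosts` and the regular expressions are assumptions made for this example, not theHarvester's own parser.

```python
import re


def extract_emails_and_hosts(text, domain):
    """Return (emails, hosts) found in text that belong to the given domain."""
    escaped = re.escape(domain)
    # Reject matches that continue into a longer name, e.g. example.com.evil.net
    tail = r"(?!\.?[A-Za-z0-9-])"
    email_re = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + escaped + tail, re.IGNORECASE
    )
    # The lookbehind skips the domain part of an email address
    host_re = re.compile(
        r"(?<![A-Za-z0-9.@-])(?:[A-Za-z0-9-]+\.)+" + escaped + tail, re.IGNORECASE
    )
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return emails, hosts


sample = (
    "Contact alice@example.com or Bob@mail.example.com. "
    "See https://www.example.com/about and dev.api.example.com; "
    "ignore www.example.com.evil.net and example.org."
)
emails, hosts = extract_emails_and_hosts(sample, "example.com")
print(emails)
print(hosts)
```

Results are lowercased and deduplicated so that the same address reported by several sources is counted once.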
Pros
- Comprehensive data collection from various sources
- Easy-to-use command-line interface
- Actively maintained and regularly updated
- Supports both passive and active information gathering techniques
Cons
- Some data sources require API keys or subscriptions
- Results may vary depending on the target and available public information
- Can be noisy and potentially detectable by target organizations
- May require additional tools for result analysis and visualization
Getting Started
- Install theHarvester:
git clone https://github.com/laramies/theHarvester.git
cd theHarvester
python3 -m pip install -r requirements/base.txt
- Basic usage:
python3 theHarvester.py -d example.com -b all
This command searches for information related to "example.com" using all available data sources.
- Specify data sources and limit results:
python3 theHarvester.py -d example.com -b bing,duckduckgo,dnsdumpster -l 100
This command uses Bing, DuckDuckGo, and DNSDumpster as sources (all three appear in the passive module list in the README below), limiting search results to 100.
- Save results to files:
python3 theHarvester.py -d example.com -b all -f output_file
This command saves the results to files with the prefix "output_file" in various formats (HTML, XML, JSON).
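When theHarvester is driven from a script, the commands above can be assembled programmatically. The following is a small sketch that uses only the `-d`, `-b`, `-l`, and `-f` flags shown in this section; the helper name `build_harvester_command` and its parameters are hypothetical, and the script path assumes you run it from the cloned repository.

```python
import shlex


def build_harvester_command(domain, sources=("all",), limit=None, output_prefix=None):
    """Assemble an argv list using only the -d, -b, -l and -f flags shown above."""
    if not domain:
        raise ValueError("domain is required")
    argv = ["python3", "theHarvester.py", "-d", domain, "-b", ",".join(sources)]
    if limit is not None:
        argv += ["-l", str(limit)]
    if output_prefix:
        argv += ["-f", output_prefix]
    return argv


cmd = build_harvester_command(
    "example.com", ["bing", "dnsdumpster"], limit=100, output_prefix="output_file"
)
# Print a shell-safe version of the command for review before running it
print(shlex.join(cmd))
```

Building an argument list rather than a single string means the result can be passed to `subprocess.run(cmd)` without shell quoting problems.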
Competitor Comparisons
Photon: Incredibly fast crawler designed for OSINT.
Pros of Photon
- More versatile, capable of crawling websites and extracting various types of data
- Faster execution due to multi-threading and asynchronous requests
- User-friendly command-line interface with customizable options
Cons of Photon
- Limited to web-based information gathering
- May require more setup and dependencies
- Less focused on specific OSINT tasks compared to theHarvester
Code Comparison
Photon:
def photon(url, level, threadCount, delay, timeout, headers):
    # Initialization and crawling logic (simplified outline)
    for url in urls:
        # Extract information from each URL
        pass
theHarvester:
def start(self):
    # Initialization (simplified outline)
    for source in self.sources:
        # Gather information from each source
        pass
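The per-source loop outlined in the theHarvester snippet can be sketched as a runnable example. This is an assumption-laden illustration, not theHarvester's actual code: `StaticSource` and `gather_hosts` are hypothetical names, and the sources return canned data instead of querying the network. It shows one way to query several sources concurrently with `asyncio.gather` and merge the results while tolerating a failing source.

```python
import asyncio


class StaticSource:
    """Hypothetical stand-in for a data source; returns canned hostnames."""

    def __init__(self, name, hosts):
        self.name = name
        self._hosts = hosts

    async def search(self, domain):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {h for h in self._hosts if h.endswith("." + domain)}


async def gather_hosts(domain, sources):
    """Query every source concurrently and merge the hostnames they return."""
    results = await asyncio.gather(
        *(s.search(domain) for s in sources), return_exceptions=True
    )
    merged, failed = set(), []
    for source, result in zip(sources, results):
        if isinstance(result, Exception):
            failed.append(source.name)  # record the failure, keep other results
        else:
            merged |= result
    return sorted(merged), failed


sources = [
    StaticSource("alpha", ["www.example.com", "mail.example.com", "www.other.org"]),
    StaticSource("beta", ["api.example.com", "www.example.com"]),
]
hosts, failed = asyncio.run(gather_hosts("example.com", sources))
print(hosts, failed)
```

Passing `return_exceptions=True` keeps one unreachable source from discarding the results of the others.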
Summary
Photon is a versatile web crawler and information gathering tool, while theHarvester focuses on collecting email addresses, subdomains, and other specific OSINT data. Photon offers faster execution and more customization options but may require additional setup. theHarvester provides a more targeted approach to OSINT tasks and is easier to use out of the box. The choice between the two depends on the specific requirements of the information gathering task at hand.
Sublist3r: Fast subdomains enumeration tool for penetration testers
Pros of Sublist3r
- Faster subdomain enumeration due to multi-threading
- Supports more search engines and sources for subdomain discovery
- Provides a clean, easy-to-read output format
Cons of Sublist3r
- Less actively maintained compared to theHarvester
- Fewer overall features and data sources for information gathering
- Limited to subdomain enumeration, while theHarvester offers broader OSINT capabilities
Code Comparison
Sublist3r:
def main(domain, threads, savefile, ports, silent, verbose, enable_bruteforce, engines):
    bruteforce_list = []
    subdomains = []
    search_list = []
    # Rest of the code...
theHarvester:
async def start(self):
    self.domain = self.domain.strip()
    self.emails = []
    self.hosts = []
    self.results = []
    # Rest of the code...
Both projects use Python and have similar main function structures. However, Sublist3r focuses on subdomain enumeration with multi-threading, while theHarvester has a broader scope for information gathering.
Sublist3r is more specialized for subdomain discovery, making it potentially more efficient for that specific task. theHarvester, on the other hand, offers a wider range of OSINT capabilities, making it more versatile for general reconnaissance.
recon-ng: Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.
Pros of recon-ng
- More comprehensive and modular framework for reconnaissance
- Supports a wider range of data sources and modules
- Offers a command-line interface with interactive shell capabilities
Cons of recon-ng
- Steeper learning curve due to its more complex structure
- Requires more setup and configuration compared to theHarvester
Code Comparison
theHarvester:
from theHarvester.discovery import *
from theHarvester.discovery.constants import *
search = googlesearch.search_google(word, limit, start)
recon-ng:
from recon.core.module import BaseModule
class Module(BaseModule):
    def module_run(self):
        self.query('SELECT * FROM domains WHERE domain LIKE ?', ('%{}%'.format(self.options['domain']),))
Both tools are written in Python, but recon-ng has a more structured approach with modules and a core framework. theHarvester focuses on simpler, direct searches using various discovery methods. recon-ng offers a more extensible and customizable platform for reconnaissance tasks, while theHarvester provides a straightforward tool for gathering open-source intelligence.
SpiderFoot: SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
Pros of SpiderFoot
- More comprehensive and feature-rich OSINT platform
- User-friendly web interface for easier operation
- Supports a wider range of data sources and modules
Cons of SpiderFoot
- Steeper learning curve due to its complexity
- Requires more system resources to run effectively
Code Comparison
SpiderFoot:
class SpiderFootPlugin(object):
    def __init__(self, options):
        self._opts = options
    def setup(self):
        pass
    def enrichTarget(self, target):
        pass
theHarvester:
class Plugin:
    def __init__(self, word):
        self.word = word
        self.results = []
        self.totalresults = []
    def do_search(self):
        pass
Key Differences
- SpiderFoot offers a more modular and extensible architecture
- theHarvester is more focused on email and domain harvesting
- SpiderFoot provides a broader range of OSINT capabilities
- theHarvester is generally easier to use for beginners
- SpiderFoot has a more active development community
Both tools are valuable for OSINT, but SpiderFoot is more suitable for comprehensive investigations, while theHarvester excels in quick email and domain reconnaissance.
Sherlock: Hunt down social media accounts by username across social networks
Pros of Sherlock
- Focuses specifically on finding usernames across multiple social networks and websites
- Supports a larger number of sites (350+) compared to theHarvester
- Provides a more user-friendly command-line interface with colorful output
Cons of Sherlock
- Limited to username searches, while theHarvester offers broader information gathering capabilities
- May produce more false positives due to its wide-ranging search across numerous platforms
- Lacks some of the advanced features found in theHarvester, such as DNS brute forcing and Shodan search
Code Comparison
Sherlock:
def sherlock(username, site_data, timeout=60):
    results = {}
    for social_network, net_info in site_data.items():
        results[social_network] = {"url_main": net_info.get("urlMain")}
        # Fill the username into the site's profile URL template
        url = net_info["url"].format(username)
        results[social_network]["url"] = url
        # Simplified: the existence check against the site is omitted here
        results[social_network]["exists"] = "yes"
    return results
theHarvester:
async def search(self, domain: str) -> None:
    self.domain = domain
    url = f'https://api.github.com/search/code?q="{domain}"'
    async with aiohttp.ClientSession(headers=self.headers) as session:
        async with session.get(url) as resp:
            self.results = await resp.json()
Both tools are useful for OSINT purposes, but Sherlock is more specialized for username searches across social media platforms, while theHarvester offers a broader range of information gathering capabilities for domains and organizations.
Subfinder: Fast passive subdomain enumeration tool.
Pros of Subfinder
- Faster subdomain enumeration with concurrent processing
- More extensive list of supported sources for subdomain discovery
- Better integration with other tools in the ProjectDiscovery ecosystem
Cons of Subfinder
- More focused on subdomain enumeration, less versatile for general OSINT
- Steeper learning curve for advanced features and configuration
- May require additional tools for comprehensive information gathering
Code Comparison
theHarvester:
from theHarvester.discovery import *
from theHarvester.discovery.constants import *
search = googlesearch.search_google(word, limit, start)
Subfinder:
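package main

import (
	"github.com/projectdiscovery/subfinder/v2/pkg/runner"
)

func main() {
	// Simplified: configure the runner with a few passive sources.
	options := &runner.Options{
		Threads: 10,
		Timeout: 30,
		Sources: []string{"alienvault", "bufferover", "crtsh"},
	}
	_ = options // a real program would hand these options to the runner
}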
theHarvester is written in Python and offers a more modular approach for various search engines and data sources. Subfinder, written in Go, focuses on efficient subdomain enumeration with concurrent processing.
Both tools are valuable for reconnaissance, but Subfinder excels in rapid subdomain discovery, while theHarvester provides a broader range of OSINT capabilities. The choice between them depends on the specific requirements of your information gathering tasks.
README
What is this?
theHarvester is a simple to use, yet powerful tool designed to be used during the reconnaissance stage of a red
team assessment or penetration test. It performs open source intelligence (OSINT) gathering to help determine
a domain's external threat landscape. The tool gathers names, emails, IPs, subdomains, and URLs by using
multiple public resources that include:
Passive modules:
- anubis: Anubis-DB - https://github.com/jonluca/anubis
- bevigil: CloudSEK BeVigil scans mobile applications for OSINT assets (Requires an API key, see below.) - https://bevigil.com/osint-api
- baidu: Baidu search engine - www.baidu.com
- binaryedge: List of known subdomains (Requires an API key, see below.) - https://www.binaryedge.io
- bing: Microsoft search engine - https://www.bing.com
- bingapi: Microsoft search engine, through the API (Requires an API key, see below.)
- brave: Brave search engine - https://search.brave.com/
- bufferoverun: (Requires an API key, see below.) - https://tls.bufferover.run
- censys: Censys search engine, uses certificate searches to enumerate subdomains and gather emails (Requires an API key, see below.) - https://censys.io
- certspotter: Cert Spotter monitors Certificate Transparency logs - https://sslmate.com/certspotter/
- criminalip: Specialized Cyber Threat Intelligence (CTI) search engine (Requires an API key, see below.) - https://www.criminalip.io
- crtsh: Comodo Certificate search - https://crt.sh
- dnsdumpster: DNSdumpster search engine - https://dnsdumpster.com
- duckduckgo: DuckDuckGo search engine - https://duckduckgo.com
- fullhunt: Next-generation attack surface security platform (Requires an API key, see below.) - https://fullhunt.io
- github-code: GitHub code search engine (Requires a GitHub Personal Access Token, see below.) - www.github.com
- hackertarget: Online vulnerability scanners and network intelligence to help organizations - https://hackertarget.com
- hunter: Hunter search engine (Requires an API key, see below.) - https://hunter.io
- hunterhow: Internet search engines for security researchers (Requires an API key, see below.) - https://hunter.how
- intelx: Intelx search engine (Requires an API key, see below.) - http://intelx.io
- netlas: A Shodan or Censys competitor (Requires an API key, see below.) - https://app.netlas.io
- onyphe: Cyber defense search engine (Requires an API key, see below.) - https://www.onyphe.io/
- otx: AlienVault open threat exchange - https://otx.alienvault.com
- pentestTools: Cloud-based toolkit for offensive security testing, focused on web applications and network penetration testing (Requires an API key, see below.) - https://pentest-tools.com/
- projectDiscovery: We actively collect and maintain internet-wide assets data, to enhance research and analyse changes around DNS for better insights (Requires an API key, see below.) - https://chaos.projectdiscovery.io
- rapiddns: DNS query tool which makes querying subdomains or sites on the same IP easy - https://rapiddns.io
- rocketreach: Access real-time verified personal/professional emails, phone numbers, and social media links (Requires an API key, see below.) - https://rocketreach.co
- securityTrails: Security Trails search engine, the world's largest repository of historical DNS data (Requires an API key, see below.) - https://securitytrails.com
- -s, --shodan: Shodan search engine will search for ports and banners from discovered hosts (Requires an API key, see below.) - https://shodan.io
- sitedossier: Find available information on a site - http://www.sitedossier.com
- subdomaincenter: A subdomain finder tool used to find subdomains of a given domain - https://www.subdomain.center/
- subdomainfinderc99: A subdomain finder is a tool used to find the subdomains of a given domain - https://subdomainfinder.c99.nl
- threatminer: Data mining for threat intelligence - https://www.threatminer.org/
- tomba: Tomba search engine (Requires an API key, see below.) - https://tomba.io
- urlscan: A sandbox for the web that is a URL and website scanner - https://urlscan.io
- vhost: Bing virtual hosts search
- virustotal: Domain search (Requires an API key, see below.) - https://www.virustotal.com
- yahoo: Yahoo search engine
- zoomeye: China's version of Shodan (Requires an API key, see below.) - https://www.zoomeye.org
Active modules:
- DNS brute force: dictionary brute force enumeration
- Screenshots: Take screenshots of subdomains that were found
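As an illustration of the dictionary brute force idea listed above, the sketch below prepends each word from a wordlist to the target domain and keeps the names that resolve. It is a minimal example under stated assumptions, not theHarvester's implementation: `brute_force_subdomains`, `fake_resolve`, and `FAKE_ZONE` are hypothetical names, and the demonstration uses a fake resolver so that no DNS queries are sent.

```python
import socket


def brute_force_subdomains(domain, words, resolve=socket.gethostbyname):
    """Try word.domain for each dictionary word; keep the names that resolve."""
    found = {}
    for word in words:
        label = word.strip().lower()
        if not label:
            continue  # skip blank dictionary lines
        host = f"{label}.{domain}"
        try:
            found[host] = resolve(host)
        except OSError:  # socket.gaierror (name not found) is an OSError
            continue
    return found


# Offline demonstration with a hypothetical resolver, so no DNS traffic is sent.
FAKE_ZONE = {"www.example.com": "192.0.2.10", "mail.example.com": "192.0.2.25"}


def fake_resolve(host):
    try:
        return FAKE_ZONE[host]
    except KeyError:
        raise socket.gaierror(f"no record for {host}")


result = brute_force_subdomains("example.com", ["www", "Mail ", "", "vpn"], fake_resolve)
print(result)
```

Leaving `resolve` at its default makes real DNS lookups, which is active reconnaissance and should only be run against targets you are authorized to test.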
Modules that require an API key:
Documentation to setup API keys can be found at - https://github.com/laramies/theHarvester/wiki/Installation#api-keys
- bevigil - Free up to 50 queries. Pricing can be found here: https://bevigil.com/pricing/osint
- binaryedge - $10/month
- bing
- bufferoverun - uses the free API
- censys - API keys are required and can be retrieved from your Censys account.
- criminalip
- fullhunt
- github
- hunter - limited to 10 results on the free plan, so you will need to pass the -l 10 switch
- hunterhow
- intelx
- netlas - $
- onyphe - $
- pentestTools - $
- projectDiscovery - invite only for now
- rocketreach - $
- securityTrails
- shodan - $
- tomba - Free up to 50 searches.
- zoomeye
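Before a run, it can help to check which requested sources still need a key. The sketch below is a hypothetical helper, not part of theHarvester: the set is transcribed (lowercased) from the passive modules marked "Requires an API key" in this README, so the names may not match the CLI's accepted source names exactly. Shodan is left out because the README lists it as the `-s, --shodan` switch rather than as a source name.

```python
# Transcribed from the modules marked "Requires an API key" in this README.
API_KEY_SOURCES = {
    "bevigil", "binaryedge", "bingapi", "bufferoverun", "censys", "criminalip",
    "fullhunt", "github-code", "hunter", "hunterhow", "intelx", "netlas",
    "onyphe", "pentesttools", "projectdiscovery", "rocketreach",
    "securitytrails", "tomba", "virustotal", "zoomeye",
}


def split_sources(requested, configured_keys=()):
    """Partition requested source names into (runnable, missing_key) lists."""
    configured = {k.strip().lower() for k in configured_keys}
    runnable, missing = [], []
    for name in requested:
        key = name.strip().lower()
        if key in API_KEY_SOURCES and key not in configured:
            missing.append(key)  # needs a key that has not been configured
        else:
            runnable.append(key)
    return runnable, missing


runnable, missing = split_sources(
    ["bing", "hunter", "crtsh", "Censys"], configured_keys=["hunter"]
)
print("runnable:", runnable)
print("missing keys:", missing)
```

The runnable list can then be joined with commas and passed to `-b`.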
Install and dependencies:
Comments, bugs, and requests:
- Christian Martorella @laramies cmartorella@edge-security.com
Main contributors:
- Matthew Brown @NotoriousRebel1
- Jay "L1ghtn1ng" Townsend @jay_townsend1
Thanks:
- John Matherly - Shodan project
- Ahmed Aboul Ela - subdomain names dictionaries (big and small)