Top Related Projects
E-mails, subdomains and names Harvester - OSINT
SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.
Fast subdomains enumeration tool for penetration testers
Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl.
Find domains and subdomains related to a given domain
Quick Overview
Photon is an open-source intelligence (OSINT) automation tool designed for fast and comprehensive web reconnaissance. It crawls websites to gather information such as URLs, emails, social media accounts, and more, making it a valuable asset for security researchers and penetration testers.
Pros
- Fast and efficient crawling with multi-threading support
- Extensive data extraction capabilities, including URLs, emails, social media accounts, and more
- Customizable output formats (JSON, CSV, TXT) for easy integration with other tools
- Active development and community support
Cons
- May trigger website security measures if not used carefully
- Requires Python knowledge for advanced customization
- Limited documentation for some advanced features
- Potential for misuse if not used responsibly
Code Examples
Photon is a command-line tool rather than an importable Python library, so the examples below use its CLI flags:
- Basic usage to crawl a website:
python3 photon.py -u "https://example.com"
- Tuning crawl depth, thread count and delay:
python3 photon.py -u "https://example.com" -l 3 -t 10 -d 1
- Exporting results as JSON and seeding the crawl with URLs from archive.org:
python3 photon.py -u "https://example.com" --export=json --wayback
Getting Started
To get started with Photon, follow these steps:
- Clone the repository and install its requirements (Photon is not distributed on PyPI):
git clone https://github.com/s0md3v/Photon.git
cd Photon
pip3 install -r requirements.txt
- Run Photon against a target URL:
python3 photon.py -u "https://example.com"
- Review the results, which Photon saves to a directory named after the target.
For more advanced usage and configuration options, refer to the official documentation on the GitHub repository.
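Photon writes its results to a per-target directory and can export them as JSON. As a small post-processing sketch (the export's exact key names are an assumption here; inspect your own export to confirm its layout), counting items per category:

```python
import json

def summarize_export(path):
    """Count items per category in a Photon JSON export.

    Assumes the export is a JSON object mapping category names to lists;
    inspect your own export to confirm its exact layout.
    """
    with open(path) as f:
        data = json.load(f)
    return {key: len(values) for key, values in data.items()}

# The same logic applied to an in-memory sample, so the sketch runs offline:
sample = {"urls": ["https://example.com/a", "https://example.com/b"],
          "emails": ["admin@example.com"]}
counts = {key: len(values) for key, values in sample.items()}
print(counts)
```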
Competitor Comparisons
E-mails, subdomains and names Harvester - OSINT
Pros of theHarvester
- Broader scope of information gathering, including email addresses, subdomains, and more
- Supports multiple search engines and data sources
- Actively maintained with regular updates
Cons of theHarvester
- Slower execution compared to Photon
- Less focused on web crawling and content extraction
- May require additional dependencies for full functionality
Code Comparison
Photon:
def extract_links(self, soup, name):
    links = soup.find_all('a')
    for link in links:
        href = link.get('href')
        if href:
            self.links.add(href)
theHarvester:
def get_emails(self):
    rawres = myparser.Parser(self.totalresults, self.word)
    return rawres.emails()
The code snippets show that Photon focuses on extracting links from web pages, while theHarvester emphasizes parsing and extracting specific information like email addresses from search results.
Both tools serve different purposes within the realm of information gathering and reconnaissance. Photon excels at web crawling and content extraction, while theHarvester offers a broader range of information gathering capabilities across multiple sources.
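Both tools ultimately come down to parsing fetched content. As an illustration of the link-extraction side, here is a standard-library-only sketch in the spirit of the Photon snippet above (Photon's real implementation differs):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags using only the standard library."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)

collector = LinkCollector()
collector.feed('<a href="/about">About</a> <a href="https://example.com">Home</a>')
print(sorted(collector.links))
```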
SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
Pros of Spiderfoot
- More comprehensive OSINT tool with a wider range of modules and data sources
- Provides a web-based GUI for easier interaction and visualization of results
- Supports automation and integration with other tools through its API
Cons of Spiderfoot
- Steeper learning curve due to its extensive features and configuration options
- Requires more system resources and setup time compared to Photon
Code Comparison
Photon (Python):
def photon(url, level, threadCount):
    processed = set()
    storage = set()
    forms = set()
    processed.add(url)
    # ... (rest of the function)
Spiderfoot (Python):
class SpiderFootPlugin(object):
    def __init__(self, options):
        self.sf = SpiderFoot(options)
        self.results = dict()
        self.errorState = False
    # ... (rest of the class)
Both projects are written in Python, but Spiderfoot has a more modular structure with plugins, while Photon has a more straightforward approach. Spiderfoot's code is organized around a plugin system, allowing for easier extensibility, whereas Photon's code is more focused on specific crawling and information gathering tasks.
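To make the architectural contrast concrete, here is a minimal sketch of a plugin registry in the SpiderFoot style; the names are illustrative, not SpiderFoot's actual API:

```python
import re

# Modules register themselves in a table and a driver runs whichever are
# enabled -- the extensible design the comparison above describes.
PLUGINS = {}

def plugin(name):
    def register(func):
        PLUGINS[name] = func
        return func
    return register

@plugin("emails")
def find_emails(page):
    return set(re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", page))

@plugin("links")
def find_links(page):
    return set(re.findall(r'href="([^"]+)"', page))

page = '<a href="/contact">mail admin@example.com</a>'
results = {name: run(page) for name, run in PLUGINS.items()}
print(results)
```

Adding a new capability means writing one decorated function; the driver loop never changes, which is why plugin systems scale to many data sources.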
Open Source Intelligence gathering tool aimed at reducing the time spent harvesting information from open sources.
Pros of recon-ng
- More comprehensive and modular framework for reconnaissance
- Supports a wide range of modules for various recon tasks
- Integrates with multiple external APIs and services
Cons of recon-ng
- Steeper learning curve due to its complexity
- Requires more setup and configuration
- May be overkill for simpler web reconnaissance tasks
Code Comparison
Photon example (command line):
$ python3 photon.py -u https://example.com
recon-ng example (interactive console):
[recon-ng][default] > modules load recon/domains-hosts/google_site_web
[recon-ng][default][google_site_web] > run
Summary
Photon is a lightweight, easy-to-use web crawler and information gathering tool, while recon-ng is a more comprehensive reconnaissance framework. Photon is better suited for quick web scraping and basic information gathering, whereas recon-ng offers a broader range of capabilities for in-depth reconnaissance tasks. The choice between the two depends on the specific requirements of the project and the user's expertise level.
Fast subdomains enumeration tool for penetration testers
Pros of Sublist3r
- Specialized in subdomain enumeration, providing more focused results
- Utilizes multiple search engines and sources for comprehensive subdomain discovery
- Supports multithreading for faster scanning
Cons of Sublist3r
- Limited to subdomain enumeration, lacking broader web crawling capabilities
- Less actively maintained, with fewer recent updates compared to Photon
- Doesn't offer features like data extraction or JavaScript analysis
Code Comparison
Sublist3r (subdomain enumeration):
def main(domain, threads, savefile, ports, silent, verbose, enable_bruteforce, engines):
    bruteforce_list = []
    subdomains = []
    search_list = []
    # ... (subdomain enumeration logic)
Photon (web crawling and information gathering):
def photon(seedUrl, headers, depth, threadCount, timeout, delay, cookie):
    requests.packages.urllib3.disable_warnings()
    dataset = set()
    processed = set()
    # ... (web crawling and data extraction logic)
The code snippets highlight the different focus areas of each tool. Sublist3r concentrates on subdomain enumeration, while Photon offers broader web crawling and information gathering capabilities.
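The practical difference shows up in where subdomains come from: Sublist3r queries external sources, while a crawler can only surface hosts that appear in the pages it visits. A sketch of pulling subdomains out of crawled URLs (the helper name is ours, not from either tool):

```python
from urllib.parse import urlparse

def subdomains_seen(urls, root):
    """Return hosts under `root` observed in a list of crawled URLs."""
    found = set()
    for url in urls:
        host = urlparse(url).hostname or ""
        if host.endswith("." + root):
            found.add(host)
    return found

crawled = [
    "https://blog.example.com/post/1",
    "https://api.example.com/v1/users",
    "https://example.com/about",
    "https://other.org/",
]
print(sorted(subdomains_seen(crawled, "example.com")))
```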
Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl.
Pros of gau
- Faster execution due to its focus on URL discovery
- Supports multiple input formats (stdin, file, URL)
- Can output results in JSON format for easier parsing
Cons of gau
- Limited functionality compared to Photon's broader feature set
- Lacks built-in crawling capabilities
- Does not perform content analysis or extraction
Code Comparison
Photon (command line):
$ python3 photon.py -u https://example.com --export=json
gau (command line):
$ echo example.com | gau --json
Summary
Photon is a more comprehensive web reconnaissance tool that offers crawling, content analysis, and information extraction. It's suitable for in-depth analysis of a target website.
gau focuses specifically on URL discovery, leveraging various sources to find URLs associated with a domain. It's faster and more specialized but lacks the broader feature set of Photon.
Choose Photon for thorough website analysis and information gathering, or gau for rapid URL discovery and enumeration. The selection depends on the specific requirements of your project or security assessment.
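gau's sources include the Wayback Machine's CDX API, which can return known URLs for a domain as JSON. A sketch of parsing such a response, canned here so it runs offline (the real endpoint is http://web.archive.org/cdx/search/cdx?url=example.com/*&output=json&fl=original):

```python
# Canned CDX response in "output=json" form: a header row followed by
# one row per capture (trimmed here to the "original" field only).
cdx_response = [
    ["original"],
    ["http://example.com/login"],
    ["http://example.com/gallery.php?id=2"],
    ["http://example.com/login"],
]
header, *rows = cdx_response
urls = sorted({row[0] for row in rows})  # de-duplicate repeated captures
print(urls)
```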
Find domains and subdomains related to a given domain
Pros of assetfinder
- Lightweight and fast, focusing solely on subdomain discovery
- Written in Go, making it easy to compile and distribute as a single binary
- Utilizes multiple data sources for comprehensive subdomain enumeration
Cons of assetfinder
- Limited functionality compared to Photon's broader web reconnaissance capabilities
- Lacks the ability to extract additional information like emails, social media accounts, etc.
- Does not perform crawling or content analysis
Code Comparison
assetfinder:
func main() {
domain := flag.String("domain", "", "The domain to find assets for")
flag.Parse()
for result := range assetfinder.Run(*domain) {
fmt.Println(result)
}
}
Photon:
def main():
args = parser.parse_args()
target = args.url
crawl(target, args)
if args.dns:
dnsdumpster(target)
if args.export:
exporter(args.export)
Summary
assetfinder is a focused tool for subdomain discovery, offering speed and simplicity. It's ideal for quick reconnaissance but lacks the comprehensive features of Photon. Photon, on the other hand, provides a more extensive set of web reconnaissance capabilities, including crawling, content analysis, and information extraction. The choice between the two depends on the specific needs of the user and the depth of information required for the task at hand.
README
Photon
Incredibly fast crawler designed for OSINT.
Photon Wiki • How To Use • Compatibility • Photon Library • Contribution • Roadmap
Key Features
Data Extraction
Photon can extract the following data while crawling:
- URLs (in-scope & out-of-scope)
- URLs with parameters (example.com/gallery.php?id=2)
- Intel (emails, social media accounts, amazon buckets etc.)
- Files (pdf, png, xml etc.)
- Secret keys (auth/API keys & hashes)
- JavaScript files & Endpoints present in them
- Strings matching custom regex pattern
- Subdomains & DNS related data
The extracted information is saved in an organized manner or can be exported as json.
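Much of this extraction reduces to running regex patterns over fetched text. The patterns below are simplified stand-ins for illustration, not Photon's actual regexes:

```python
import re

# Illustrative patterns for two of the "intel" categories listed above;
# real-world patterns need to be considerably more careful.
PATTERNS = {
    "emails": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "aws_buckets": r"[\w.-]+\.s3\.amazonaws\.com",
}

def extract_intel(text):
    """Apply each pattern to the text and return sorted, de-duplicated hits."""
    return {name: sorted(set(re.findall(pat, text)))
            for name, pat in PATTERNS.items()}

page = "Contact admin@example.com; assets at media.example.s3.amazonaws.com"
print(extract_intel(page))
```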
Flexible
Control timeout, delay, add seeds, exclude URLs matching a regex pattern and other cool stuff. The extensive range of options provided by Photon lets you crawl the web exactly the way you want.
Genius
Photon's smart thread management & refined logic give you top-notch performance.
Still, crawling can be resource intensive, but Photon has some tricks up its sleeve. You can fetch URLs archived by archive.org to use as seeds with the --wayback option.
Plugins
Docker
Photon can be launched using a lightweight Python-Alpine (103 MB) Docker image.
$ git clone https://github.com/s0md3v/Photon.git
$ cd Photon
$ docker build -t photon .
$ docker run -it --name photon photon:latest -u google.com
To view the results, you can either head over to the local docker volume, which you can find by running docker inspect photon, or mount the target loot folder:
$ docker run -it --name photon -v "$PWD:/Photon/google.com" photon:latest -u google.com
Frequent & Seamless Updates
Photon is under heavy development; updates fixing bugs, optimizing performance & adding new features are rolled out regularly.
If you would like to see the features and issues being worked on, check the Development project board.
Updates can be checked for & installed with the --update option. Photon has seamless update capabilities, which means you can update it without losing any of your saved data.
Contribution & License
You can contribute in the following ways:
- Report bugs
- Develop plugins
- Add more "APIs" for ninja mode
- Give suggestions to make it better
- Fix issues & submit a pull request
Please read the guidelines before submitting a pull request or issue.
Do you want to have a conversation in private? Hit me up on my twitter, inbox is open :)
Photon is licensed under the GPL v3.0 license.