Gau vs Katana
Detailed comparison of features, pros, cons, and usage
gau is a lightweight tool focused specifically on fetching URLs from various external sources, while katana is a more comprehensive web crawling and spidering framework. As a result, gau is simpler and faster for basic URL discovery, whereas katana is more powerful for in-depth web application analysis.
gau's project description: "Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl."
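In practice the two complement each other: gau handles passive URL discovery while katana actively crawls the target. A rough combined workflow might look like this (flags taken from each project's README; file names are illustrative):
gau --subs example.com > passive-urls.txt
katana -u https://example.com -d 2 -silent > crawled-urls.txt
sort -u passive-urls.txt crawled-urls.txt > all-urls.txt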
Gau Pros and Cons
Pros
- Fast and efficient URL fetching from various sources
- Supports multiple input formats and output options
- Easy to integrate into existing workflows and scripts
- Actively maintained with regular updates and improvements
Cons
- Limited to URL discovery, not a full-featured web crawler
- May require additional tools for more comprehensive web reconnaissance
- Can potentially generate a large amount of data, requiring careful management (see the filtering sketch after this list)
- Learning curve for advanced features and optimal usage
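For the data-volume concern above, one sketch of keeping gau's output manageable (flags per the gau README; the extension list is illustrative) is to blacklist static assets and de-duplicate the results:
gau --blacklist png,jpg,gif,css,woff,svg example.com | sort -u > urls.txt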
Katana Pros and Cons
Pros
- Fast and efficient web crawling: Katana is designed for high-speed crawling and spidering of web applications, making it suitable for large-scale reconnaissance tasks.
- Customizable and extensible: The tool offers various configuration options and supports custom scripts, allowing users to tailor the crawling process to their specific needs.
- Integration with other tools: Katana can be easily integrated into existing security workflows and pipelines, enhancing the overall effectiveness of security assessments.
- Active development and community support: The project is actively maintained and has a growing community, ensuring regular updates and improvements.
Cons
- Learning curve: Users may need to invest time in understanding the tool's configuration options and best practices to fully leverage its capabilities.
- Potential for aggressive crawling: if not properly configured, Katana's high-speed crawling could overwhelm target systems or trigger security alerts (see the throttling sketch after this list).
- Limited built-in analysis features: While excellent for crawling and data collection, Katana may require additional tools for in-depth analysis of the gathered information.
- Dependency on external libraries: The tool relies on various external dependencies, which may require additional setup or maintenance in some environments.
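For the aggressive-crawling concern above, a conservative invocation (flag names from katana's CLI help; the values are illustrative) limits depth, concurrency, and request rate:
katana -u https://example.com -d 2 -c 5 -rate-limit 10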
Gau Code Examples
Basic Usage
These commands demonstrate how to use gau to fetch URLs for one or more domains; a list of domains can also be piped in via stdin:
gau example.com
gau example.com example2.com
cat domains.txt | gau
Advanced Filtering
Here's an example of using gau with advanced filtering options, filtering out status codes, matching MIME types, skipping noisy extensions, and bounding results by date:
gau --fc 404,302 --mt text/html,application/json example.com
gau --subs --blacklist ttf,woff,svg,png example.com
gau --from 202201 --to 202312 example.com
Katana Code Examples
Basic Usage
This snippet, modeled on the library example in the katana repository, shows how to crawl a site and print each discovered URL; results are delivered through the OnResult callback (error handling is omitted for brevity):
package main

import (
	"math"

	"github.com/projectdiscovery/gologger"
	"github.com/projectdiscovery/katana/pkg/engine/standard"
	"github.com/projectdiscovery/katana/pkg/output"
	"github.com/projectdiscovery/katana/pkg/types"
)

func main() {
	// Crawling behaviour is configured via types.Options; OnResult receives each discovered result.
	opts, _ := types.NewCrawlerOptions(&types.Options{
		MaxDepth:     2,
		FieldScope:   "rdn",
		BodyReadSize: math.MaxInt,
		OnResult:     func(r output.Result) { gologger.Info().Msg(r.Request.URL) },
	})
	defer opts.Close()

	crawler, _ := standard.New(opts)
	defer crawler.Close()

	// Crawl blocks until the crawl of the seed URL completes.
	_ = crawler.Crawl("https://example.com")
}
Custom Configuration
Here's an example of configuring the crawler with custom options; the fields below follow katana's types.Options struct, and the values are illustrative:
options := &types.Options{
	MaxDepth:     3,               // maximum crawl depth
	FieldScope:   "rdn",           // restrict crawling to the root domain name
	BodyReadSize: 2 * 1024 * 1024, // maximum response body size to read
	Timeout:      10,              // per-request timeout in seconds
	RateLimit:    100,             // maximum requests per second
	Concurrency:  10,              // concurrent crawling goroutines
	Parallelism:  10,              // URL-processing goroutines
	Delay:        0,               // delay between requests in seconds
	Strategy:     "depth-first",   // depth-first or breadth-first visiting
	OnResult: func(r output.Result) {
		gologger.Info().Msg(r.Request.URL)
	},
}
crawlerOptions, _ := types.NewCrawlerOptions(options)
defer crawlerOptions.Close()
crawler, _ := standard.New(crawlerOptions)
Gau Quick Start
Installation
To install gau, follow these steps:
- Ensure you have Go installed on your system.
- Run the following command to install gau:
go install github.com/lc/gau/v2/cmd/gau@latest
- Verify the installation by running:
gau --version
Basic Usage
Here's a quick guide to get started with gau:
- To fetch URLs for a single domain:
gau example.com
- To fetch URLs for multiple domains:
gau example.com example2.com
- To save the output to a file:
gau example.com --o output.txt
- To use specific providers (e.g., wayback, otx, commoncrawl):
gau --providers wayback,otx example.com
- To exclude specific file extensions:
gau example.com --blacklist jpg,png,gif
For more advanced usage and options, refer to the project's documentation or run gau --help.
Katana Quick Start
Installation
To install Katana, follow these steps:
- Ensure you have Go 1.20 or later installed on your system.
- Run the following command to install Katana:
go install github.com/projectdiscovery/katana/cmd/katana@latest
- Verify the installation by running:
katana -version
Basic Usage
Here's a simple example to get you started with Katana:
- To crawl a single URL:
katana -u https://example.com
- To crawl multiple URLs from a file:
katana -list urls.txt
- To save the results to a file:
katana -u https://example.com -output results.txt
- To use custom headers:
katana -u https://example.com -H "User-Agent: Mozilla/5.0" -H "Cookie: session=123456"
For more advanced usage and options, refer to the official documentation.
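As a small taste of those options (flag names from katana's CLI help; values are illustrative), JavaScript parsing can be enabled and the output restricted to URLs with query strings:
katana -u https://example.com -d 3 -jc -f qurl -o endpoints.txt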
Top Related Projects
waybackurls: Fetch all the URLs that the Wayback Machine knows about for a domain
Pros of waybackurls
- Simple and lightweight tool focused on a single task
- Easy to use with minimal setup required
- Integrates well with Unix pipelines and other tools
Cons of waybackurls
- Limited functionality compared to gau and katana
- Lacks advanced filtering options
- May miss some URLs that more comprehensive tools can find
Code Comparison
waybackurls:
resp, err := http.Get(fmt.Sprintf("http://web.archive.org/cdx/search/cdx?url=%s/*&output=json&fl=original&collapse=urlkey", url))
gau:
for _, source := range sources {
urls, err := source.Fetch(ctx, domain, &opts)
if err != nil {
return err
}
for url := range urls {
// Process URL
}
}
katana:
crawler, err := crawler.New(&crawler.Options{
MaxDepth: options.MaxDepth,
FieldScope: options.FieldScope,
BodyReadSize: options.BodyReadSize,
// ... other options
})
The code snippets show that waybackurls focuses on fetching URLs from the Wayback Machine, while gau and katana offer more comprehensive crawling and URL discovery capabilities. gau supports multiple sources, and katana provides advanced crawling options, reflecting their broader functionality compared to waybackurls' simpler approach.
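For a concrete feel of the difference, waybackurls is driven entirely from stdin and queries a single source, whereas gau can fan out across several providers (a sketch; file names are illustrative):
cat domains.txt | waybackurls > wayback-urls.txt
gau --providers wayback,otx,commoncrawl example.com > gau-urls.txt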
hakrawler: Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
Pros of hakrawler
- Lightweight and fast, with minimal dependencies
- Supports custom headers and cookies for authenticated crawling
- Provides flexible output options (JSON, plain text, etc.)
Cons of hakrawler
- Limited depth control compared to Katana's advanced crawling features
- Lacks some advanced filtering options available in gau
- May miss some URLs that gau can discover through passive sources
Code Comparison
hakrawler:
func crawl(url string, depth int, c *colly.Collector) {
c.Visit(url)
}
gau:
func getUrls(domains []string, providers []string, client *http.Client) {
for _, domain := range domains {
for _, provider := range providers {
provider.Fetch(domain, client)
}
}
}
Katana:
func (c *Crawler) Start() error {
c.navigateURL(c.options.URL, nil)
return c.wait()
}
All three tools serve similar purposes but with different approaches. hakrawler is simpler and faster, gau focuses on passive URL discovery, while Katana offers more advanced crawling features. The choice depends on specific use cases and requirements.
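A typical hakrawler run simply reads seed URLs from stdin, which makes it easy to slot into Unix pipelines (a minimal sketch):
echo https://example.com | hakrawler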
Gospider - Fast web spider written in Go
Pros of gospider
- Supports crawling JavaScript files and extracting URLs from them
- Offers concurrent crawling for faster performance
- Provides flexible output options, including JSON and Markdown formats
Cons of gospider
- May have a steeper learning curve compared to gau and katana
- Less frequent updates and maintenance than the other two projects
- Limited built-in filtering options compared to katana
Code Comparison
gospider:
func (s *Spider) Start() error {
s.pool.Run()
var wg sync.WaitGroup
for _, site := range s.C.Sites {
wg.Add(1)
go func(site string) {
defer wg.Done()
s.crawl(site)
}(site)
}
wg.Wait()
return nil
}
gau:
func getUrls(domains []string, providers []string, threads int) {
var wg sync.WaitGroup
for _, domain := range domains {
wg.Add(1)
go func(domain string) {
defer wg.Done()
for _, provider := range providers {
getUrlsProvider(domain, provider)
}
}(domain)
}
wg.Wait()
}
katana:
func (c *Crawler) Start() error {
for _, seed := range c.options.URLs {
if err := c.AddSeed(seed); err != nil {
return err
}
}
c.startWorkers()
return nil
}
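For reference, a typical gospider invocation (flags per the gospider README; values are illustrative) crawls a site with bounded depth and concurrency and writes results to a directory:
gospider -s https://example.com -d 2 -c 10 -o output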
ffuf: Fast web fuzzer written in Go
Pros of ffuf
- Fast and efficient web fuzzing tool
- Highly customizable with numerous options for fine-tuning
- Supports multiple input sources and output formats
Cons of ffuf
- Limited to web fuzzing, not as versatile as Katana or gau
- Requires more manual configuration compared to gau's simplicity
- May generate more noise in results compared to Katana's targeted approach
Code Comparison
ffuf
matcher := ffuf.NewMatcher(options.Matchers, options.Filters, options.MatcherOptions)
for _, word := range wordlist {
result := ffuf.NewResult(url, word, matcher)
if result.IsValid() {
output.WriteResult(result)
}
}
gau
func fetchURLs(domain string, providers []string) ([]string, error) {
var urls []string
for _, provider := range providers {
providerURLs, err := fetchFromProvider(domain, provider)
if err != nil {
return nil, err
}
urls = append(urls, providerURLs...)
}
return urls, nil
}
Katana
func (c *Crawler) Start() error {
for {
select {
case <-c.options.Context.Done():
return nil
case url := <-c.queue:
c.processURL(url)
}
}
}
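For contrast with the discovery-oriented workflows of gau and katana, a typical ffuf run brute-forces paths from a wordlist and matches on status codes (the wordlist path is illustrative):
ffuf -u https://example.com/FUZZ -w /path/to/wordlist.txt -mc 200,301,302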
httpx is a fast and multi-purpose HTTP toolkit that allows running multiple probes using the retryablehttp library.
Pros of httpx
- Fast and efficient HTTP probing and analysis
- Supports multiple protocols (HTTP, HTTPS, HTTP/2)
- Extensive output customization options
Cons of httpx
- Limited URL discovery capabilities compared to gau
- Less comprehensive web crawling features than katana
- Focused primarily on HTTP interactions, not full-scale web exploration
Code Comparison
gau:
urls := make(chan string)
go func() {
	gau.FromCommonCrawl(domain, urls)
gau.FromWaybackMachine(domain, urls)
close(urls)
}()
katana:
crawler, _ := katana.New(&katana.Options{
Depth: 3,
MaxURLs: 1000,
Concurrency: 10,
})
results := crawler.Crawl(url)
httpx:
httpxRunner, _ := httpx.New(&httpx.Options{
Methods: "GET",
Threads: 50,
})
result, _ := httpxRunner.URL(url)
Summary
httpx excels at fast HTTP probing and analysis, supporting multiple protocols with customizable output, but it lacks gau's passive URL discovery sources and katana's crawling depth. In short, httpx is best suited to probing and fingerprinting known hosts, gau to passive URL discovery, and katana to active crawling.
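A common way to combine them is to feed gau's passive discoveries into httpx for probing (a sketch using standard flags from the httpx CLI help):
gau example.com | httpx -silent -status-code -title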
gobuster: Directory/File, DNS and VHost busting tool written in Go
Pros of gobuster
- Simple and straightforward CLI tool for directory/file, DNS, and VHost busting
- Supports multiple wordlists and customizable output formats
- Lightweight and fast, with minimal dependencies
Cons of gobuster
- Limited to specific enumeration tasks (directories, DNS, VHosts)
- Lacks advanced crawling and content discovery features
- May require additional tools for comprehensive web reconnaissance
Code Comparison
gobuster:
func main() {
globalopts, err := parseGlobalOpts()
if err != nil {
fmt.Fprintf(os.Stderr, "%s\n", err)
os.Exit(1)
}
if globalopts.NoStatus {
libgobuster.Ruler = false
}
// ... (additional setup and execution code)
}
gau:
func main() {
flag.Parse()
if *version {
fmt.Printf("gau version: %s\n", Version)
os.Exit(0)
}
// ... (URL processing and output handling)
}
katana:
func main() {
options := &types.Options{}
parseOptions(options)
runner, err := runner.New(options)
if err != nil {
gologger.Fatal().Msgf("Could not create runner: %s\n", err)
}
runner.Run()
}
While gobuster focuses on specific enumeration tasks, gau specializes in URL discovery from various sources, and katana offers more comprehensive web crawling and content discovery features. Each tool serves different purposes within the web reconnaissance ecosystem.
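For completeness, typical gobuster invocations target one enumeration mode at a time (wordlist paths are illustrative):
gobuster dir -u https://example.com -w /path/to/wordlist.txt
gobuster dns -d example.com -w /path/to/subdomains.txt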