
lc/gau

Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl.


Top Related Projects

  • waybackurls: Fetch all the URLs that the Wayback Machine knows about for a domain
  • katana: A next-generation crawling and spidering framework
  • hakrawler: Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
  • gospider: Fast web spider written in Go

Quick Overview

Gau (Get All URLs) is a command-line tool designed to fetch known URLs from various sources for a given domain. It's particularly useful for web security researchers and penetration testers to quickly gather a comprehensive list of URLs associated with a target domain.
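
Under the hood, each of these sources exposes a queryable index of previously observed URLs. As a rough illustration (not gau's actual implementation), the Wayback Machine's CDX API, the same endpoint shown in the waybackurls comparison below, can be queried directly for a domain's archived URLs:

# Ask the Wayback Machine for every original URL it has archived under example.com
curl "http://web.archive.org/cdx/search/cdx?url=example.com/*&fl=original&collapse=urlkey"

gau performs queries like this against several providers and merges the results into a single stream.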

Pros

  • Fast and efficient URL discovery from multiple sources
  • Easy to use with a simple command-line interface
  • Supports output in various formats (JSON, TXT)
  • Can be integrated into other tools and workflows

Cons

  • May produce a large number of results, requiring additional filtering
  • Depends on the availability and accuracy of third-party sources
  • Limited customization options for advanced users
  • Potential for false positives in URL discovery

Getting Started

To install and use gau:

# Install gau
go install github.com/lc/gau/v2/cmd/gau@latest

# Basic usage
gau example.com

# Output to a file
gau --o urls.txt example.com

# Use specific providers
gau --providers wayback,otx,commoncrawl example.com

# Get URLs from a list of domains, skipping image extensions
cat domains.txt | gau --blacklist png,jpg,gif --o urls.txt

Note: Ensure you have Go installed and your Go bin directory is in your system's PATH.
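
Because gau writes one URL per line to stdout, it composes naturally with standard command-line filters, which helps when the raw result set is large. A small pipeline sketch using only documented flags and common Unix tools:

# Collect URLs for several domains, skip static assets, and keep a sorted, unique list
cat domains.txt | gau --blacklist png,jpg,gif,css,woff | sort -u > urls.txt

# Narrow the list to endpoints that appear to take query parameters
grep '?' urls.txt > urls-with-params.txt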

Competitor Comparisons

waybackurls: Fetch all the URLs that the Wayback Machine knows about for a domain

Pros of waybackurls

  • Simpler and more focused tool, specifically for fetching URLs from the Wayback Machine
  • Lightweight and easy to use with minimal dependencies
  • Can be easily integrated into other tools or scripts

Cons of waybackurls

  • Limited to only the Wayback Machine as a data source
  • Fewer features and customization options compared to gau
  • May retrieve fewer unique URLs for a given domain

Code Comparison

waybackurls:

// Query the Wayback Machine's CDX index for every archived URL of the domain
resp, err := http.Get(fmt.Sprintf("http://web.archive.org/cdx/search/cdx?url=%s/*&output=json&fl=original&collapse=urlkey", domain))

gau:

// Fan in URLs from every configured provider into a single results channel
for _, source := range sources {
    urls, err := source.Fetch(ctx, domain, providers)
    if err != nil {
        return fmt.Errorf("error fetching URLs from %s: %w", source.Name(), err)
    }
    for url := range urls {
        results <- url
    }
}

Summary

waybackurls is a straightforward tool focused on retrieving URLs from the Wayback Machine, making it easy to use and integrate. However, it lacks the versatility and extensive features of gau, which can fetch URLs from multiple sources and offers more customization options. gau's code demonstrates its ability to handle multiple data sources, while waybackurls is specifically tailored for the Wayback Machine. Choose waybackurls for simplicity and quick Wayback Machine queries, or opt for gau when you need a more comprehensive URL gathering solution.


katana: A next-generation crawling and spidering framework

Pros of Katana

  • More comprehensive crawling capabilities, including JavaScript rendering and form submission
  • Faster crawling speed due to its concurrent design and Go implementation
  • Extensive configuration options for customizing the crawling process

Cons of Katana

  • Higher resource consumption compared to Gau
  • Steeper learning curve due to more complex configuration options
  • May be overkill for simple URL discovery tasks

Code Comparison

Gau usage:

gau example.com

Katana usage:

katana -u https://example.com
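
For a taste of those configuration options, and assuming Katana's documented depth and JavaScript-crawling flags (-d and -jc), a deeper crawl might look like:

katana -u https://example.com -d 3 -jc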

Both tools are designed for URL discovery, but Katana offers more advanced features and configuration options. While Gau is simpler and more straightforward to use, Katana provides a more comprehensive crawling solution at the cost of increased complexity and resource usage.

Gau is better suited for quick and lightweight URL discovery tasks, while Katana excels in scenarios requiring deep crawling, JavaScript rendering, and advanced configuration options. The choice between the two depends on the specific requirements of the project and the desired level of crawling depth and customization.

hakrawler: Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application

Pros of hakrawler

  • Written in Go, offering potentially better performance
  • Supports crawling JavaScript files for additional endpoints
  • Can follow redirects and handle cookies

Cons of hakrawler

  • Limited to crawling a single domain at a time
  • Doesn't offer as many data sources as gau
  • May require more manual configuration for complex scenarios

Code Comparison

hakrawler:

func crawl(url string, depth int) {
    if depth <= 0 {
        return
    }
    // Crawl logic here
}

gau:

func getUrls(domains []string, providers []string, client *http.Client) {
    // URL fetching logic here
}

Key Differences

  • hakrawler focuses on active crawling of websites, while gau retrieves URLs from various sources without crawling
  • gau can process multiple domains simultaneously, whereas hakrawler is designed for single-domain crawling
  • hakrawler provides more granular control over the crawling process, including depth and JavaScript parsing

Use Cases

hakrawler is better suited for:

  • In-depth exploration of a single website
  • Discovering hidden endpoints in JavaScript files
  • Scenarios requiring cookie handling and redirect following

gau is more appropriate for:

  • Quickly gathering URLs from multiple domains
  • Collecting historical URL data from various sources
  • Situations where active crawling is not feasible or desired
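
For those passive scenarios, every URL comes from archive providers rather than the live site. For example, using only documented gau flags:

# Gather historical URLs without sending a single request to the target itself
printf example.com | gau --providers wayback,otx,commoncrawl,urlscan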

Gospider - Fast web spider written in Go

Pros of gospider

  • More comprehensive crawling capabilities, including JavaScript rendering
  • Supports multiple output formats (JSON, Markdown, CSV)
  • Offers more customization options for crawling behavior

Cons of gospider

  • May be slower due to more extensive crawling features
  • Potentially more complex to use for simple URL extraction tasks
  • Requires more system resources for JavaScript rendering

Code Comparison

gospider:

crawler := gospider.NewCrawler(
    gospider.WithConcurrency(10),
    gospider.WithDepth(3),
    gospider.WithJSRendering(true),
)

gau:

client := gau.NewClient()
urls, err := client.Fetch(ctx, "example.com")

gospider offers more configuration options and advanced crawling features, while gau provides a simpler interface for quick URL extraction. gospider is better suited for comprehensive web crawling tasks, whereas gau excels at rapid URL discovery from various sources.

gospider's JavaScript rendering capability allows it to discover dynamically generated content, making it more thorough but potentially slower. gau focuses on speed and simplicity, making it ideal for quick reconnaissance or when dealing with large numbers of domains.

Choose gospider for in-depth web crawling and content analysis, and gau for fast URL enumeration and initial reconnaissance tasks.


README

getallurls (gau)


getallurls (gau) fetches known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, Common Crawl, and URLScan for any given domain. Inspired by Tomnomnom's waybackurls.


Usage:

Examples:

$ printf example.com | gau
$ cat domains.txt | gau --threads 5
$ gau example.com google.com
$ gau --o example-urls.txt example.com
$ gau --blacklist png,jpg,gif example.com

To display the help for the tool, use the -h flag:

$ gau -h
Flag          Description                                                  Example
--blacklist   list of extensions to skip                                   gau --blacklist ttf,woff,svg,png
--fc          list of status codes to filter                               gau --fc 404,302
--from        fetch URLs from date (format: YYYYMM)                        gau --from 202101
--ft          list of mime-types to filter                                 gau --ft text/plain
--fp          remove different parameters of the same endpoint             gau --fp
--json        output as JSON                                               gau --json
--mc          list of status codes to match                                gau --mc 200,500
--mt          list of mime-types to match                                  gau --mt text/html,application/json
--o           filename to write results to                                 gau --o out.txt
--providers   list of providers to use (wayback,commoncrawl,otx,urlscan)   gau --providers wayback
--proxy       HTTP proxy to use (socks5:// or http://)                     gau --proxy http://proxy.example.com:8080
--retries     retries for HTTP client                                      gau --retries 10
--timeout     timeout (in seconds) for HTTP client                         gau --timeout 60
--subs        include subdomains of target domain                          gau example.com --subs
--threads     number of workers to spawn                                   gau example.com --threads 5
--to          fetch URLs to date (format: YYYYMM)                          gau example.com --to 202101
--verbose     show verbose output                                          gau --verbose example.com
--version     show gau version                                             gau --version
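
Flags can be combined freely. For instance, to pull subdomain URLs from a subset of providers within a date window, skipping image extensions, and write them to a file:

gau --subs --providers wayback,otx --from 202101 --to 202112 --blacklist png,jpg,gif --o example-urls.txt example.com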

Configuration Files

gau automatically looks for a configuration file at $HOME/.gau.toml or %USERPROFILE%\.gau.toml. Options set there are used for every subsequent run of gau; any options provided via command-line flags override the configuration file.

An example configuration file can be found in the gau repository.
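
As a sketch only, a configuration might be created like this; the key names below are assumed to mirror the long flag names, so verify them against the example file in the repository:

# Write a minimal ~/.gau.toml (key names are assumptions, not verified against gau's parser)
cat > "$HOME/.gau.toml" <<'EOF'
threads = 5
verbose = false
providers = ["wayback","commoncrawl","otx","urlscan"]
blacklist = ["ttf","woff","svg","png"]
EOF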

Installation:

From source:

$ go install github.com/lc/gau/v2/cmd/gau@latest

From GitHub:

git clone https://github.com/lc/gau.git; \
cd gau/cmd/gau; \
go build; \
sudo mv gau /usr/local/bin/; \
gau --version;

From binary:

You can download the pre-built binaries from the releases page and then move them into your $PATH.

$ tar xvf gau_2.0.6_linux_amd64.tar.gz
$ mv gau /usr/bin/gau

From Docker:

You can run gau via docker like so:

docker run --rm sxcurity/gau:latest --help

You can also build a Docker image with the following command:

docker build -t gau .

and then run it:

docker run gau example.com

Bear in mind that piping commands (echo "example.com" | gau) will not work with the Docker container.
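
Instead, pass the domains as arguments, exactly as you would to the native binary:

docker run --rm sxcurity/gau:latest example.com google.com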

ohmyzsh note:

ohmyzsh's git plugin has an alias that maps gau to git add --update. This causes a name conflict between this tool's gau binary and the plugin's gau alias. A few workarounds can be found in this GitHub issue.
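
One workaround among those discussed in the issue is to remove the plugin's alias after oh-my-zsh loads, for example by adding this line to ~/.zshrc below the line that sources oh-my-zsh:

# Free up "gau" for the URL tool by dropping the git plugin's alias if it exists
unalias gau 2>/dev/null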

Useful?

Buy Me A Coffee

Donate to CommonCrawl
Donate to the InternetArchive