digininja/CeWL

CeWL is a Custom Word List Generator

Top Related Projects

  • SecLists: SecLists is the security tester's companion. It's a collection of multiple types of lists used during security assessments, collected in one place. List types include usernames, passwords, URLs, sensitive data patterns, fuzzing payloads, web shells, and many more.
  • PayloadsAllTheThings: A list of useful payloads and bypass for Web Application Security and Pentest/CTF
  • fuzzdb: Dictionary of attack patterns and primitives for black-box application fault injection and resource discovery.
  • IntruderPayloads: A collection of Burpsuite Intruder payloads, BurpBounty payloads, fuzz lists, malicious file uploads and web pentesting methodologies and checklists.
  • Arjun: HTTP parameter discovery suite.

Quick Overview

CeWL (Custom Word List Generator) is a Ruby-based tool designed for creating custom wordlists from target websites. It spiders a given URL to a specified depth, following links within the same domain, and returns a list of words which can be used for password cracking or further analysis.
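
To make the depth-limited, same-site crawl concrete, here is a minimal sketch in Ruby. It is not CeWL's implementation: the `LINKS` table, the `crawl` method, and its `max_depth`/`offsite` parameters are hypothetical names chosen for illustration, and the "site" is an in-memory hash so the example runs without network access. It also assumes that depth counts link hops from the start URL.

```ruby
require 'set'
require 'uri'

# Hypothetical in-memory "site": each URL maps to the links found on that page.
LINKS = {
  'http://example.com/'      => ['http://example.com/about', 'http://other.test/'],
  'http://example.com/about' => ['http://example.com/team'],
  'http://example.com/team'  => ['http://example.com/jobs'],
  'http://example.com/jobs'  => []
}.freeze

# Breadth-first crawl that stops expanding pages at max_depth and, unless
# offsite is true, ignores links whose host differs from the start URL's host.
def crawl(start_url, max_depth:, offsite: false)
  start_host = URI(start_url).host
  visited = Set.new([start_url])
  queue = [[start_url, 0]]

  until queue.empty?
    url, depth = queue.shift
    next if depth >= max_depth

    LINKS.fetch(url, []).each do |link|
      next if !offsite && URI(link).host != start_host
      next unless visited.add?(link) # add? returns nil if the link was already seen

      queue << [link, depth + 1]
    end
  end

  visited.to_a
end

puts crawl('http://example.com/', max_depth: 2)
```

With `max_depth: 2` the sketch visits the start page, `/about`, and `/team`, but never reaches `/jobs` or the external host. Passing `offsite: true` lets it follow the external link as well.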

Pros

  • Customizable depth and word length settings
  • Ability to include meta data in the wordlist
  • Option to follow external links
  • Can write the wordlist, meta data, and email addresses to separate files

Cons

  • May be slow on large websites or with deep crawling
  • Potential for generating large wordlists that require significant storage
  • Limited to web-based content crawling
  • Requires Ruby environment to run

Getting Started

  1. Install Ruby on your system if not already present.
  2. Clone the repository:
    git clone https://github.com/digininja/CeWL.git
    
  3. Navigate to the CeWL directory:
    cd CeWL
    
  4. Install required gems:
    bundle install
    
  5. Run CeWL:
    ./cewl.rb http://example.com -d 2 -m 5 -w wordlist.txt
    
    This command crawls example.com to a depth of 2, includes words with a minimum length of 5, and saves the output to wordlist.txt.
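
If you want to drive the same run from a Ruby script, one possible approach is to build the argument list in code. The `cewl_command` helper below is hypothetical (it is not shipped with CeWL) and only uses the `-d`, `-m`, and `-w` flags from CeWL's usage output. The path `./cewl.rb` assumes you are in the cloned directory.

```ruby
# Hypothetical helper: assemble the argument list for a CeWL run.
# Only flags listed in CeWL's usage output are used (-d, -m, -w).
def cewl_command(url, depth: 2, min_word_length: 3, output: nil, script: './cewl.rb')
  args = [script, '-d', depth.to_s, '-m', min_word_length.to_s]
  args += ['-w', output] if output
  args << url
end

cmd = cewl_command('http://example.com', depth: 2, min_word_length: 5, output: 'wordlist.txt')
puts cmd.join(' ')

# To actually run it (needs the CeWL checkout and its gems installed):
#   system(*cmd)
```

Passing the arguments to `system` as separate strings, rather than as one interpolated command line, avoids shell quoting problems when the URL or file name contains special characters.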

Competitor Comparisons

SecLists is the security tester's companion. It's a collection of multiple types of lists used during security assessments, collected in one place. List types include usernames, passwords, URLs, sensitive data patterns, fuzzing payloads, web shells, and many more.

Pros of SecLists

  • Comprehensive collection of multiple types of lists for various security testing scenarios
  • Regularly updated with community contributions
  • Well-organized directory structure for easy navigation

Cons of SecLists

  • Large repository size may be overwhelming for specific use cases
  • Requires manual filtering to find relevant lists for specific tasks
  • Some lists may contain outdated or less relevant entries

Code Comparison

SecLists doesn't contain executable code, as it's primarily a collection of wordlists. CeWL, on the other hand, is a Ruby script for generating custom wordlists. Here's a simplified illustration of the kind of option parsing CeWL performs:

require 'getoptlong'

# Declare the command-line switches as [long name, short name, argument type]
def parse_options(options)
  opts = GetoptLong.new(
    ['--help', '-h', GetoptLong::NO_ARGUMENT],
    ['--depth', '-d', GetoptLong::REQUIRED_ARGUMENT],
    ['--min_word_length', '-m', GetoptLong::REQUIRED_ARGUMENT],
    ['--offsite', '-o', GetoptLong::NO_ARGUMENT],
    ['--write', '-w', GetoptLong::REQUIRED_ARGUMENT]
  )
  # ... (additional code)
end

This code snippet demonstrates CeWL's option parsing functionality, which is not applicable to SecLists as it's a static collection of lists rather than an executable tool.

A list of useful payloads and bypass for Web Application Security and Pentest/CTF

Pros of PayloadsAllTheThings

  • Comprehensive collection of payloads for various security testing scenarios
  • Regularly updated with new techniques and payloads
  • Well-organized structure, making it easy to find specific payloads

Cons of PayloadsAllTheThings

  • May be overwhelming for beginners due to the vast amount of information
  • Lacks specific focus on custom wordlist generation
  • Requires manual selection and implementation of payloads

Code Comparison

While a direct code comparison isn't particularly relevant due to the different nature of these projects, here's a brief example of how they might be used:

CeWL:

./cewl.rb http://example.com -w wordlist.txt -d 2 -m 5

PayloadsAllTheThings:

# No direct execution; users typically copy payloads from the repository
# Example of using an SQL injection payload:
' UNION SELECT username, password FROM users--

CeWL is a command-line tool for generating custom wordlists, while PayloadsAllTheThings is a repository of various payloads for different security testing scenarios. CeWL is more focused and automated for its specific task, while PayloadsAllTheThings offers a wider range of options but requires more manual selection and implementation.

Dictionary of attack patterns and primitives for black-box application fault injection and resource discovery.

Pros of fuzzdb

  • Comprehensive collection of attack patterns and payloads for various security testing scenarios
  • Regularly updated with new fuzzing data and attack vectors
  • Plain-text data files that can be used from any tool, language, or platform

Cons of fuzzdb

  • Larger repository size, which may be overwhelming for beginners
  • Requires more manual effort to integrate into existing security tools
  • Less focused on specific use cases compared to CeWL's custom wordlist generation

Code Comparison

CeWL (Ruby):

def parse_page(url, depth)
  # Stop once the configured crawl depth has been exceeded
  return if depth > @depth_limit

  page = @spider.get_page(url)
  # ... (word extraction logic)
end

fuzzdb (Various file formats):

# Example from fuzzdb/attack/sql-injection/detect/
UNION SELECT
' UNION SELECT
 UNION ALL SELECT
' UNION ALL SELECT
) UNION SELECT

Note: fuzzdb primarily consists of data files rather than executable code, so a direct code comparison is less applicable.

A collection of Burpsuite Intruder payloads, BurpBounty payloads, fuzz lists, malicious file uploads and web pentesting methodologies and checklists.

Pros of IntruderPayloads

  • Broader scope: Includes various payload types for different attack vectors
  • Regularly updated with new payloads and techniques
  • Organized into categories for easier navigation and selection

Cons of IntruderPayloads

  • Less focused on specific tasks compared to CeWL's word list generation
  • May require more manual filtering to find relevant payloads
  • Larger repository size, potentially overwhelming for beginners

Code Comparison

CeWL (Ruby):

def parse_page(page, url)
  page.body.downcase.scan(@regexp) do |word|
    @words[word.to_s.strip] = true unless word.to_s.strip.empty?
  end
end

IntruderPayloads (Various formats, example in txt):

<script>alert(1)</script>
<img src=x onerror=alert(1)>
"><script>alert(1)</script>

CeWL focuses on extracting words from web pages, while IntruderPayloads provides ready-to-use attack payloads. CeWL's code demonstrates word extraction, whereas IntruderPayloads typically consists of payload lists in various formats.

HTTP parameter discovery suite.

Pros of Arjun

  • Specialized in parameter discovery for web applications
  • Supports multiple request types (GET, POST, JSON, XML)
  • Includes a large built-in parameter wordlist

Cons of Arjun

  • Focused solely on parameter discovery, less versatile than CeWL
  • May generate more network traffic due to its brute-force approach
  • Potentially slower for large-scale scans compared to CeWL's word extraction

Code Comparison

CeWL (Ruby):

def parse_page(page, url)
  page.body.downcase.scan(/[a-z0-9\-_']{#{@min_word_length},}/).each do |word|
    @words[word] = @words.fetch(word, 0) + 1
  end
end

Arjun (Python):

def generate(self):
    for name in self.wordlist:
        yield name
    if self.include_placeholders:
        for placeholder in placeholders:
            yield placeholder

Both tools serve different purposes: CeWL focuses on creating custom wordlists from web content, while Arjun specializes in discovering hidden parameters in web applications. CeWL is more versatile for general wordlist generation, whereas Arjun excels in targeted parameter discovery for web security testing.

README

CeWL - Custom Word List generator

Copyright(c) 2024, Robin Wood robin@digi.ninja

Based on a discussion on PaulDotCom (episode 129) about creating custom word lists by spidering a target's website and collecting unique words, I decided to write CeWL, the Custom Word List generator. CeWL is a Ruby app which spiders a given URL to a specified depth, optionally following external links, and returns a list of words which can then be used for password crackers such as John the Ripper.

By default, CeWL sticks to just the site you have specified and will go to a depth of 2 links; this behaviour can be changed by passing arguments. Be careful if setting a large depth and allowing it to go offsite, as you could end up drifting on to a lot of other domains. All words of three characters and over are output to stdout. This length can be increased and the words can be written to a file rather than the screen so the app can be automated.
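
As a rough illustration of the minimum-length filter, the sketch below pulls unique words out of an HTML string using only the Ruby standard library. It is not CeWL's code: `extract_words` and its `min_length` parameter are hypothetical, and the regex-based tag stripping is a deliberate simplification (CeWL itself depends on nokogiri for HTML handling).

```ruby
require 'set'

# Hypothetical helper, not part of CeWL: collect unique words from an HTML string.
def extract_words(html, min_length: 3)
  # Crude tag stripping for illustration; a real crawler would use an HTML parser.
  text = html.gsub(/<script.*?<\/script>/mi, ' ').gsub(/<[^>]+>/, ' ')

  words = Set.new
  text.scan(/[[:alpha:]]+/) do |word|
    words << word if word.length >= min_length
  end
  words.to_a.sort
end

html = '<html><body><h1>Acme Widgets</h1><p>We sell widgets to Acme fans.</p></body></html>'
puts extract_words(html)                # default minimum length of 3
puts extract_words(html, min_length: 5) # stricter filter
```

Note that this sketch keeps case, so `Widgets` and `widgets` count as two distinct words. Whether to fold case is a design choice that depends on how the wordlist will be used.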

CeWL also has an associated command line app, FAB (Files Already Bagged), which uses the same meta data extraction techniques to create author/creator lists from already downloaded files.

For anyone running CeWL with Ruby 2.7, you might get some warnings in the style:

.../ruby-2.7.0/gems/mime-types-3.2.2/lib/mime/types/logger.rb:30: warning: `_1' is reserved for numbered parameter; consider another name

This is due to a new feature introduced in 2.7 which conflicts with one line of code in the logger script from the mime-types gem. There is an update for it in the gem's repo so hopefully that will be released soon. Till then, as far as I can tell, the warning does not affect CeWL in any way. If, for aesthetics, you want to hide the warning, you can run the script as follows:

ruby -W0 ./cewl.rb

Homepage: https://digi.ninja/projects/cewl.php

GitHub: https://github.com/digininja/CeWL

Pronunciation

Seeing as I was asked, CeWL is pronounced "cool".

Installation

CeWL needs the following gems to be installed:

  • mime
  • mime-types
  • mini_exiftool
  • nokogiri
  • rubyzip
  • spider

The easiest way to install these gems is with Bundler:

gem install bundler
bundle install

Alternatively, you can install them manually with:

gem install xxx

The mini_exiftool gem also requires the exiftool application to be installed.

Assuming you cloned the GitHub repo, the script should be executable by default, but if not, you can make it executable with:

chmod u+x ./cewl.rb

The project page on my site gives some tips on solving common problems people have encountered while running CeWL - https://digi.ninja/projects/cewl.php

Usage

./cewl.rb

CeWL 5.5.2 (Grouping) Robin Wood (robin@digi.ninja) (https://digi.ninja/)
Usage: cewl [OPTIONS] ... <url>

    OPTIONS:
	-h, --help: Show help.
	-k, --keep: Keep the downloaded file.
	-d <x>,--depth <x>: Depth to spider to, default 2.
	-m, --min_word_length: Minimum word length, default 3.
	-o, --offsite: Let the spider visit other sites.
	-w, --write: Write the output to the file.
	-u, --ua <agent>: User agent to send.
	-n, --no-words: Don't output the wordlist.
	-a, --meta: include meta data.
	--meta_file file: Output file for meta data.
	-e, --email: Include email addresses.
	--email_file <file>: Output file for email addresses.
	--meta-temp-dir <dir>: The temporary directory used by exiftool when parsing files, default /tmp.
	-c, --count: Show the count for each word found.
	-v, --verbose: Verbose.
	--debug: Extra debug information.

	Authentication
	--auth_type: Digest or basic.
	--auth_user: Authentication username.
	--auth_pass: Authentication password.

	Proxy Support
	--proxy_host: Proxy host.
	--proxy_port: Proxy port, default 8080.
	--proxy_username: Username for proxy, if required.
	--proxy_password: Password for proxy, if required.

	Headers
	--header, -H: In format name:value - can pass multiple.

    <url>: The site to spider.
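
The `name:value` format accepted by `--header` can be illustrated with a short Ruby sketch. The `parse_headers` helper is hypothetical and is not taken from CeWL, whose own handling may differ. It assumes each header arrives as one string and splits on the first colon only.

```ruby
# Hypothetical helper: turn repeated "name:value" strings into a hash.
def parse_headers(raw_headers)
  raw_headers.each_with_object({}) do |raw, headers|
    # Split on the first colon only, so values such as URLs keep their colons.
    name, value = raw.split(':', 2)
    raise ArgumentError, "expected name:value, got #{raw.inspect}" if value.nil? || name.strip.empty?

    headers[name.strip] = value.strip
  end
end

p parse_headers(['X-Test: 1', 'Referer:http://example.com/'])
```

Limiting the split to two parts matters for values that themselves contain colons, such as the `Referer` URL in the example.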

Running CeWL in a Docker container

To quickly use CeWL with Docker, you can use the official ghcr.io/digininja/cewl image:

docker run -it --rm -v "${PWD}:/host" ghcr.io/digininja/cewl [OPTIONS] ... <url>

You can also build it locally:

docker build -t cewl .
docker run -it --rm -v "${PWD}:/host" cewl [OPTIONS] ... <url>

I am going to stress here, I am not going to be offering any support for this. The work was done by @loris-intergalactique who has offered to field any questions on it and give support. I don't use or know Docker, so please, don't ask me for help.

Licence

This project is released under the Creative Commons Attribution-Share Alike 2.0 UK: England & Wales

http://creativecommons.org/licenses/by-sa/2.0/uk/

Alternatively, you can use GPL-3+ instead of the original license.

http://opensource.org/licenses/GPL-3.0