phiresky/ripgrep-all

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

Top Related Projects

  • ripgrep (47,483 stars): ripgrep recursively searches directories for a regex pattern while respecting your gitignore
  • The Silver Searcher: A code-searching tool similar to ack, but faster.
  • fd (33,285 stars): A simple, fast and user-friendly alternative to 'find'
  • fzf (63,665 stars): A command-line fuzzy finder
  • fzy (2,956 stars): A simple, fast fuzzy finder for the terminal
  • ugrep (2,567 stars): NEW ugrep 6.5: a more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more

Quick Overview

ripgrep-all (rga) is a command-line search tool that combines the functionality of ripgrep with the ability to search within various file types. It can search through PDFs, E-Books, Office documents, zip files, and more, making it a versatile tool for developers and system administrators.

Pros

  • Searches through a wide variety of file types, including archives and documents
  • Built on top of ripgrep, inheriting its speed and efficiency
  • Supports parallel searching for improved performance
  • Customizable and extensible through adapters

Cons

  • Requires additional dependencies for certain file types
  • May have a steeper learning curve compared to simpler search tools
  • Can be resource-intensive when searching large archives or complex file types
  • Not as widely adopted as some other search tools

Getting Started

To install ripgrep-all, follow these steps:

  1. Install Rust and Cargo if not already installed:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    
  2. Install ripgrep-all using Cargo:

    cargo install --locked ripgrep_all
    
  3. Ensure necessary dependencies are installed for file type support (e.g., poppler for PDF support).

  4. Basic usage:

    rga "search pattern" /path/to/search
    

For more detailed instructions and options, refer to the project's GitHub repository.

Competitor Comparisons

Pros of ripgrep

  • Faster and more efficient for searching plain text files
  • Simpler to use for basic text searches
  • Lighter resource usage, especially for large codebases

Cons of ripgrep

  • Limited to plain text by default; compressed files are only searched when -z/--search-zip is passed
  • Doesn't natively search inside archives or document formats (PDFs, eBooks, etc.)
  • Lacks advanced features for searching non-text content

Code comparison

ripgrep:

rg "pattern" /path/to/search

ripgrep-all:

rga "pattern" /path/to/search
rga "pattern" document.pdf

Summary

ripgrep is a fast and efficient tool for searching plain text files, making it ideal for developers working with source code. It's simpler to use and has lighter resource usage compared to ripgrep-all.

ripgrep-all, on the other hand, extends ripgrep's functionality to search within various file formats, including PDFs, eBooks, and compressed files. This makes it more versatile for users who need to search across different file types, but at the cost of increased complexity and potentially slower performance for plain text searches.

Choose ripgrep for fast, simple text searches in codebases, and ripgrep-all for more comprehensive searches across various file formats.

Pros of The Silver Searcher

  • Faster than traditional tools like grep and ack for searching large codebases
  • Automatically ignores files/directories specified in .gitignore
  • Simple and intuitive command-line interface

Cons of The Silver Searcher

  • Limited file type support compared to ripgrep-all
  • Lacks advanced features for searching within compressed files or documents
  • May not be as actively maintained as ripgrep-all

Code Comparison

The Silver Searcher:

ag "pattern" /path/to/search

ripgrep-all:

rga "pattern" /path/to/search

Both tools offer similar basic usage, but ripgrep-all also searches inside non-text files. PDFs are handled by the poppler adapter, which is enabled by default, so no extra flag is needed:

rga "pattern" document.pdf

Adapters that are disabled by default, such as mail, can be switched on with --rga-adapters=+mail. The Silver Searcher does not natively support searching inside PDF files.

While The Silver Searcher is a solid choice for fast code searching, ripgrep-all offers more extensive file type support and advanced features for searching within compressed and non-text files. The Silver Searcher may be preferable for simpler use cases or when working with primarily text-based codebases, while ripgrep-all shines in more diverse file environments.

Pros of fd

  • Faster and more user-friendly alternative to the find command
  • Colorized output and smart case sensitivity by default
  • Supports parallel command execution for improved performance

Cons of fd

  • Limited to filename searches, unlike ripgrep-all which searches file contents
  • Doesn't support searching within compressed files or documents
  • Less extensive file type support compared to ripgrep-all

Code Comparison

fd:

fd -e txt
fd '^foo.*bar$'
fd -e txt -x echo {}

ripgrep-all:

rga 'pattern' --rga-adapters=+mail
rga -i 'search term' --glob '*.pdf'
rga 'keyword' --rga-accurate

Key Differences

  • fd focuses on fast file name searches, while ripgrep-all specializes in searching file contents across various formats
  • ripgrep-all offers more extensive file type support, including PDFs, compressed files, and other document formats
  • fd provides simpler syntax for common file search tasks, whereas ripgrep-all offers more advanced search capabilities

Use Cases

fd is ideal for:

  • Quick file name searches
  • Simple file operations based on name patterns
  • Replacing basic find command usage

ripgrep-all excels at:

  • Searching content within various file types
  • Complex search patterns across multiple file formats
  • Extracting information from compressed or non-text files

Pros of fzf

  • More versatile: Can be used for general-purpose fuzzy finding beyond just file search
  • Highly customizable with extensive options and keybindings
  • Integrates well with various command-line tools and text editors

Cons of fzf

  • Primarily focused on interactive searching, less suited for programmatic use
  • Doesn't provide built-in content searching capabilities like ripgrep-all

Code comparison

fzf:

find * -type f | fzf > selected

ripgrep-all:

rga "search term" ./

Key differences

  • Purpose: fzf is a general-purpose fuzzy finder, while ripgrep-all is specialized for searching file contents across various formats
  • Functionality: ripgrep-all focuses on searching file contents, including PDFs and other non-text formats, while fzf excels at interactive filtering of input
  • Use cases: fzf is often used for command-line history searching and file navigation, while ripgrep-all is better suited for content-based searches in large codebases or document collections

Both tools have their strengths and can be complementary in a developer's toolkit, with fzf offering interactive filtering capabilities and ripgrep-all providing powerful content searching across multiple file types.

Pros of fzy

  • Lightweight and fast fuzzy finder
  • Simple to use and integrate into existing workflows
  • Written in C, making it highly portable and efficient

Cons of fzy

  • Limited to fuzzy finding functionality only
  • Doesn't support searching within file contents
  • Less feature-rich compared to ripgrep-all

Code comparison

fzy:

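#include <ctype.h>

/* Returns 1 if every character of needle occurs in haystack in order
   (case-insensitive), otherwise 0. */
int match(const char *needle, const char *haystack) {
    while (*needle && *haystack) {
        if (tolower((unsigned char)*needle) == tolower((unsigned char)*haystack++))
            needle++;
    }
    return *needle == '\0';
}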

ripgrep-all (a simplified illustration of the idea rather than the project's actual adapter code):

fn search_zip(path: &Path, matcher: &Matcher) -> Result<Vec<Match>, Error> {
    let file = File::open(path)?;
    let mut archive = zip::ZipArchive::new(file)?;
    let mut matches = Vec::new();
    // ... (additional code for searching within zip files)
    Ok(matches)
}

Summary

fzy is a lightweight fuzzy finder focused on simplicity and speed, while ripgrep-all is a more comprehensive search tool that can search within various file types. fzy excels in quick filename searches, whereas ripgrep-all offers broader functionality for searching file contents across multiple formats.

Pros of ugrep

  • Supports a wider range of file formats and compression types natively
  • Offers more advanced search options and regular expression features
  • Generally faster for searching large codebases or complex file structures

Cons of ugrep

  • Steeper learning curve due to more complex syntax and options
  • May be overkill for simple search tasks or smaller projects
  • Less integration with modern development workflows compared to ripgrep-all

Code Comparison

ugrep:

ugrep -Q 'pattern' --include='*.{cpp,h}' --exclude-dir='.git' .

ripgrep-all:

rga -i 'pattern' --type cpp --glob '!.git' .

Both tools offer powerful search capabilities, but ugrep provides more granular control over search parameters and file types. ripgrep-all, built on top of ripgrep, offers a simpler interface and better integration with modern development tools, making it more accessible for many users. The choice between the two depends on the specific needs of the project and the user's familiarity with advanced search techniques.

README

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

rga is a line-oriented search tool that allows you to look for a regex in a multitude of file types. rga wraps the awesome ripgrep and enables it to search in pdf, docx, sqlite, jpg, movie subtitles (mkv, mp4), etc.

For more detail, see this introductory blogpost: https://phiresky.github.io/blog/2019/rga--ripgrep-for-zip-targz-docx-odt-epub-jpg/

rga will recursively descend into archives and match text in every file type it knows.

Here is an example directory with different file types:

demo/
├── greeting.mkv
├── hello.odt
├── hello.sqlite3
└── somearchive.zip
    ├── dir
    │   ├── greeting.docx
    │   └── inner.tar.gz
    │       └── greeting.pdf
    └── greeting.epub
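
To try the recursive descent locally, a small nested archive can be built with standard tools. The sketch below is only an illustration: the helper name make_demo, the file name greeting.txt and its contents are made up here, and it covers just the tar.gz case from the tree above.

```shell
#!/bin/sh
# make_demo DIR: build DIR/dir/inner.tar.gz containing one small text file.
# The helper name, file name and contents are hypothetical, chosen for this sketch.
make_demo() {
    demo_dir="$1"
    mkdir -p "$demo_dir/dir" || return 1
    staging=$(mktemp -d) || return 1
    printf 'hello from inside the archive\n' > "$staging/greeting.txt"
    # -C switches into the staging directory so the archive holds a bare file name.
    tar -czf "$demo_dir/dir/inner.tar.gz" -C "$staging" greeting.txt
    status=$?
    rm -rf "$staging"
    return "$status"
}

# Usage (the second line needs rga installed):
#   make_demo ./demo
#   rga hello ./demo
```

With rga installed, the search in the usage comment should descend through the gzip and tar layers, since the adapter list covers both .gz and .tar.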

[Image: rga output]

Integration with fzf

[Image: rga-fzf]

See the wiki for instructions on integrating rga with fzf.

INSTALLATION

Linux x64, macOS and Windows binaries are available in GitHub Releases.

Linux

Arch Linux

pacman -S ripgrep-all

Nix

nix-env -iA nixpkgs.ripgrep-all

Debian-based

Download the rga binary and install the dependencies like this:

apt install ripgrep pandoc poppler-utils ffmpeg

If ripgrep is not included in your package sources, get it from here.

rga will search for all binaries it calls in $PATH and in the directory the rga binary itself is in.
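
Because rga looks up its helper programs at run time, it can be useful to check which of them are reachable before searching. A minimal sketch, assuming a POSIX shell; the function name missing_tools is made up for this example, and it only checks $PATH, not the directory of the rga binary.

```shell
# missing_tools TOOL...: print each named tool that cannot be found in PATH.
# Hypothetical helper for this sketch; it is not part of rga.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools named in this README: rg itself plus the helpers behind the adapters.
missing_tools rg pandoc pdftotext ffmpeg
```

An empty result means every listed tool was found; any name that is printed still needs to be installed for the corresponding adapter to work.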

Windows

Note that installing via chocolatey or scoop is the only supported download method. If you download the binary from releases manually, you will not get the dependencies (for example pdftotext from poppler).

If you get an error like VCRUNTIME140.DLL could not be found, you need to install vc_redist.x64.exe.

Chocolatey

choco install ripgrep-all

Scoop

scoop install rga

Homebrew/Linuxbrew

rga can be installed with Homebrew:

brew install rga

To install the dependencies, none of which is strictly necessary but all of which are very useful:

brew install pandoc poppler ffmpeg

MacPorts

rga can also be installed on macOS via MacPorts:

sudo port install ripgrep-all

Compile from source

rga should compile with stable Rust (v1.75.0+, check with rustc --version). To build it, run the following (or the equivalent in your OS):

~$ apt install build-essential pandoc poppler-utils ffmpeg ripgrep cargo
~$ cargo install --locked ripgrep_all
~$ rga --version    # this should work now

Available Adapters

rga works with adapters that adapt various file formats. It comes with a few adapters integrated:

rga --rga-list-adapters

You can also add custom adapters. See the wiki for more information.

Adapters:

  • pandoc: Uses pandoc to convert binary/unreadable text documents to plain markdown-like text
    Runs: pandoc --from= --to=plain --wrap=none --markdown-headings=atx
    Extensions: .epub, .odt, .docx, .fb2, .ipynb, .html, .htm

  • poppler: Uses pdftotext (from poppler-utils) to extract plain text from PDF files
    Runs: pdftotext - -
    Extensions: .pdf
    Mime Types: application/pdf

  • postprocpagebreaks: Adds the page number to each line for an input file that specifies page breaks as the ASCII page break character. Mainly to be used internally by the poppler adapter.
    Extensions: .asciipagebreaks

  • ffmpeg: Uses ffmpeg to extract video metadata/chapters, subtitles, lyrics, and other metadata
    Extensions: .mkv, .mp4, .avi, .mp3, .ogg, .flac, .webm

  • zip: Reads a zip file as a stream and recurses down into its contents
    Extensions: .zip, .jar
    Mime Types: application/zip

  • decompress: Reads a compressed file as a stream and runs a different extractor on the contents.
    Extensions: .als, .bz2, .gz, .tbz, .tbz2, .tgz, .xz, .zst
    Mime Types: application/gzip, application/x-bzip, application/x-xz, application/zstd

  • tar: Reads a tar file as a stream and recurses down into its contents
    Extensions: .tar

  • sqlite: Uses sqlite bindings to convert sqlite databases into a simple plain text format
    Extensions: .db, .db3, .sqlite, .sqlite3
    Mime Types: application/x-sqlite3

The following adapters are disabled by default, and can be enabled using '--rga-adapters=+foo,bar':

  • mail: Reads mailbox/mail files and runs extractors on the contents and attachments.
    Extensions: .mbox, .mbx, .eml
    Mime Types: application/mbox, message/rfc822

USAGE:

rga [RGA OPTIONS] [RG OPTIONS] PATTERN [PATH ...]

FLAGS:

--rga-accurate

Use more accurate but slower matching by mime type

By default, rga will match files using file extensions. Some programs, such as sqlite3, don't care about the file extension at all, so users sometimes use any or no extension at all. With this flag, rga will try to detect the mime type of input files using the magic bytes (similar to the `file` utility), and use that to choose the adapter. Detection is only done on the first 8KiB of the file, since we can't always seek on the input (in archives).
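
The idea behind magic-byte detection can be illustrated with a few well-known file signatures. This is only a sketch of the technique, not rga's detection code: the function name sniff_mime is hypothetical, it inspects just the first 16 bytes rather than 8KiB, and it knows four formats.

```shell
# sniff_mime FILE: guess a mime type from the leading bytes of FILE.
# Hypothetical helper; the signature table is deliberately tiny.
sniff_mime() {
    # Hex-encode the first 16 bytes, e.g. "255044462d..." for "%PDF-...".
    magic=$(head -c 16 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        25504446*) echo "application/pdf" ;;                              # "%PDF"
        504b0304*) echo "application/zip" ;;                              # "PK\003\004"
        1f8b*) echo "application/gzip" ;;                                 # gzip header
        53514c69746520666f726d61742033*) echo "application/x-sqlite3" ;;  # "SQLite format 3"
        *) echo "unknown" ;;
    esac
}
```

A file named notes.bin that starts with "SQLite format 3" would be reported as application/x-sqlite3 here, which is the kind of case the extension-based default cannot handle.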

--rga-no-cache

Disable caching of results

By default, rga caches the extracted text, if it is small enough, to a database in ${XDG_CACHE_HOME-~/.cache}/ripgrep-all on Linux, ~/Library/Caches/ripgrep-all on macOS, or C:\Users\username\AppData\Local\ripgrep-all on Windows. This way, repeated searches on the same set of files will be much faster. If you pass this flag, all caching will be disabled.
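
To see how much space the cache takes, the documented locations can be resolved in a few lines of shell. This is a sketch under assumptions: the function name rga_cache_dir is made up, the XDG variable is taken to be XDG_CACHE_HOME (the name used by the XDG Base Directory specification), an empty value is treated like an unset one, and Windows is left out.

```shell
# rga_cache_dir OS_NAME: print the cache directory this README documents for
# the given `uname -s` value. Hypothetical helper; Windows is not covered.
rga_cache_dir() {
    case "$1" in
        Darwin) printf '%s\n' "$HOME/Library/Caches/ripgrep-all" ;;
        *)      printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/ripgrep-all" ;;
    esac
}

# Show how much space the cache currently takes, if it exists.
cache_dir=$(rga_cache_dir "$(uname -s)")
if [ -d "$cache_dir" ]; then
    du -sh "$cache_dir"
fi
```

If --rga-cache-path was used to move the cache, the path printed here will not be the one in use.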

-h, --help

Prints help information

--rga-list-adapters

List all known adapters

--rga-print-config-schema

Print the JSON Schema of the configuration file

--rg-help

Show help for ripgrep itself

--rg-version

Show version of ripgrep itself

-V, --version

Prints version information

OPTIONS:

--rga-adapters=<adapters>...

Change which adapters to use and in which priority order (descending)

"foo,bar" means use only adapters foo and bar. "-bar,baz" means use all default adapters except for bar and baz. "+bar,baz" means use all default adapters and also bar and baz.

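The three spec forms can be modelled as a small pure function. The sketch below mirrors only the rules stated above; it is not rga's implementation, the name resolve_adapters is hypothetical, the default list is copied from the adapter section of this README, and priority order is not modelled.

```shell
# Default-enabled adapters, copied from the adapter list in this README.
DEFAULT_ADAPTERS="pandoc poppler postprocpagebreaks ffmpeg zip decompress tar sqlite"

# resolve_adapters SPEC: print the adapter names a spec would select.
# Hypothetical helper that mirrors the documented rules; it is not rga's code.
resolve_adapters() {
    spec="$1"
    case "$spec" in
        -*)  # "-bar,baz": every default adapter except the listed ones
            excluded=",${spec#-},"
            result=""
            for name in $DEFAULT_ADAPTERS; do
                case "$excluded" in
                    *",$name,"*) ;;                  # listed, so skip it
                    *) result="$result $name" ;;
                esac
            done
            ;;
        +*)  # "+bar,baz": every default adapter plus the listed ones
            result="$DEFAULT_ADAPTERS $(printf '%s' "${spec#+}" | tr ',' ' ')"
            ;;
        *)   # "foo,bar": only the listed adapters
            result=$(printf '%s' "$spec" | tr ',' ' ')
            ;;
    esac
    # Strip the leading space the exclusion loop leaves behind.
    printf '%s\n' "${result# }"
}
```

For example, resolve_adapters "-zip,tar" prints the default list without zip and tar, while resolve_adapters "+mail" prints the default list followed by mail.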
--rga-cache-compression-level=<compression-level>

ZSTD compression level to apply to adapter outputs before storing in cache db

Ranges from 1 - 22 [default: 12]

--rga-config-file=<config-file-path>

--rga-max-archive-recursion=<max-archive-recursion>

Maximum nestedness of archives to recurse into [default: 5]

--rga-cache-max-blob-len=<max-blob-len>

Max compressed size to cache

Longest byte length (after compression) to store in cache. Longer adapter outputs will not be cached and will be recomputed every time.

Allowed suffixes on command line: k M G [default: 2000000]

--rga-cache-path=<path>

Path to store cache db [default: /home/phire/.cache/ripgrep-all]

-h shows a concise overview, --help shows more detail and advanced options.

All other options not shown here are passed directly to rg, especially [PATTERN] and [PATH ...]
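
Since unknown options are handed to rg, rga flags and ripgrep flags can be mixed in one call. A small wrapper as a sketch: the function name search_ci is made up, --rga-no-cache comes from the flag list above, and -i and --context are ordinary ripgrep options.

```shell
# search_ci PATTERN [PATH]: case-insensitive search with caching turned off.
# --rga-no-cache is handled by rga; -i and --context are ripgrep options
# that rga passes through. Hypothetical wrapper for illustration.
search_ci() {
    if ! command -v rga >/dev/null 2>&1; then
        echo "rga not found in PATH" >&2
        return 127
    fi
    rga --rga-no-cache -i --context 2 "$1" "${2:-.}"
}

# Usage:
#   search_ci 'invoice number' ~/Documents
```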

Config

The config file location leverages the mechanisms defined by

Development

To enable debug logging:

export RUST_LOG=debug
export RUST_BACKTRACE=1

Also remember to disable caching with --rga-no-cache or clear the cache (~/Library/Caches/ripgrep-all on macOS, ~/.cache/ripgrep-all on other Unixes, or C:\Users\username\AppData\Local\ripgrep-all on Windows) to debug the adapters.
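
For a one-off debugging run, the variables can be scoped to a single command instead of being exported for the whole session. A minimal sketch; the wrapper name rga_debug is hypothetical.

```shell
# rga_debug ARGS...: run one rga search with debug logging and caching off.
# When rga is an external command, the variables apply to this call only.
rga_debug() {
    RUST_LOG=debug RUST_BACKTRACE=1 rga --rga-no-cache "$@"
}

# Usage:
#   rga_debug 'pattern' ./some/file.pdf
```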

Nix and Direnv

You can use the provided flake.nix to set up all build-time and run-time dependencies:

  1. Enable Flakes in your Nix configuration.
  2. Add direnv to your profile: nix profile install nixpkgs#direnv
  3. cd into the directory where you have cloned this repository.
  4. Allow use of .envrc: direnv allow
  5. After the dependencies have been installed, your shell will now have all of the necessary development dependencies.