arsenetar/dupeguru

Find duplicate files

Quick Overview

dupeGuru is an open-source tool designed to find duplicate files on your computer. It uses a variety of algorithms to identify duplicates, even if they have different names or are located in different folders. The tool supports multiple file types and offers a user-friendly interface for managing and deleting duplicate files.
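As a rough illustration of how duplicates can be found regardless of name or folder, the sketch below groups files by a hash of their contents. It is not dupeGuru's actual implementation; the function names `file_digest` and `find_duplicates_by_content` are hypothetical, and SHA-256 is simply an assumed choice of hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates_by_content(root):
    """Group regular files under root by content hash; keep groups of 2+."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # A hash seen only once has no duplicate, so drop those groups.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates_by_content("/some/folder")` would return a list of groups, each holding the paths whose contents hash identically. File names play no part in the comparison, which is why renamed copies still match.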

Pros

  • Cross-platform compatibility (Windows, macOS, Linux)
  • Supports various file types, including music files, pictures, and regular files
  • Customizable scanning options and filtering capabilities
  • Graphical interface built with Qt

Cons

  • May require some technical knowledge for advanced features
  • Performance can slow down with very large file sets
  • Occasional false positives in certain edge cases
  • Limited integration with cloud storage services

Getting Started

  1. Download the latest release for your operating system from the GitHub releases page.
  2. Install the application following the instructions for your platform.
  3. Launch dupeGuru and select the scan type (Standard, Music, or Picture).
  4. Add folders to scan by clicking the "+" button.
  5. Click "Scan" to start the duplicate search process.
  6. Review the results and select duplicates for deletion or other actions.

To build and run dupeGuru from source on Unix-like systems:

# Install from source
git clone https://github.com/arsenetar/dupeguru.git
cd dupeguru
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python build.py

# Run dupeGuru
python run.py

Note: The exact installation and running process may vary depending on your operating system and preferred installation method.

Competitor Comparisons

czkawka

Multi functional app to find duplicates, empty folders, similar images etc.

Pros of czkawka

  • Written in Rust, potentially offering better performance and memory safety
  • Covers more kinds of filesystem clutter beyond duplicate files, such as empty folders and similar images
  • Offers a command-line interface in addition to GUI

Cons of czkawka

  • Less mature project with fewer contributors
  • May have a steeper learning curve for non-technical users
  • GUI is less polished than dupeguru's

Code Comparison

czkawka (Rust):

pub fn find_duplicates(
    directories: &[String],
    recursive: bool,
    excluded_directories: &[String],
    minimal_file_size: u64,
) -> Vec<Vec<FileEntry>> {
    // Implementation details
}

dupeguru (Python):

def get_dupe_groups(directories, callback=None, *args, **kwargs):
    scanner = Scanner()
    scanner.set_directories(directories)
    scanner.set_scan_type(kwargs.get('scan_type', ScanType.Contents))
    return scanner.get_dupe_groups(callback)

Both projects aim to find duplicate files, but czkawka's implementation in Rust may offer performance benefits. dupeguru's Python code appears more concise and potentially easier to read for those familiar with the language. czkawka's function signature suggests more granular control over the scanning process, while dupeguru's approach seems more high-level and abstracted.

fdupes

FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

Pros of fdupes

  • Lightweight and fast, focusing solely on duplicate file detection
  • Command-line interface allows for easy integration into scripts and automation
  • Supports recursive directory scanning and hardlink creation

Cons of fdupes

  • Limited file comparison options compared to dupeguru
  • Lacks a graphical user interface, which may be less user-friendly for some
  • Does not support content-aware duplicate detection for certain file types

Code Comparison

fdupes:

if (checktree(files, &files) == 0) {
  errormsg("no duplicates found.");
  exit(0);
}

dupeguru:

def get_dupe_groups(directories, *args, **kwargs):
    scanner = ScannerImpl(*args, **kwargs)
    return scanner.get_dupe_groups(directories)

Summary

fdupes is a lightweight, command-line tool focused on efficient duplicate file detection, while dupeguru offers a more comprehensive solution with a graphical interface and advanced file comparison options. fdupes is better suited for users comfortable with command-line operations and scripting, while dupeguru provides a more user-friendly experience with additional features for content-aware duplicate detection.

rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem

Pros of rmlint

  • Written in C, potentially offering better performance for large-scale operations
  • Supports a wider range of duplicate detection methods, including content-based and metadata-based
  • Provides a command-line interface, making it suitable for scripting and automation

Cons of rmlint

  • Less user-friendly for non-technical users compared to dupeguru's GUI
  • May require more setup and configuration for optimal use
  • Limited cross-platform support (primarily Linux-focused)

Code Comparison

rmlint (C):

gint64 rm_file_size(const char *path) {
    struct stat stat_buf;
    if(stat(path, &stat_buf) == -1) {
        return -1;
    }
    return stat_buf.st_size;
}

dupeguru (Python):

import os

def get_file_size(path):
    # Return the size in bytes, or 0 if the file cannot be inspected.
    try:
        return os.path.getsize(path)
    except OSError:
        return 0

Both functions aim to retrieve file sizes, but rmlint's implementation in C offers more direct system call usage, potentially providing better performance for large-scale operations. However, dupeguru's Python implementation is more concise and leverages high-level language features for simplicity.
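File sizes matter to duplicate finders because two files of different sizes cannot be byte-identical, so a size lookup can serve as a cheap first pass before any content comparison. The following is a minimal sketch of that idea, not code from rmlint or dupeguru; `safe_file_size` and `candidate_groups_by_size` are hypothetical names.

```python
import os
from collections import defaultdict


def safe_file_size(path):
    """Return the size in bytes, or None if the file cannot be inspected."""
    try:
        return os.path.getsize(path)
    except OSError:
        return None


def candidate_groups_by_size(paths):
    """Group paths by size; only same-size files can be byte-identical."""
    by_size = defaultdict(list)
    for path in paths:
        size = safe_file_size(path)
        if size is not None:
            by_size[size].append(path)
    # Sizes that occur once cannot have a duplicate, so they are dropped.
    return {size: group for size, group in by_size.items() if len(group) > 1}
```

This sketch returns `None` for unreadable files rather than 0, so that a missing file is not mistaken for an empty one and grouped with genuinely empty files.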

python-mss

An ultra fast cross-platform multiple screenshots module in pure Python using ctypes.

Pros of python-mss

  • Cross-platform support for taking screenshots
  • Lightweight and fast performance
  • Simple API for easy integration

Cons of python-mss

  • Limited to screenshot functionality only
  • Lacks advanced image processing features
  • No built-in duplicate detection capabilities

Code Comparison

python-mss:

import mss

with mss.mss() as sct:
    sct.shot()

dupeguru:

from dupeguru.core import Engine

engine = Engine()
engine.add_directory("/path/to/directory")
engine.scan()

Summary

python-mss is a focused tool for capturing screenshots across different platforms, offering simplicity and speed. dupeguru, on the other hand, is a comprehensive duplicate file finder with advanced features for detecting and managing duplicate files.

While python-mss excels in its specific task, it lacks the broader functionality of dupeguru, which includes file analysis, duplicate detection, and management tools. The choice between the two depends on the specific needs of the project: screenshot capture vs. duplicate file management.

README

dupeGuru

dupeGuru is a cross-platform (Linux, macOS, Windows) GUI tool to find duplicate files in a system. It is written mostly in Python 3 and uses Qt for the UI.

Current status

The project is still looking for additional help, especially with regard to:

  • macOS maintenance: reproducing bugs, packaging verification.
  • Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
  • Translations: updating missing strings; the Transifex project is at https://www.transifex.com/voltaicideas/dupeguru-1
  • Documentation: keeping it up-to-date.

Contents of this folder

This folder contains the source for dupeGuru. Its documentation is in help, but is also available online in its built form. Here's how this source tree is organized:

  • core: Contains the core logic code for dupeGuru. It's Python code.
  • qt: UI code for the Qt toolkit. It's written in Python and uses PyQt.
  • images: Images used by the different UI codebases.
  • pkg: Skeleton files required to create different packages
  • help: Help document, written for Sphinx.
  • locale: .po files for localization.
  • hscommon: A collection of helpers used across HS applications.

How to build dupeGuru from source

Windows & macOS specific additional instructions

For Windows instructions, see the Windows Instructions.

For macOS instructions (Qt version), see the macOS Instructions.

Prerequisites

System Setup

When building in a Linux-based environment, the following system packages (or their equivalents) are needed:

  • python3-pyqt5
  • pyqt5-dev-tools (on some systems, see note)
  • python3-venv (only if using a virtual environment)
  • python3-dev
  • build-essential

Note: On some Linux systems, pyrcc5 is not put on the path when python3-pyqt5 is installed, which causes issues with the resource files (and icons). These systems should have a corresponding pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with which pyrcc5. Debian-based systems need the extra package; Arch does not.
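The same check can be done from Python, for example as part of a setup script. The sketch below relies only on the standard library's `shutil.which`; the helper name `find_tool` is hypothetical and is not part of dupeGuru's build scripts.

```python
import shutil


def find_tool(name):
    """Return the absolute path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_tool("pyrcc5")
    if location is None:
        # Mirrors the note above: the dev-tools package may be missing.
        print("pyrcc5 not found; the pyqt5-dev-tools package may be needed")
    else:
        print(f"pyrcc5 found at {location}")
```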

To create packages the following are also needed:

  • python3-setuptools
  • debhelper

Building with Make

dupeGuru comes with a makefile that can be used to build and run:

$ make && make run

Building without Make

$ cd <dupeGuru directory>
$ python3 -m venv --system-site-packages ./env
$ source ./env/bin/activate
$ pip install -r requirements.txt
$ python build.py
$ python run.py

Generating Debian/Ubuntu package

To generate packages, the extra requirements in requirements-extra.txt must be installed. The steps are as follows:

$ cd <dupeGuru directory>
$ python3 -m venv --system-site-packages ./env
$ source ./env/bin/activate
$ pip install -r requirements.txt -r requirements-extra.txt
$ python build.py --clean
$ python package.py

This can be made a one-liner (once in the directory) as:

$ bash -c "python3 -m venv --system-site-packages env && source env/bin/activate && pip install -r requirements.txt -r requirements-extra.txt && python build.py --clean && python package.py"

Running tests

The complete test suite is run with Tox 1.7+. If you have it installed system-wide, you don't even need to set up a virtualenv. Just cd into the root project folder and run tox.

If you don't have Tox system-wide, install it in your virtualenv with pip install tox and then run tox.

You can also run automated tests without Tox. Extra requirements for running tests are in requirements-extra.txt. So, you can do pip install -r requirements-extra.txt inside your virtualenv and then py.test core hscommon