Quick Overview
dupeGuru is an open-source tool designed to find duplicate files on your computer. It uses a variety of algorithms to identify duplicates, even if they have different names or are located in different folders. The tool supports multiple file types and offers a user-friendly interface for managing and deleting duplicate files.
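To make "duplicates with different names or in different folders" concrete, here is a minimal sketch of content-based detection. It is not dupeGuru's actual implementation; the function names `file_digest` and `find_duplicates` are hypothetical, and the approach (group by size, then by SHA-256 digest) is one common technique chosen for illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root whose contents hash identically.

    Hypothetical helper for illustration only; names and locations of the
    files play no part in the comparison.
    """
    # Pass 1: bucket by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share their size with another file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[(size, file_digest(path))].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Grouping by size first keeps most files from ever being read, which matters on large file sets.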
Pros
- Cross-platform compatibility (Windows, macOS, Linux)
- Supports various file types, including music files, pictures, and regular files
- Customizable scanning options and filtering capabilities
- Graphical (Qt) interface for reviewing and managing results
Cons
- May require some technical knowledge for advanced features
- Performance can slow down with very large file sets
- Occasional false positives in certain edge cases
- Limited integration with cloud storage services
Getting Started
- Download the latest release for your operating system from the GitHub releases page.
- Install the application following the instructions for your platform.
- Launch dupeGuru and select the scan type (Standard, Music, or Picture).
- Add folders to scan by clicking the "+" button.
- Click "Scan" to start the duplicate search process.
- Review the results and select duplicates for deletion or other actions.
To build and run from source on Unix-like systems:
# Install from source
git clone https://github.com/arsenetar/dupeguru.git
cd dupeguru
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python build.py
# Run dupeGuru
python run.py
Note: The exact installation and running process may vary depending on your operating system and preferred installation method.
Competitor Comparisons
czkawka: Multi-functional app to find duplicates, empty folders, similar images, etc.
Pros of czkawka
- Written in Rust, potentially offering better performance and memory safety
- Covers more kinds of cleanup than duplicate detection alone, such as empty folders and similar images
- Offers a command-line interface in addition to GUI
Cons of czkawka
- Less mature project with fewer contributors
- May have a steeper learning curve for non-technical users
- GUI is less polished than dupeGuru's
Code Comparison
czkawka (Rust):
pub fn find_duplicates(
    directories: &[String],
    recursive: bool,
    excluded_directories: &[String],
    minimal_file_size: u64,
) -> Vec<Vec<FileEntry>> {
    // Implementation details
}
dupeguru (Python):
def get_dupe_groups(directories, callback=None, *args, **kwargs):
    scanner = Scanner()
    scanner.set_directories(directories)
    scanner.set_scan_type(kwargs.get('scan_type', ScanType.Contents))
    return scanner.get_dupe_groups(callback)
Both projects aim to find duplicate files, but czkawka's implementation in Rust may offer performance benefits. dupeguru's Python code appears more concise and potentially easier to read for those familiar with the language. czkawka's function signature suggests more granular control over the scanning process, while dupeguru's approach seems more high-level and abstracted.
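The parameters in the czkawka-style signature (recursion, excluded directories, minimum file size) can be illustrated with a short Python sketch of the file-collection step. This is an assumption-laden illustration, not code from either project; `collect_files` and its defaults are hypothetical.

```python
import os


def collect_files(directories, recursive=True, excluded_directories=(),
                  minimal_file_size=1):
    """Return sorted candidate paths for a duplicate scan.

    Hypothetical helper: mirrors the parameters of the signature above,
    not the behavior of czkawka or dupeGuru.
    """
    excluded = {os.path.abspath(d) for d in excluded_directories}
    found = []
    for directory in directories:
        for dirpath, dirnames, filenames in os.walk(directory):
            # Prune excluded subdirectories in place so os.walk skips them.
            dirnames[:] = [
                d for d in dirnames
                if os.path.abspath(os.path.join(dirpath, d)) not in excluded
            ]
            if not recursive:
                # Stop os.walk from descending below the top level.
                dirnames.clear()
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and os.path.getsize(path) >= minimal_file_size:
                    found.append(path)
    return sorted(found)
```

Filtering by size and directory before any content comparison is what gives the "granular control" mentioned above: excluded files are never opened at all.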
FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
Pros of fdupes
- Lightweight and fast, focusing solely on duplicate file detection
- Command-line interface allows for easy integration into scripts and automation
- Supports recursive directory scanning and hardlink creation
Cons of fdupes
- Limited file comparison options compared to dupeguru
- Lacks a graphical user interface, which some users may find less approachable
- Does not support content-aware duplicate detection for certain file types
Code Comparison
fdupes:
if (checktree(files, &files) == 0) {
    errormsg("no duplicates found.");
    exit(0);
}
dupeguru:
def get_dupe_groups(directories, *args, **kwargs):
    scanner = ScannerImpl(*args, **kwargs)
    return scanner.get_dupe_groups(directories)
Summary
fdupes is a lightweight, command-line tool focused on efficient duplicate file detection, while dupeguru offers a more comprehensive solution with a graphical interface and advanced file comparison options. fdupes is better suited for users comfortable with command-line operations and scripting, while dupeguru provides a more user-friendly experience with additional features for content-aware duplicate detection.
rmlint: Extremely fast tool to remove duplicates and other lint from your filesystem
Pros of rmlint
- Written in C, potentially offering better performance for large-scale operations
- Supports a wider range of duplicate detection methods, including content-based and metadata-based
- Provides a command-line interface, making it suitable for scripting and automation
Cons of rmlint
- Less user-friendly for non-technical users compared to dupeguru's GUI
- May require more setup and configuration for optimal use
- Limited cross-platform support (primarily Linux-focused)
Code Comparison
rmlint (C):
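#include <glib.h>      /* gint64 */
#include <sys/stat.h>  /* stat(), struct stat */

/* Return the size of the file at path in bytes, or -1 if stat() fails. */
gint64 rm_file_size(const char *path) {
    struct stat stat_buf;
    if(stat(path, &stat_buf) == -1) {
        return -1;
    }
    return stat_buf.st_size;
}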
dupeguru (Python):
import os

def get_file_size(path):
    # Return the file size in bytes, or 0 if the file cannot be read.
    try:
        return os.path.getsize(path)
    except OSError:
        return 0
Both functions retrieve a file's size. rmlint's C version calls stat() directly, which may reduce overhead in large-scale scans, and reports failure with -1. dupeguru's Python version is shorter, relies on os.path.getsize, and returns 0 on error.
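One practical difference between the two snippets is how failure is signaled: returning 0 makes an unreadable file look like an empty one, whereas -1 cannot be mistaken for a real size. A hypothetical Python variant (the name `get_file_size_or_none` is an assumption, not part of dupeGuru) that keeps the two cases apart might look like this:

```python
import os
from typing import Optional


def get_file_size_or_none(path: str) -> Optional[int]:
    """Return the size in bytes, or None if the file cannot be stat'ed.

    Hypothetical helper: None plays the role that -1 plays in the C
    snippet, so a genuinely empty file (size 0) stays distinguishable.
    """
    try:
        return os.path.getsize(path)
    except OSError:
        return None
```

A caller grouping files by size can then skip `None` results instead of lumping unreadable files in with empty ones.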
README
dupeGuru
dupeGuru is a cross-platform (Linux, macOS, Windows) GUI tool to find duplicate files in a system. It is written mostly in Python 3 and uses Qt for the UI.
Current status
The project is still looking for additional help, especially with:
- macOS maintenance: reproducing bugs, packaging verification.
- Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
- Translations: updating missing strings, transifex project at https://www.transifex.com/voltaicideas/dupeguru-1
- Documentation: keeping it up-to-date.
Contents of this folder
This folder contains the source for dupeGuru. Its documentation is in help, but is also available online in its built form. Here's how this source tree is organized:
- core: Contains the core logic code for dupeGuru. It's Python code.
- qt: UI code for the Qt toolkit. It's written in Python and uses PyQt.
- images: Images used by the different UI codebases.
- pkg: Skeleton files required to create different packages.
- help: Help document, written for Sphinx.
- locale: .po files for localization.
- hscommon: A collection of helpers used across HS applications.
How to build dupeGuru from source
Windows & macOS specific additional instructions
For Windows instructions, see the Windows Instructions.
For macOS instructions (Qt version), see the macOS Instructions.
Prerequisites
- Python 3.7+
- PyQt5
System Setup
In a Linux-based environment, the following system packages (or their equivalents) are needed to build:
- python3-pyqt5
- pyqt5-dev-tools (on some systems, see note)
- python3-venv (only if using a virtual environment)
- python3-dev
- build-essential
Note: On some Linux systems, pyrcc5 is not put on the PATH when python3-pyqt5 is installed, which causes issues with the resource files (and icons). These systems should have a corresponding pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with which pyrcc5. Debian-based systems need the extra package; Arch does not.
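The prerequisite checks described above (Python 3.7+ and pyrcc5 on the PATH) can also be scripted. The following is a minimal sketch, not part of dupeGuru's build tooling; `check_build_prerequisites` and its parameters are hypothetical names.

```python
import shutil
import sys


def check_build_prerequisites(min_python=(3, 7), which=shutil.which):
    """Return a list of human-readable problems; empty means all checks passed.

    Hypothetical helper: `which` is injectable so the lookup can be
    replaced in tests; by default it searches the real PATH.
    """
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    if which("pyrcc5") is None:
        # Equivalent to `which pyrcc5` printing nothing in a shell.
        problems.append("pyrcc5 not found on PATH; try installing pyqt5-dev-tools")
    return problems


if __name__ == "__main__":
    for problem in check_build_prerequisites():
        print(problem)
```

Running the script before the build prints one line per missing prerequisite and nothing when the environment looks ready.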
To create packages the following are also needed:
- python3-setuptools
- debhelper
Building with Make
dupeGuru comes with a makefile that can be used to build and run:
$ make && make run
Building without Make
$ cd <dupeGuru directory>
$ python3 -m venv --system-site-packages ./env
$ source ./env/bin/activate
$ pip install -r requirements.txt
$ python build.py
$ python run.py
Generating Debian/Ubuntu package
To generate packages, the extra requirements in requirements-extra.txt must be installed. The steps are as follows:
$ cd <dupeGuru directory>
$ python3 -m venv --system-site-packages ./env
$ source ./env/bin/activate
$ pip install -r requirements.txt -r requirements-extra.txt
$ python build.py --clean
$ python package.py
This can be made a one-liner (once in the directory) as:
$ bash -c "python3 -m venv --system-site-packages env && source env/bin/activate && pip install -r requirements.txt -r requirements-extra.txt && python build.py --clean && python package.py"
Running tests
The complete test suite is run with Tox 1.7+. If you have it installed system-wide, you don't even need to set up a virtualenv. Just cd into the root project folder and run tox.
If you don't have Tox system-wide, install it in your virtualenv with pip install tox and then run tox.
You can also run automated tests without Tox. Extra requirements for running tests are in requirements-extra.txt. So, you can do pip install -r requirements-extra.txt inside your virtualenv and then py.test core hscommon