Paperless vs Paperless

Detailed comparison of features, pros, cons, and usage

Paperless-ng (jonaswinkler/paperless-ng) is a more actively maintained, feature-rich fork of the original Paperless project (the-paperless-project/paperless). It offers improved document management and a modernized user interface, though new users may face a steeper learning curve.

Paperless

Scan, index, and archive all of your paper documents

7,879

Paperless Pros and Cons

Pros

Document Organization: Efficiently digitizes and organizes paper documents, making them easily searchable and accessible.
Open Source: Free to use and customize, with an active community contributing to its development.
Automation: Utilizes OCR and tagging systems to automatically categorize and index documents.
Self-Hosted: Offers complete control over data and privacy by allowing users to host the system on their own servers.
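
As a rough illustration of the automation point above, the sketch below assigns tags to OCR text by simple keyword matching. The `TagRule` type, the rule contents, and the matching strategy are assumptions made for illustration; they are not taken from the Paperless code base.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TagRule:
    """Hypothetical rule: apply `tag` when any keyword appears in the text."""
    tag: str
    keywords: tuple[str, ...]


def suggest_tags(text: str, rules: list[TagRule]) -> list[str]:
    """Return the tags whose keywords occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [
        rule.tag
        for rule in rules
        if any(keyword.lower() in lowered for keyword in rule.keywords)
    ]


# Example rules; the names and keywords are made up for this sketch.
RULES = [
    TagRule("invoice", ("invoice", "amount due")),
    TagRule("insurance", ("policy number",)),
]

if __name__ == "__main__":
    print(suggest_tags("INVOICE #1042: amount due in 30 days", RULES))
```

Substring matching keeps the sketch short; a real system would likely need word boundaries or fuzzy matching to cope with OCR errors.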

Cons

Setup Complexity: Initial setup and configuration can be challenging for users without technical expertise.
Hardware Requirements: Requires dedicated hardware or an always-on computer to run effectively.
Learning Curve: May take time to fully understand and utilize all features and capabilities.
Maintenance: Requires regular updates and maintenance to ensure optimal performance and security.

Paperless Code Examples

Document Consumption

This snippet shows how Paperless processes and consumes documents:

def consume_file(self, path, override_filename=None, override_title=None,
                 override_correspondent_id=None, override_document_type_id=None,
                 override_tag_ids=None, override_created=None, override_asn=None):
    # Delegate to try_consume_file, passing every override through unchanged.
    document = self.try_consume_file(
        path, override_filename, override_title,
        override_correspondent_id, override_document_type_id,
        override_tag_ids, override_created, override_asn)
    # Return the document on success; otherwise the method implicitly returns None.
    if document:
        return document
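
The override arguments follow a common pattern: use the caller's value when one is given, otherwise fall back to a value derived from the file. The sketch below illustrates that pattern in isolation. `DocumentMetadata`, `resolve_metadata`, and the fallback rules (file name as the default, file stem as the default title) are hypothetical and only cover three of the fields.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class DocumentMetadata:
    """Hypothetical container for a few fields the caller may override."""
    filename: str
    title: str
    tag_ids: list[int]


def resolve_metadata(path, override_filename=None, override_title=None,
                     override_tag_ids=None) -> DocumentMetadata:
    """Prefer explicit overrides; otherwise derive defaults from the path."""
    filename = override_filename or Path(path).name
    # Assumed default: the title is the effective file name without extension.
    title = override_title or Path(filename).stem
    tag_ids = list(override_tag_ids) if override_tag_ids is not None else []
    return DocumentMetadata(filename=filename, title=title, tag_ids=tag_ids)


if __name__ == "__main__":
    print(resolve_metadata("/tmp/scan_001.pdf"))
    print(resolve_metadata("/tmp/scan_001.pdf", override_title="Rent contract",
                           override_tag_ids=[3, 7]))
```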

OCR Processing

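This snippet shows the OCRmyPDF arguments Paperless assembles before running OCR on a document. The excerpt ends once the argument list is built, so the call that executes OCR is not shown:

def ocr(self, input_file, output_file, language, safe_fallback=True):
    import ocrmypdf

    # Command-line style options for OCRmyPDF.
    args = [
        "--language", language,              # Tesseract language code(s)
        "--output-type", "pdf",
        "--sidecar", output_file + ".txt",   # also write the recognized text to a .txt file
        "--skip-text",                       # leave pages that already contain text untouched
        "--deskew",                          # straighten skewed pages
        "--clean",                           # clean pages before OCR
        "--rotate-pages",                    # auto-correct page orientation
        input_file,
        output_file,
    ]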
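
One way to put an argument list like the one above to work is to hand it to the `ocrmypdf` command-line tool. The following is a minimal sketch under that assumption: `build_ocr_command` and `run_ocr` are hypothetical helper names, and nothing here is claimed to be how Paperless itself invokes OCR.

```python
from __future__ import annotations

import subprocess


def build_ocr_command(input_file: str, output_file: str, language: str) -> list[str]:
    """Assemble an ocrmypdf command line using the flags listed above."""
    return [
        "ocrmypdf",
        "--language", language,
        "--output-type", "pdf",
        "--sidecar", output_file + ".txt",
        "--skip-text",
        "--deskew",
        "--clean",
        "--rotate-pages",
        input_file,
        output_file,
    ]


def run_ocr(input_file: str, output_file: str, language: str = "eng") -> None:
    """Run ocrmypdf; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_ocr_command(input_file, output_file, language), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_ocr(...) to actually process a file.
    print(" ".join(build_ocr_command("scan.pdf", "scan_ocr.pdf", "eng")))
```

Running this requires the `ocrmypdf` executable on the PATH, and `--clean` additionally depends on the `unpaper` tool being installed.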

Document Searching

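This snippet shows how Paperless declares the fields of its document search index:

# Assumption: the base class and field types come from django-haystack,
# whose indexes module provides classes with these names and arguments.
from haystack.indexes import (
    CharField, DateTimeField, EdgeNgramField, MultiValueField, SearchIndex,
)

class DocumentIndex(SearchIndex):
    text = CharField(document=True, use_template=True)      # primary full-text field
    title = EdgeNgramField(model_attr="title", boost=1.5)   # title matches weigh more
    content = CharField(model_attr="content")
    created = DateTimeField(model_attr="created")
    modified = DateTimeField(model_attr="modified")
    tags = MultiValueField()
    correspondent = CharField(model_attr="correspondent__name", null=True)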
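
To show what a field boost does to ranking, here is a deliberately simplified scorer in plain Python. It counts term occurrences and weights title hits by the same 1.5 factor as the `title` field above. The scoring model and the `score` and `search` helpers are assumptions for illustration; real search backends use more elaborate relevance formulas.

```python
from __future__ import annotations

TITLE_BOOST = 1.5  # same factor as boost=1.5 on the title field


def score(query: str, title: str, content: str) -> float:
    """Sum term counts, weighting occurrences in the title by TITLE_BOOST."""
    title_words = title.lower().split()
    content_words = content.lower().split()
    total = 0.0
    for term in query.lower().split():
        total += TITLE_BOOST * title_words.count(term)
        total += content_words.count(term)
    return total


def search(query: str, documents: list[dict[str, str]]) -> list[str]:
    """Return the titles of matching documents, best match first."""
    scored = [(score(query, d["title"], d["content"]), d["title"]) for d in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [title for points, title in ranked if points > 0]


DOCUMENTS = [
    {"title": "Letter", "content": "about the water bill"},
    {"title": "Water bill", "content": "amount due"},
    {"title": "Recipe", "content": "flour and sugar"},
]

if __name__ == "__main__":
    print(search("bill", DOCUMENTS))
```

With these sample documents, a title hit outranks a single hit in the body, which is the effect a title boost is meant to produce.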

Paperless Quick Start

Installation

Clone the repository:
```
git clone https://github.com/the-paperless-project/paperless.git
cd paperless
```
Install dependencies:
```
pip install -r requirements.txt
```
Set up the database:
```
python3 manage.py migrate
```
Create a superuser:
```
python3 manage.py createsuperuser
```

Basic Usage

Start the Paperless server:
```
python3 manage.py runserver
```
Open your web browser and navigate to http://localhost:8000, then log in with the superuser credentials you created earlier.

To add documents, run the consumer. It watches a directory rather than importing a single file, so point it at the folder that holds your documents:
```
python3 manage.py document_consumer /path/to/your/consumption/directory
```
Refresh the web interface to see your newly added documents.

Example: Searching for Documents

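Once you've added some documents, you can search for them in the web interface. The command below assumes your installation provides a document_search management command; run python3 manage.py help to list the commands that are actually available:
```
python3 manage.py document_search "search term"
```
If the command is available, it returns a list of documents matching the search term.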

Top Related Projects

paperless-ng

5,389

A supercharged version of paperless: scan, index and archive all your physical documents

Pros of Paperless-ng

Modern web-based user interface with improved document management features
Enhanced search capabilities, including full-text search and custom fields
Automated document classification and tagging using machine learning

Cons of Paperless-ng

Potentially higher system requirements due to additional features
Steeper learning curve for users familiar with the original Paperless

Code Comparison

Paperless:

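# views.py
from django.views.generic import TemplateView

from .models import Document  # assumed location of the Document model

class IndexView(TemplateView):
    template_name = "index.html"

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        # Every document is listed, regardless of who is viewing the page.
        context['documents'] = Document.objects.all()
        return context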

Paperless-ng:

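# views.py
from django.contrib.auth.mixins import LoginRequiredMixin
from django.views.generic import TemplateView

from .models import Document  # assumed location of the Document model

class IndexView(LoginRequiredMixin, TemplateView):
    template_name = "index.html"

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        # Only documents owned by the logged-in user are listed.
        context['documents'] = Document.objects.filter(owner=self.request.user)
        return context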

The Paperless-ng code adds user authentication and filters documents by owner, improving security and multi-user support.
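
The effect of owner-based filtering can be shown without Django. In the sketch below, `Document` is a stand-in dataclass and `visible_documents` is a hypothetical helper that mirrors the `filter(owner=...)` call. It only illustrates the idea of per-user scoping and is not code from either project.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Stand-in for the Django model, reduced to the fields needed here."""
    title: str
    owner: str


def visible_documents(documents: list[Document], user: str) -> list[Document]:
    """Return only the documents that belong to the given user."""
    return [doc for doc in documents if doc.owner == user]


ARCHIVE = [
    Document("Tax return", "alice"),
    Document("Lease", "bob"),
    Document("Payslip", "alice"),
]

if __name__ == "__main__":
    for doc in visible_documents(ARCHIVE, "alice"):
        print(doc.title)
```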

paperless-ngx

28,602

A community-supported supercharged document management system: scan, index and archive all your documents

Pros of paperless-ngx

More active development and frequent updates
Enhanced user interface with modern design
Improved document processing and management features

Cons of paperless-ngx

Potential compatibility issues with older paperless setups
Steeper learning curve for users familiar with the original paperless

Code Comparison

paperless:

# Example from paperless
class Document(models.Model):
    correspondent = models.ForeignKey(
        Correspondent, blank=True, null=True, on_delete=models.SET_NULL
    )
    title = models.CharField(max_length=128, blank=True, db_index=True)

paperless-ngx:

# Example from paperless-ngx
class Document(models.Model):
    correspondent = models.ForeignKey(
        Correspondent, related_name="documents", on_delete=models.SET_NULL, null=True, blank=True
    )
    title = models.CharField(max_length=128, blank=True, db_index=True)

The code comparison shows only a minor difference in the model definitions: paperless-ngx adds a related_name parameter to the ForeignKey field, which names the reverse accessor from a correspondent to its documents.

Both projects provide a self-hosted document management system, but paperless-ngx is the more modern and actively maintained option. Users of the original paperless may need some adjustment, but the improved features and ongoing development make paperless-ngx attractive for anyone seeking a robust solution.