
mozilla/bleach

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes



Quick Overview

Bleach is a Python library for sanitizing and cleaning HTML input. It helps prevent XSS attacks by stripping out potentially malicious content while allowing safe HTML tags and attributes. Bleach is designed to be flexible and customizable, making it suitable for a wide range of web applications.

Pros

  • Highly customizable, allowing fine-grained control over allowed tags and attributes
  • Widely used in production environments (note: the project has been deprecated since 2023-01-23, per its README)
  • Supports Python 3 (Python 2 support was dropped in Bleach 4.0)
  • Extensive documentation and good community support

Cons

  • Can be slower than some alternatives for large inputs
  • May require careful configuration to balance security and functionality
  • Limited to HTML sanitization, not a general-purpose security solution
  • Learning curve for advanced customization

Code Examples

  1. Basic HTML sanitization:
import bleach

html = "<p>This is <script>alert('dangerous');</script> content.</p>"
clean_html = bleach.clean(html)
print(clean_html)
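# Output: &lt;p&gt;This is &lt;script&gt;alert('dangerous');&lt;/script&gt; content.&lt;/p&gt;
# By default, disallowed tags are escaped rather than removed, and <p> is not in the default allow-list.
  2. Customizing allowed tags and attributes: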
import bleach

html = "<p style='color:red;'>Colored <strong>text</strong> and <a href='#'>link</a>.</p>"
clean_html = bleach.clean(html, tags=['p', 'strong'], attributes={'p': ['style']})
print(clean_html)
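# <p> and <strong> are kept; <a> is not in `tags`, so it is escaped rather than removed (pass strip=True to drop the tag instead).
# Note: recent Bleach releases only keep a style value when a css_sanitizer is passed; check the documentation for your version.
  3. Linkifying text:
import bleach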

text = "Check out https://example.com for more info."
linkified = bleach.linkify(text)
print(linkified)
# Output: Check out <a href="https://example.com" rel="nofollow">https://example.com</a> for more info.
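
The `rel="nofollow"` attribute in the output above comes from linkify's default callback. As a hedged sketch (assuming Bleach is installed and that the `callbacks` parameter and the `bleach.callbacks` helpers behave as described in the Bleach documentation), the callback list can be adjusted like this:

```python
import bleach
from bleach.callbacks import nofollow, target_blank

text = "an http://example.com url"

# Default: linkify applies the nofollow callback.
default_links = bleach.linkify(text)

# Add target="_blank" on top of rel="nofollow".
new_tab_links = bleach.linkify(text, callbacks=[nofollow, target_blank])

# An empty callback list creates plain links with no extra attributes.
plain_links = bleach.linkify(text, callbacks=[])

print(default_links)
print(new_tab_links)
print(plain_links)
```

Callbacks run in order, so each one receives the attributes produced by the previous one.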

Getting Started

To use Bleach in your Python project, follow these steps:

  1. Install Bleach using pip:

    pip install bleach
    
  2. Import and use Bleach in your Python code:

    import bleach
    
    html = "<p>Some <script>unsafe</script> content.</p>"
    clean_html = bleach.clean(html)  # disallowed tags (here <p> and <script>) are escaped by default
    print(clean_html)
    
  3. Customize allowed tags and attributes as needed:

    clean_html = bleach.clean(html, tags=['p', 'strong'], attributes={})
    

For more advanced usage and configuration options, refer to the official Bleach documentation.
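
When the same rules are applied to many inputs, a reusable cleaner avoids repeating the configuration. The following is a minimal sketch, assuming Bleach is installed; `sanitize_comment` and `comment_tags` are hypothetical names chosen for illustration, while `Cleaner` and `ALLOWED_TAGS` come from `bleach.sanitizer`:

```python
from bleach.sanitizer import ALLOWED_TAGS, Cleaner

# Extend the default allow-list with <p>. Wrapping in set() works whether
# ALLOWED_TAGS is a list (older releases) or a frozenset (newer releases).
comment_tags = set(ALLOWED_TAGS) | {"p"}

# strip=True drops disallowed tags instead of escaping them.
cleaner = Cleaner(tags=comment_tags, strip=True)


def sanitize_comment(text: str) -> str:
    """Sanitize one piece of untrusted HTML with the shared cleaner."""
    return cleaner.clean(text)


result = sanitize_comment("<p>Hello <b>world</b><img src=x onerror=alert(1)></p>")
print(result)
```

Building the `Cleaner` once and calling `clean()` repeatedly keeps the allow-list in a single place.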

Competitor Comparisons

13,621

DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:

Pros of DOMPurify

  • Designed specifically for client-side sanitization in browsers
  • Faster performance due to DOM-based approach
  • More extensive configuration options for fine-tuned control

Cons of DOMPurify

  • Larger file size, which may impact page load times
  • Server-side use requires a DOM implementation such as jsdom, whereas Bleach runs natively in Python

Code Comparison

Bleach:

import bleach

dirty_html = '<script>alert("XSS")</script><p>Safe content</p>'
clean_html = bleach.clean(dirty_html)

DOMPurify:

import DOMPurify from 'dompurify';

const dirty_html = '<script>alert("XSS")</script><p>Safe content</p>';
const clean_html = DOMPurify.sanitize(dirty_html);

Key Differences

  • Bleach is Python-based and primarily for server-side use, while DOMPurify is JavaScript-based for client-side sanitization
  • Bleach parses input with html5lib (per its README), whereas DOMPurify leverages the browser's DOM parser
  • DOMPurify offers more granular control over allowed tags and attributes
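
As a rough illustration of parser-driven allow-list sanitizing (as opposed to regex or string replacement), the sketch below uses only Python's standard-library `html.parser`. It is not Bleach's or DOMPurify's implementation; the names `AllowListSanitizer` and `sanitize` are hypothetical, all attributes are dropped for simplicity, and it should not be used as a real XSS defence.

```python
from html import escape
from html.parser import HTMLParser


class AllowListSanitizer(HTMLParser):
    """Keep allowed tags (without attributes); escape everything else."""

    def __init__(self, allowed_tags):
        super().__init__(convert_charrefs=True)
        self.allowed_tags = set(allowed_tags)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.allowed_tags:
            self.parts.append(f"<{tag}>")  # attributes are dropped
        else:
            # Emit the original tag text as harmless, visible text.
            self.parts.append(escape(self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> get the same treatment.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        closing = f"</{tag}>"
        self.parts.append(closing if tag in self.allowed_tags else escape(closing))

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def sanitize(html, allowed_tags=("p", "strong")):
    parser = AllowListSanitizer(allowed_tags)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


print(sanitize('<script>alert("XSS")</script><p>Safe content</p>'))
```

Because decisions are made per parsed token, malformed or oddly cased markup is classified by the parser rather than by pattern matching on the raw string.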

Use Cases

  • Choose Bleach for server-side sanitization in Python applications
  • Opt for DOMPurify when working with client-side JavaScript and need browser-specific features

Both libraries effectively sanitize HTML and prevent XSS attacks, but their ideal use cases differ based on the development environment and specific project requirements.

5,195

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a Whitelist

Pros of js-xss

  • Specifically designed for JavaScript, making it more suitable for client-side XSS prevention
  • Offers flexible configuration options for customizing allowed tags and attributes
  • Provides a whitelist-based approach, which can be more secure in certain scenarios

Cons of js-xss

  • Less comprehensive documentation compared to Bleach
  • May require more manual configuration to achieve desired results
  • Not as widely adopted or battle-tested as Bleach

Code Comparison

Bleach (Python):

import bleach

html = '<script>alert("XSS")</script><p>Safe content</p>'
cleaned = bleach.clean(html, tags=['p'])
print(cleaned)  # Output: &lt;script&gt;alert("XSS")&lt;/script&gt;<p>Safe content</p>

js-xss (JavaScript):

const xss = require('xss');

const html = '<script>alert("XSS")</script><p>Safe content</p>';
const cleaned = xss(html);
console.log(cleaned);  // Output: &lt;script&gt;alert("XSS")&lt;/script&gt;<p>Safe content</p>

Both libraries sanitize HTML input by escaping or removing potentially dangerous tags and attributes; in the examples above, each escapes the script tag rather than deleting it. Bleach is more suitable for server-side Python applications, while js-xss is tailored for JavaScript environments, particularly in the browser. The choice between them depends on the specific programming language and environment of your project.

Ruby HTML and CSS sanitizer.

Pros of Sanitize

  • More flexible configuration options for allowed elements and attributes
  • Supports custom filters for fine-grained control over sanitization
  • Actively maintained with regular updates and improvements

Cons of Sanitize

  • Slower performance compared to Bleach, especially for large inputs
  • Less extensive documentation and community support
  • Steeper learning curve for advanced usage

Code Comparison

Bleach:

import bleach

html = '<script>alert("XSS")</script><p>Safe content</p>'
clean_html = bleach.clean(html, tags=['p'], strip=True)

Sanitize:

require 'sanitize'

html = '<script>alert("XSS")</script><p>Safe content</p>'
clean_html = Sanitize.fragment(html, Sanitize::Config::BASIC)

Both calls remove the script tags themselves, though with strip=True Bleach keeps the text that was inside the tag (here, alert("XSS")) as plain text. Sanitize ships predefined configurations (such as BASIC) that provide ready-made sanitization levels, while Bleach usually requires specifying allowed tags and attributes manually.
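
The difference between Bleach's default escaping and `strip=True` is easy to miss, so here is a short sketch (assuming Bleach is installed) that runs both modes on the input used in the Bleach README:

```python
import bleach

dirty = "an <script>evil()</script> example"

# Default: disallowed markup is escaped, so it shows up as visible text.
escaped = bleach.clean(dirty)

# strip=True: the disallowed tags are dropped, but their inner text stays.
stripped = bleach.clean(dirty, strip=True)

print(escaped)   # an &lt;script&gt;evil()&lt;/script&gt; example
print(stripped)  # an evil() example
```

Neither mode leaves an executable script element, but the stripped output still contains the script's source as text, which may or may not be what an application wants to display.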

1,133

Caja is a tool for safely embedding third party HTML, CSS and JavaScript in your website.

Pros of Caja

  • More comprehensive security model, including object-capability security
  • Supports JavaScript sandboxing in addition to HTML sanitization
  • Offers a wider range of features for content rewriting and transformation

Cons of Caja

  • More complex to set up and use compared to Bleach
  • Less actively maintained (last update in 2017)
  • Steeper learning curve due to its extensive feature set

Code Comparison

Bleach (HTML sanitization):

import bleach

html = '<script>alert("XSS")</script><p>Safe content</p>'
clean_html = bleach.clean(html)
print(clean_html)  # Output: &lt;script&gt;alert("XSS")&lt;/script&gt;&lt;p&gt;Safe content&lt;/p&gt; (<p> is not in the default allow-list)

Caja (JavaScript sanitization):

var cajaServer = require('google-caja');
var html = '<script>alert("XSS")</script><p>Safe content</p>';
cajaServer.sanitize(html, function(sanitizedHtml) {
  console.log(sanitizedHtml);
});

Summary

Bleach is a simpler, more focused library for HTML sanitization, while Caja offers a broader set of security features including JavaScript sandboxing. Bleach is easier to use, making it a better fit for straightforward HTML cleaning tasks, though note that Bleach itself has been deprecated since 2023-01-23 (see the README below). Caja, despite being more complex, provides a more comprehensive security model for applications requiring advanced content rewriting and transformation capabilities.


README

======
Bleach
======

.. image:: https://github.com/mozilla/bleach/workflows/Test/badge.svg
   :target: https://github.com/mozilla/bleach/actions?query=workflow%3ATest

.. image:: https://github.com/mozilla/bleach/workflows/Lint/badge.svg
   :target: https://github.com/mozilla/bleach/actions?query=workflow%3ALint

.. image:: https://badge.fury.io/py/bleach.svg
   :target: http://badge.fury.io/py/bleach

**NOTE: 2023-01-23: Bleach is deprecated. See issue:** `<https://github.com/mozilla/bleach/issues/698>`__

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes.

Bleach can also linkify text safely, applying filters that Django's urlize filter cannot, and optionally setting rel attributes, even on links already in the text.

Bleach is intended for sanitizing text from untrusted sources. If you find yourself jumping through hoops to allow your site administrators to do lots of things, you're probably outside the use cases. Either trust those users, or don't.

Because it relies on html5lib_, Bleach is as good as modern browsers at dealing with weird, quirky HTML fragments. And any of Bleach's methods will fix unbalanced or mis-nested tags.

The version on GitHub_ is the most up-to-date and contains the latest bug fixes. You can find full documentation on ReadTheDocs_.

:Code:           https://github.com/mozilla/bleach
:Documentation:  https://bleach.readthedocs.io/
:Issue tracker:  https://github.com/mozilla/bleach/issues
:License:        Apache License v2; see LICENSE file

Reporting Bugs
==============

For regular bugs, please report them `in our issue tracker <https://github.com/mozilla/bleach/issues>`_.

If you believe that you've found a security vulnerability, please `file a secure bug report in our bug tracker <https://bugzilla.mozilla.org/enter_bug.cgi?assigned_to=nobody%40mozilla.org&product=Webtools&component=Bleach-security&groups=webtools-security>`_ or send an email to security AT mozilla DOT org.

For more information on security-related bug disclosure and the PGP key to use for sending encrypted mail or to verify responses received from that address, please read our wiki page at `<https://www.mozilla.org/en-US/security/#For_Developers>`_.

Security
========

Bleach is a security-focused library.

We have a responsible security vulnerability reporting process. Please use that if you're reporting a security issue.

Security issues are fixed in private. After we land such a fix, we'll do a release.

For every release, we mark security issues we've fixed in the CHANGES in the Security issues section. We include any relevant CVE links.

Installing Bleach
=================

Bleach is available on PyPI_, so you can install it with ``pip``::

    $ pip install bleach

Upgrading Bleach
================

.. warning::

   Before doing any upgrades, read through `Bleach Changes <https://bleach.readthedocs.io/en/latest/changes.html>`_ for backwards incompatible changes, newer versions, etc.

Bleach follows `semver 2`_ versioning. Vendored libraries will not be changed in patch releases.

Basic use
=========

The simplest way to use Bleach is:

.. code-block:: python

    >>> import bleach

    >>> bleach.clean('an <script>evil()</script> example')
    'an &lt;script&gt;evil()&lt;/script&gt; example'

    >>> bleach.linkify('an http://example.com url')
    'an <a href="http://example.com" rel="nofollow">http://example.com</a> url'

Code of Conduct
===============

This project and repository is governed by Mozilla's code of conduct and etiquette guidelines. For more details please see the `CODE_OF_CONDUCT.md </CODE_OF_CONDUCT.md>`_

.. _html5lib: https://github.com/html5lib/html5lib-python
.. _GitHub: https://github.com/mozilla/bleach
.. _ReadTheDocs: https://bleach.readthedocs.io/
.. _PyPI: https://pypi.org/project/bleach/
.. _semver 2: https://semver.org/