Glench / fuzzyset.js

fuzzyset.js - A fuzzy string set for javascript

Top Related Projects

  • fuzzywuzzy: Fuzzy String Matching in Python
  • python-Levenshtein: The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
  • thefuzz: Fuzzy String Matching in Python

Quick Overview

FuzzySet.js is a fuzzy string matching library for JavaScript. It provides a way to perform approximate string matching, allowing you to find the closest match to a given string within a set of strings. This is particularly useful for tasks like autocomplete, spell checking, or finding similar items in a dataset.

Pros

  • Easy to use with a simple API
  • Supports both browser and Node.js environments
  • Customizable matching threshold and gram size
  • Lightweight with no external dependencies

Cons

  • Matches on character-level similarity only (n-grams and Levenshtein distance), not semantic similarity
  • Performance may degrade with large datasets
  • Not actively maintained (last update was in 2019)
  • Limited documentation and examples

Code Examples

  1. Creating a FuzzySet and adding items:

    const FuzzySet = require('fuzzyset');
    const set = FuzzySet(['apple', 'banana', 'orange']);

  2. Finding the closest match:

    const result = set.get('aple');
    console.log(result); // [[0.8, 'apple']]

  3. Adding items dynamically and adjusting the threshold:

    set.add('grape');
    set.add('pineapple');
    // The third argument is minScore; the second is the default returned when nothing matches.
    const closeMatches = set.get('grap', null, 0.7);
    console.log(closeMatches); // [[0.8, 'grape']]

Getting Started

To use FuzzySet.js in your project, follow these steps:

  1. Install the package:

    npm install fuzzyset
    
  2. Import and use in your JavaScript code:

    const FuzzySet = require('fuzzyset');
    
    const set = FuzzySet(['hello', 'world', 'fuzzy', 'matching']);
    
    const result = set.get('helo');
    console.log(result); // [[0.8, 'hello']]
    
  3. For browser usage, include the script in your HTML:

    <script src="https://cdnjs.cloudflare.com/ajax/libs/fuzzyset.js/0.0.91/fuzzyset.min.js"></script>
    

Competitor Comparisons

Fuzzy String Matching in Python

Pros of fuzzywuzzy

  • More comprehensive set of string matching algorithms, including Levenshtein distance and token-based matching
  • Better support for handling non-ASCII characters and Unicode strings
  • Includes built-in functions for extracting best matches from a list of choices

Cons of fuzzywuzzy

  • Generally slower performance compared to fuzzyset.js, especially for large datasets
  • Falls back to the slower pure-Python difflib from the standard library unless the optional python-Levenshtein package is installed
  • Less suitable for browser-based applications due to its Python implementation

Code Comparison

fuzzywuzzy:

from fuzzywuzzy import fuzz
ratio = fuzz.ratio("this is a test", "this is a test!")

fuzzyset.js:

const FuzzySet = require('fuzzyset');
const a = FuzzySet(['this is a test']);
const result = a.get('this is a test!');

Both libraries provide simple interfaces for fuzzy string matching, but fuzzywuzzy offers more built-in algorithms and options for customization. fuzzyset.js is more lightweight and better suited for JavaScript environments, while fuzzywuzzy provides a broader range of functionalities at the cost of performance and language limitations.

The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

Pros of python-Levenshtein

  • Implemented in C, offering superior performance for large-scale operations
  • Provides a wider range of string similarity algorithms beyond Levenshtein distance
  • Supports Unicode strings natively

Cons of python-Levenshtein

  • Limited to Python environment, not suitable for JavaScript projects
  • Requires compilation, which may be challenging on some systems
  • Less intuitive API for simple fuzzy matching tasks

Code Comparison

python-Levenshtein:

from Levenshtein import distance
result = distance("kitten", "sitting")

fuzzyset.js:

const FuzzySet = require('fuzzyset');
const a = FuzzySet(['kitten']);
const result = a.get('sitting');

python-Levenshtein provides a more direct approach to calculating string distances, while fuzzyset.js offers a higher-level API for fuzzy matching. The python-Levenshtein example calculates the Levenshtein distance between two strings, whereas fuzzyset.js creates a set of strings and performs a fuzzy search against it.

Both libraries serve different purposes and environments. python-Levenshtein is better suited for high-performance, low-level string operations in Python, while fuzzyset.js provides an easy-to-use fuzzy matching solution for JavaScript applications.

Fuzzy String Matching in Python

Pros of thefuzz

  • More comprehensive set of string matching functions, including simple, partial, token-sort, and token-set ratios built on Levenshtein distance
  • Better performance for large datasets due to optimized C implementations
  • Active development and maintenance with regular updates

Cons of thefuzz

  • Larger library size, which may impact load times in browser environments
  • Slightly more complex API, requiring more setup for basic use cases
  • Python-based, which may not be ideal for JavaScript-centric projects

Code Comparison

thefuzz:

from thefuzz import fuzz
ratio = fuzz.ratio("this is a test", "this is a test!")

fuzzyset.js:

const FuzzySet = require('fuzzyset');
const a = FuzzySet(['this is a test']);
const result = a.get('this is a test!');

Both libraries provide fuzzy string matching capabilities, but thefuzz offers a wider range of algorithms and is better suited for large-scale applications. fuzzyset.js, on the other hand, is more lightweight and easier to integrate into JavaScript projects. The choice between the two depends on the specific requirements of your project, such as the programming language, performance needs, and the complexity of string matching tasks.

README

Fuzzyset - A fuzzy string set for javascript

Fuzzyset is a data structure that performs something akin to full-text search against data to find likely misspellings and approximate string matches.

Usage

The usage is simple. Just add a string to the set, and ask for it later by using .get:

   const a = FuzzySet();
   a.add("michael axiak");
   a.get("micael asiak");
   // will be [[0.8461538461538461, 'michael axiak']]

The result will be an array of [score, matched_value] arrays. The score is between 0 and 1, with 1 being a perfect match.
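To give a feel for where a score such as 0.846 can come from, here is a minimal sketch. It assumes the score is one minus the Levenshtein distance divided by the length of the longer string, which is consistent with the example above (two edits over 13 characters) when useLevenshtein is on. The `levenshtein` and `similarity` functions are hypothetical helpers written for illustration, not the library's own code.

```javascript
// Classic dynamic-programming Levenshtein distance (two-row variant).
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: similarity in [0, 1], where 1 is an exact match.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1; // two empty strings are identical
  return 1 - levenshtein(a, b) / longest;
}

// Two edits (insert "h", swap "s" for "x") over 13 characters.
console.log(similarity('micael asiak', 'michael axiak'));
```

Dividing by the longer length keeps the score between 0 and 1 regardless of how different the two string lengths are.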

Install

npm install fuzzyset

(Used to be fuzzyset.js.)

Then:

import FuzzySet from 'fuzzyset'

// or, depending on your JavaScript environment...

const FuzzySet = require('fuzzyset')

Or for use directly on the web:

<script type="text/javascript" src="dist/fuzzyset.js"></script>

This library should work just fine with TypeScript, too.

Construction Arguments

  • array: An array of strings to initialize the data structure with
  • useLevenshtein: Whether or not to use the Levenshtein distance to determine the match scoring. Default: true
  • gramSizeLower: The lower bound of gram sizes to use, inclusive (see interactive documentation). Default: 2
  • gramSizeUpper: The upper bound of gram sizes to use, inclusive (see interactive documentation). Default: 3
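
The gram sizes above refer to character n-grams. As a rough sketch of the idea, the hypothetical `grams` helper below lists the grams of a string for a given size. Lowercasing and padding with "-" at both ends are assumptions made for illustration; the interactive documentation describes what the library actually does.

```javascript
// Hypothetical helper: list the character n-grams of a string.
function grams(value, gramSize) {
  // Assumption: lowercase and pad with '-' so word boundaries form grams too.
  let padded = '-' + value.toLowerCase() + '-';
  // Pad further when the string is shorter than a single gram.
  while (padded.length < gramSize) padded += '-';
  const out = [];
  for (let i = 0; i + gramSize <= padded.length; i++) {
    out.push(padded.slice(i, i + gramSize));
  }
  return out;
}

console.log(grams('apple', 2)); // [ '-a', 'ap', 'pp', 'pl', 'le', 'e-' ]
console.log(grams('apple', 3)); // [ '-ap', 'app', 'ppl', 'ple', 'le-' ]
```

Smaller grams are more forgiving of typos in short strings, while larger grams are more selective, which is one reason to index a range of sizes rather than a single one.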

Methods

  • get(value, [default], [minScore=.33]): try to match a string to entries with a score of at least minScore (defaulted to .33), otherwise return null or default if it is given.
  • add(value): add a value to the set, returning false if it is already in the set.
  • length(): return the number of items in the set.
  • isEmpty(): return true if the set is empty.
  • values(): return an array of the values in the set.
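
Because get returns null (or the supplied default) when nothing reaches minScore, calling code usually needs a guard before indexing into the result. The sketch below shows one way to unwrap a result shaped like get's return value. `bestMatch` is a hypothetical helper, not part of the library, and the sample arrays are hand-written stand-ins for real get results.

```javascript
// Hypothetical helper: unwrap the best value from a get()-style result.
// `results` is either null or an array of [score, value] pairs.
function bestMatch(results, fallback = null) {
  if (!results || results.length === 0) return fallback;
  // Pick the highest score explicitly rather than assuming an ordering.
  const [score, value] = results.reduce((best, pair) => (pair[0] > best[0] ? pair : best));
  return { score, value };
}

// Stand-in for a successful a.get("micael asiak") call.
console.log(bestMatch([[0.8461538461538461, 'michael axiak']]));
// Stand-in for a call where nothing reached minScore.
console.log(bestMatch(null, { score: 0, value: '(no match)' }));
```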

Interactive Documentation

To play with the library or see how it works internally, check out the amazing interactive documentation.

Develop

To contribute to the library, edit the lib/fuzzyset.js file then run npm run build to generate all the different file formats in the dist/ directory. Or run npm run dev while developing to auto-build as you change files.

License

This package is licensed under the Prosperity Public License 3.0.

That means that this package is free to use for non-commercial projects — personal projects, public benefit projects, research, education, etc. (see the license for full details). If your project is commercial (even for internal use at your company), you have 30 days to try this package for free before you have to pay a one-time licensing fee of $42.

You can purchase a commercial license instantly here.

Why this license scheme? Since I quit tech to become a therapist, my income is much lower (due to the unjust costs of mental health care in the US, but don't get me started). I'm asking for paid licenses for Fuzzyset.js to support all the free work I've done on this project over the past 10 years (!) and so I can live a sustainable life in service of my therapy clients. If you're a small operation that would like to use Fuzzyset.js but can't swing the license cost, please reach out to me and we can work something out.
