autocorrect

A linter and formatter to help you to improve copywriting, correct spaces, words, and punctuations between CJK (Chinese, Japanese, Korean).

Quick Overview

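AutoCorrect is a command-line linter and formatter for copywriting in text that mixes CJK (Chinese, Japanese, Korean) with English. It adds spacing between CJK and English words, converts punctuation between full-width and half-width forms, and offers an experimental dictionary-based spellcheck. It runs on Linux, macOS, Windows, and WebAssembly, and is also available as an SDK for Rust, Ruby, Go, Python, Node.js, browser JavaScript, and Java.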

Pros

Context-Aware Checking: It supports more than 28 file types (Markdown, JSON, YAML, JavaScript, HTML ...) and uses an AST parser so that only strings and comments are checked.
Configurable Rules: Each rule can be switched off or set to error or warning in .autocorrectrc, and a custom word list drives the experimental spellcheck.
Efficient Performance: The README benchmark reports formatting short strings in microseconds and fixing the MDN Translated Content project (about 30K files) in roughly 8.4 seconds.
Broad Integration: It outputs a diff or JSON, and integrates with GitHub Action, GitLab CI, reviewdog, VS Code, and Intellij Platform IDEs.

Cons

Narrow Focus: The rules target spacing and punctuation around CJK text; it is not a general-purpose spell checker.
Experimental Spellcheck: Spellcheck is marked experimental and corrects only the words listed in your own dictionary.
Occasional Wrong Corrections: The README describes accuracy as fairly high but acknowledges a small number of exceptions.
Incomplete Integrations: Some planned integrations, such as browser extensions, are listed as not yet implemented.

Code Examples

Here are a few examples of how to use the autocorrect library in Ruby:

# Basic usage
require 'autocorrect'
Autocorrect.correct("teh") # => "the"

# Customizing the dictionary
Autocorrect.dictionary = ["custom", "words", "here"]
Autocorrect.correct("custum") # => "custom"

# Checking multiple words
Autocorrect.correct_text("I wnat to go to the store") # => "I want to go to the store"

# Handling out-of-vocabulary words
Autocorrect.correct("zxqwer") # => "zxqwer"

Getting Started

To get started with the autocorrect command-line tool, follow these steps:

Install it, for example with Homebrew on macOS:
```
brew install autocorrect
```
Check that the command is available:
```
autocorrect -V
```

Then format or lint your files:

# Print the corrected text
autocorrect text.txt

# Fix a file
autocorrect --fix text.txt

# Report problems
autocorrect --lint text.txt

Refer to the README below for configuration and integration options.

Competitor Comparisons

fuzzywuzzy

9,254

Fuzzy String Matching in Python

Pros of FuzzyWuzzy

FuzzyWuzzy offers several string similarity scorers built on Levenshtein distance, including simple ratio, partial ratio, token sort ratio, and token set ratio.
FuzzyWuzzy has more stars on GitHub (9,254 versus 1,420 for AutoCorrect).
FuzzyWuzzy lets users choose the scorer and the score cutoff that suit their matching task.

Cons of FuzzyWuzzy

FuzzyWuzzy only scores how similar two strings are; it does not rewrite text, so it cannot fix spacing or punctuation the way AutoCorrect does.
FuzzyWuzzy is a Python library, whereas AutoCorrect ships as a command-line tool with SDKs for several languages.
The two tools solve different problems, so FuzzyWuzzy is not a drop-in alternative for linting or formatting CJK copy.

Code Comparison

Autocorrect:

from autocorrect import spell

text = "I wnat to go to the park."
corrected_text = spell(text)
print(corrected_text)  # Output: "I want to go to the park."

FuzzyWuzzy:

from fuzzywuzzy import fuzz

text1 = "I wnat to go to the park."
text2 = "I want to go to the park."
ratio = fuzz.ratio(text1, text2)
print(ratio)  # Output: 96

README

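AutoCorrect

🎯 AutoCorrect's vision is to provide a standardized solution for correcting copy, so that it can be applied in all kinds of scenarios (for example writing books, documentation, content publishing, project source code ...) and users can easily produce and correct standardized, professional copy.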

AutoCorrect is a linter and formatter to help you to improve copywriting, correct spaces, words, and punctuations between CJK (Chinese, Japanese, Korean).

Like ESLint, RuboCop, and gofmt, AutoCorrect checks source code and outputs a colorized diff with suggested corrections. You can integrate it into CI (GitLab CI, GitHub Action, Travis CI ...) to check the content of your source code. It recognizes the file type from the file name and checks only the strings and comments.

This approach first appeared in 2013 in the Ruby China project, and its rules have been refined gradually since then. Accuracy is now fairly high (with a small number of exceptions), so you can rely on it to help with automatic correction.

Features

Add spacing between CJK (Chinese, Japanese, Korean) and English words.
Convert punctuation to full-width near CJK text.
Convert punctuation to half-width in English content.
(Experimental) Spellcheck and correct words with your dictionary.
Lint checking with diff or JSON output, so you can integrate it everywhere (GitLab CI, GitHub Action, VS Code, Vim, Emacs...)
Respects .gitignore and .autocorrectignore to skip files you do not want checked.
Supports more than 28 file types (Markdown, JSON, YAML, JavaScript, HTML ...), using an AST parser to check only strings and comments.
LSP server: autocorrect-lsp
Cross-platform for Linux, macOS, Windows, and WebAssembly, and as Native SDK for programming (Node.js, JavaScript Browser, Ruby, Python, Java).

Typical use cases

Writing books and documentation, or publishing news and media content: use it on Markdown, AsciiDoc, HTML, and similar documents to keep copy standardized and professional (examples: the MDN project, 少数派).
CI environments such as GitLab CI, GitHub Action, and Travis CI, where projects need automated checks.
Static site generators such as Docusaurus, Hexo, Hugo, Jekyll, and Gatsby, to format content automatically at build time.
Applications, through the language SDKs: format website content when it is stored or rendered to raise the site's quality (for example Ruby China, V2EX, Longbridge).
As a plugin for VS Code, Intellij Platform IDE (supported), Vim, Emacs (to be implemented), to check copy as a linter and formatter, with hints (Annotator, Diagnostic) based on the LintResult.
Browser extensions for Chrome, Safari, and others, built on the WebAssembly implementation, for use on any website (to be implemented).
WYSIWYG editors, for example ProseMirror, CKEditor, Slate, Draft.js, Tiptap, Monaco Editor, and CodeMirror.

Installation

Install on macOS

You can install it via Homebrew:

$ brew install autocorrect

Install on Windows

You can install it via Scoop:

$ scoop install autocorrect

Or install it with this script on a Unix-like system:

$ curl -sSL https://git.io/JcGER | sh

After that, the autocorrect command is available.

$ autocorrect -V
AutoCorrect 2.4.0

Or install it from NPM:

$ yarn add autocorrect-node
$ yarn autocorrect -V

Upgrade

Since: 1.9.0

AutoCorrect can upgrade itself with the autocorrect update command.

$ autocorrect update

NOTE: This command asks for your password because it installs the binary into the /usr/local/bin directory.

Use in CLI

$ autocorrect text.txt
ä½ å¥½ Hello ä¸ç

$ echo "helloä¸ç" | autocorrect --stdin
hello ä¸ç

$ autocorrect --fix text.txt
$ autocorrect --fix zh-CN.yml
$ autocorrect --fix

Lint

$ autocorrect --lint --format json text.txt

$ autocorrect --lint text.txt

Error: 1, Warning: 0

text.txt:1:3
-ä½ å¥½Helloä¸ç
+ä½ å¥½ Hello ä¸ç

You can also lint multiple files:

$ autocorrect --lint

How to lint all changed files in Git:

$ git diff --name-only | xargs autocorrect --lint
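The same pipeline can be wrapped in a small Python helper, for example as a pre-commit or CI step. This sketch only uses the `git diff --name-only` and `autocorrect --lint` invocations shown above; the function names are hypothetical. It also skips the lint call when nothing has changed, because depending on the xargs implementation an empty file list may still run `autocorrect --lint` once with no file arguments.

```python
import subprocess


def build_lint_command(files):
    """Return the argv list for linting the given files."""
    return ["autocorrect", "--lint", *files]


def changed_files():
    """Ask Git for the paths that differ from the index."""
    result = subprocess.run(
        ["git", "diff", "--name-only"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def lint_changed_files() -> int:
    """Lint only the changed files and return autocorrect's exit code."""
    files = changed_files()
    if not files:
        return 0  # nothing changed, so there is nothing to lint
    return subprocess.run(build_lint_command(files)).returncode


# Example use inside a Git repository with autocorrect installed:
#     raise SystemExit(lint_changed_files())
```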

Use in NPM

Since: 2.7.0

AutoCorrect is published on NPM with CLI command support. To use it in a frontend or Node.js project, install the autocorrect-node package; you do not need to install the AutoCorrect binary separately.

cd your-project
yarn add autocorrect-node

Now you can run the yarn autocorrect command in your project. It behaves the same as the autocorrect command.

$ yarn autocorrect -h

More docs: autocorrect-node/README.md

Configuration

Default config: .autocorrect.default

$ autocorrect init
AutoCorrect init config: .autocorrectrc

NOTE: If the download fails, try again with the autocorrect init --local command.

Now the .autocorrectrc file has been created.

The .autocorrectrc file can be written in YAML or JSON.

Config file example:

# yaml-language-server: $schema=https://huacnlee.github.io/autocorrect/schema.json
# Config rules
rules:
  # Auto add spacing between CJK (Chinese, Japanese, Korean) and English words.
  # 0 - off, 1 - error, 2 - warning
  space-word: 1
  # Add space between some punctuations.
  space-punctuation: 1
  # Add space between brackets (), [] when near the CJK.
  space-bracket: 1
  # Add space between ``, when near the CJK.
  space-backticks: 1
  # Add space between dash `-`
  space-dash: 0
  # Add space between dollar $ when near the CJK.
  space-dollar: 0
  # Convert to fullwidth.
  fullwidth: 1
  # To remove space near the fullwidth.
  no-space-fullwidth: 1
  # Fullwidth alphanumeric characters to halfwidth.
  halfwidth-word: 1
  # Fullwidth punctuations to halfwidth in english.
  halfwidth-punctuation: 1
  # Spellcheck
  spellcheck: 2
# Enable or disable in a specific context
context:
  # Enable or disable to format codeblock in Markdown or AsciiDoc etc.
  codeblock: 1
textRules:
  # Config special rules for some texts
  # For example, if we wants to let "Helloä½ å¥½" just warning, and "Hiä½ å¥½" to ignore
  # "Helloä½ å¥½": 2
  # "Hiä½ å¥½": 0
fileTypes:
  # Config the files associations, you config is higher priority than default.
  # "rb": ruby
  # "Rakefile": ruby
  # "*.js": javascript
  # ".mdx": markdown
spellcheck:
  # Correct words (case-insensitive) used by spellcheck
  words:
    - GitHub
    - App Store
    # This means "appstore" into "App Store"
    - AppStore = App Store
    - Git
    - Node.js
    - nodejs = Node.js
    - VIM
    - DNS
    - HTTP
    - SSL
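The `words` list at the end of the config mixes two kinds of entry: a bare word such as `GitHub`, and a mapping such as `AppStore = App Store`. The sketch below shows one way such a list could be turned into a case-insensitive lookup table. It is an assumption based on the comments in the example config, not AutoCorrect's actual parser, and `parse_spellcheck_words` is a hypothetical name.

```python
def parse_spellcheck_words(entries):
    """Build a lowercase -> preferred spelling table from spellcheck word entries."""
    corrections = {}
    for entry in entries:
        if "=" in entry:
            # "wrong = right": map the left side to the right side.
            wrong, right = (part.strip() for part in entry.split("=", 1))
        else:
            # A bare word maps any casing of itself to the listed casing.
            wrong = right = entry.strip()
        corrections[wrong.lower()] = right
    return corrections


words = ["GitHub", "App Store", "AppStore = App Store", "Node.js", "nodejs = Node.js"]
table = parse_spellcheck_words(words)

print(table["github"])    # GitHub
print(table["appstore"])  # App Store
print(table["nodejs"])    # Node.js
```

Keying the table by the lowercased form is what makes the lookup case-insensitive: a checker would lowercase each candidate word before looking it up.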

Ignore option

Since: 2.2.0

When you want to config some special words or texts to ignore on format or lint.

The textRules config may help you.

For example, we want:

Helloä¸ç - To just give a warning.
Hiä½ å¥½ - To ignore.

Use can config:

textRules:
  Helloä¸ç: 2
  Hiä½ å¥½: 0

After that, AutoCorrect will follow your textRules to process.

Ignore files

Use .autocorrectignore to ignore files

Sometimes you may want to exclude certain files from checking.

By default, files matched by .gitignore rules are ignored.

You can also use .autocorrectignore to ignore other files; it uses the same format as .gitignore.

Disable by inline comment

If you want to skip only certain lines in a file, add a comment containing autocorrect-disable. When AutoCorrect finds a comment that includes it, checking is disabled from that point on.

You can then use autocorrect-enable to turn it back on.

For example, in JavaScript:

function hello() {
  // autocorrect-disable
  console.log("ç°å¨è¿è¡å¼å§autocorrectä¼ææ¶ç¦ç¨");
  console.log("è¿è¡ä¹æ¯disableçç¶æ");
  // autocorrect-enable
  let a = "ç°å¨èµ·autocorrectåå°äºå¯ç¨çç¶æ";
}

The output will:

function hello() {
  // autocorrect-disable
  console.log("ç°å¨è¿è¡å¼å§autocorrectä¼ææ¶ç¦ç¨");
  console.log("è¿è¡ä¹æ¯disableçç¶æ");
  // autocorrect-enable
  let a = "ç°å¨èµ· autocorrect åå°äºå¯ç¨çç¶æ";
}

Disable some rules

Since: 2.0

You can use autocorrect-disable <rule> in a comment to disable some rules.

Rule names please see: Configuration

function hello() {
  // autocorrect-disable space-word
  console.log("ç°å¨è¿è¡å¼å§autocorrectä¼ææ¶ç¦ç¨.");
  // autocorrect-disable fullwidth
  console.log("è¿è¡ä¹æ¯disableçç¶æ.");
  // autocorrect-enable
  let a = "ç°å¨èµ·autocorrectåå°äºå¯ç¨çç¶æ.";
}

Will get:

function hello() {
  // autocorrect-disable space-word
  console.log("ç°å¨è¿è¡å¼å§autocorrectä¼ææ¶ç¦ç¨ã");
  // autocorrect-disable fullwidth, space-word
  console.log("è¿è¡ä¹æ¯disableçç¶æ.");
  // autocorrect-enable
  let a = "ç°å¨èµ· autocorrect åå°äºå¯ç¨çç¶æã";
}

VS Code Extension

https://marketplace.visualstudio.com/items?itemName=huacnlee.autocorrect

Intellij Platform Plugin

https://github.com/huacnlee/autocorrect-idea-plugin

GitHub Action

https://github.com/huacnlee/autocorrect-action

Add to your .github/workflows/ci.yml

steps:
  - name: Check source code
    uses: actions/checkout@v4

  - name: AutoCorrect
    uses: huacnlee/autocorrect-action@main

GitLab CI

Add the following to your .gitlab-ci.yml to run the check with the huacnlee/autocorrect Docker image.

autocorrect:
  stage: build
  image: huacnlee/autocorrect:latest
  script:
    - autocorrect --lint
  # Enable allow_failure if you want.
  # allow_failure: true

Work with ReviewDog

Since: 2.8.0

AutoCorrect can work with reviewdog, so you can use it in CI/CD. reviewdog will post a comment on your PR with the changes AutoCorrect suggests, and the PR author can then easily accept them.

Use the --format rdjson option to output the lint results in a format that reviewdog supports.

autocorrect --lint --format rdjson | reviewdog -f=rdjson -reporter=github-pr-review

The huacnlee/autocorrect-action can help you set up the GitHub Action.

Use for programming

AutoCorrect can be used from many programming languages.

Rust - autocorrect
Ruby - autocorrect-rb
Go - autocorrect-go
Python - autocorrect-py
Node.js - autocorrect-node
JavaScript (Browser) - autocorrect-wasm
Java - autocorrect-java

Benchmark

MacBook Pro (13-inch, Apple M3, 2023)

Use make bench to run benchmark tests.

See autocorrect/src/benches/example.rs for details.

format_050              time:   [4.9991 µs 5.0175 µs 5.0382 µs]
format_100              time:   [8.7714 µs 8.8236 µs 8.8896 µs]
format_400              time:   [23.535 µs 23.591 µs 23.666 µs]
format_html             time:   [332.87 µs 334.00 µs 335.37 µs]
halfwidth_english       time:   [1.2051 µs 1.2079 µs 1.2110 µs]
format_json             time:   [54.019 µs 54.345 µs 54.855 µs]
format_javascript       time:   [176.61 µs 181.64 µs 187.20 µs]
format_json_2k          time:   [9.3245 ms 9.3768 ms 9.4390 ms]
format_jupyter          time:   [200.77 µs 204.93 µs 210.91 µs]
format_markdown         time:   [1.2216 ms 1.2246 ms 1.2283 ms]

spellcheck_50           time:   [1.2098 µs 1.2162 µs 1.2234 µs]
spellcheck_100          time:   [2.2592 µs 2.3049 µs 2.3861 µs]
spellcheck_400          time:   [7.7480 µs 7.9111 µs 8.1764 µs]

lint_markdown           time:   [1.2704 ms 1.2883 ms 1.3173 ms]
lint_json               time:   [58.696 µs 60.847 µs 63.484 µs]
lint_html               time:   [448.53 µs 486.95 µs 534.01 µs]
lint_javascript         time:   [177.00 µs 177.88 µs 178.69 µs]
lint_yaml               time:   [378.35 µs 382.30 µs 387.85 µs]
lint_to_json            time:   [1.2629 ms 1.2689 ms 1.2769 ms]
lint_to_diff            time:   [1.3255 ms 1.3288 ms 1.3327 ms]

Real world benchmark

Tested on the MDN Translated Content project, which has about 30K files.

~/work/translated-content $ autocorrect --fix
AutoCorrect spend time: 8402.538ms
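To put that figure in per-file terms, the arithmetic can be worked through in a few lines of Python. The file count is only given as "about 30K", so the result is a rough estimate (on the order of 0.28 ms per file), not a measured value.

```python
# Figures reported above; the file count is approximate ("about 30K").
total_ms = 8402.538
file_count = 30_000

ms_per_file = total_ms / file_count
files_per_second = file_count / (total_ms / 1000)

print(f"{ms_per_file:.2f} ms per file")
print(f"{files_per_second:.0f} files per second")
```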

Other Extensions

The other implementations from the community.

User cases

License

This project is released under the MIT license.
