autocorrect
A linter and formatter to help you to improve copywriting, correct spaces, words, and punctuations between CJK (Chinese, Japanese, Korean).
Quick Overview
AutoCorrect is a command-line linter and formatter, written in Rust, for text that mixes CJK (Chinese, Japanese, Korean) with English. It adds the missing spaces between CJK and English words, converts punctuation to the appropriate full-width or half-width form, and can optionally correct words against a dictionary you supply.
Pros
- CJK-aware formatting: adds spacing between CJK and English words and corrects punctuation width.
- Source-code aware: supports more than 28 file types and uses an AST parser to check only strings and comments.
- Easy to integrate: lint results can be output as a diff or JSON for CI, and there are editor extensions and an LSP server.
- Cross-platform: runs on Linux, macOS, Windows, and WebAssembly, with SDKs for several programming languages.
Cons
- Narrow scope: it targets mixed CJK and English copy rather than general-purpose spell-checking.
- Experimental spellcheck: word correction is marked experimental and relies on a dictionary you maintain.
- Incomplete editor coverage: the Vim and Emacs plugins and the browser extensions are listed as not yet implemented.
Code Examples
Here are a few examples of the autocorrect command line:
# Format text from stdin
$ echo "hello世界" | autocorrect --stdin
hello 世界
# Fix a file in place
$ autocorrect --fix text.txt
# Lint a file and print the suggested changes
$ autocorrect --lint text.txt
Getting Started
To get started with autocorrect, follow these steps:
- Install it on a Unix-like system:
  $ curl -sSL https://git.io/JcGER | sh
- Check that the command is available:
  $ autocorrect -V
- Create a config file, then lint your project:
  $ autocorrect init
  $ autocorrect --lint
- Refer to the README below for configuration and integration options.
AutoCorrect
🎯 AutoCorrect's vision is to provide a standardized copywriting correction solution that can be applied in all kinds of scenarios (writing books, documentation, content publishing, project source code...), so that users can easily produce and correct standardized, professional copy.
AutoCorrect is a linter and formatter, written in Rust, that helps you improve copywriting by correcting spaces, words, and punctuation in text that mixes CJK (Chinese, Japanese, Korean) with English. It can either fix the text automatically or check it and suggest changes, and it tries to correct punctuation in a safe way.
Like ESLint, RuboCop, and gofmt, AutoCorrect can check source code and output a colorized diff with suggested corrections. You can integrate it into CI (GitLab CI, GitHub Action, Travis CI...) to detect problematic copy in a project and keep its style consistent.
It supports many types of source files: it recognizes the file type from the file name and accurately finds the strings and comments to correct.
The approach first appeared in the Ruby China project in 2013, and its rules have been refined gradually since then. Accuracy is now fairly high (apart from a few rare exceptions), so you can rely on it to help with automatic correction.
Features
- Add spacing between CJK (Chinese, Japanese, Korean) and English words.
- Correct punctuations into full-width near the CJK.
- Correct punctuations into half-width in English content.
- (Experimental) Spellcheck and correct words with your dictionary.
- Lint checking and output diff or JSON result, so you can integrate everywhere (GitLab CI, GitHub Action, VS Code, Vim, Emacs...)
- Allows using .gitignore or .autocorrectignore to ignore files that you want to skip.
- Supports more than 28 file types (Markdown, JSON, YAML, JavaScript, HTML ...), using an AST parser to check only strings and comments.
- LSP server: autocorrect-lsp
- Cross-platform for Linux, macOS, Windows, and WebAssembly, and available as a native SDK for programming (Node.js, JavaScript Browser, Ruby, Python, Java).
Typical use cases
- Writing books and documentation, or publishing news and media content: apply it to Markdown, AsciiDoc, HTML, and similar documents to keep copy standardized and professional (examples: the MDN project, SSPAI).
- Integrating with CI environments such as GitLab CI, GitHub Action, and Travis CI to check a project automatically.
- Integrating with static site generators such as Docusaurus, Hexo, Hugo, Jekyll, and Gatsby to format content automatically at build time.
- Integrating into applications through the language SDKs to format website content when it is stored or rendered, improving site quality (for example: Ruby China, V2EX, Longbridge).
- As a plugin for VS Code and IntelliJ Platform IDEs (already supported) or Vim and Emacs (not yet implemented), checking copy as a linter and formatter and showing hints (Annotator, Diagnostic) based on the LintResult.
- As a browser extension for Chrome, Safari, and others, built on WebAssembly and usable on any website (not yet implemented).
- Embedding in WYSIWYG editors such as ProseMirror, CKEditor, Slate, Draft.js, Tiptap, Monaco Editor, and CodeMirror.
Installation
You can install it on Unix-like systems with:
$ curl -sSL https://git.io/JcGER | sh
After that, the autocorrect command is available.
$ autocorrect -V
AutoCorrect 2.4.0
Or install it from NPM:
$ yarn add autocorrect-node
$ yarn autocorrect -V
Upgrade
Since: 1.9.0
AutoCorrect can upgrade itself with the autocorrect update command.
$ autocorrect update
NOTE: This command asks for your password, because it installs the binary into the /usr/local/bin directory.
Usage
- Use in CLI
- Use in NPM
- Configuration
- VS Code Extension
- Zed extension
- Intellij Platform Plugin
- GitHub Action
- GitLab CI
- Work with ReviewDog
- Use for programming
Use in CLI
$ autocorrect text.txt
你好 Hello 世界
$ echo "hello世界" | autocorrect --stdin
hello 世界
$ autocorrect --fix text.txt
$ autocorrect --fix zh-CN.yml
$ autocorrect --fix
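The spacing fix shown above (`hello世界` becoming `hello 世界`) and the full-width punctuation feature can be approximated with a couple of regular expressions. This is only a minimal sketch to illustrate the idea: the function names `addSpacing` and `toFullwidthPunctuation` are made up for this example, and AutoCorrect's real rules are more thorough than this.

```javascript
// Illustrative only: a regex approximation, not AutoCorrect's actual implementation.
// Character class covering the Han, Hiragana, Katakana and Hangul scripts.
const CJK = "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]";

// Insert a space wherever a CJK character touches an ASCII letter or digit.
function addSpacing(text) {
  return text
    .replace(new RegExp(`(${CJK})([A-Za-z0-9])`, "gu"), "$1 $2")
    .replace(new RegExp(`([A-Za-z0-9])(${CJK})`, "gu"), "$1 $2");
}

// Convert half-width punctuation to full-width when it directly follows CJK text.
const FULLWIDTH = { ",": "，", ".": "。", "!": "！", "?": "？", ":": "：", ";": "；" };
function toFullwidthPunctuation(text) {
  return text.replace(
    new RegExp(`(${CJK})([,.!?:;])`, "gu"),
    (_, cjk, mark) => cjk + FULLWIDTH[mark]
  );
}

console.log(addSpacing("hello世界")); // hello 世界
console.log(toFullwidthPunctuation("你好,世界.")); // 你好，世界。
```

Punctuation that follows English text, as in "Hello, world.", is left untouched by this sketch because the pattern requires a CJK character immediately before the mark.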
Lint
$ autocorrect --lint --format json text.txt
$ autocorrect --lint text.txt
Error: 1, Warning: 0
text.txt:1:3
-你好Hello世界
+你好 Hello 世界
You can also lint multiple files:
$ autocorrect --lint
How to lint all changed files in Git:
$ git diff --name-only | xargs autocorrect --lint
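To make the shape of the diff-style lint report easier to follow, here is a small sketch that produces a similar report. It does not call AutoCorrect: `formatLine` is a hypothetical stand-in formatter, and the column is assumed to be the 1-based position of the first changed character, which happens to match the `text.txt:1:3` example but is not confirmed by the source.

```javascript
// Illustrative only: mimics the shape of the lint report; it does not call AutoCorrect.
// `formatLine` is a hypothetical stand-in that adds spaces between CJK and ASCII words.
function formatLine(line) {
  const cjk = "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]";
  return line
    .replace(new RegExp(`(${cjk})([A-Za-z0-9])`, "gu"), "$1 $2")
    .replace(new RegExp(`([A-Za-z0-9])(${cjk})`, "gu"), "$1 $2");
}

// Compare each line with its formatted version and collect the differences.
function lintText(filename, text) {
  const issues = [];
  text.split("\n").forEach((oldText, index) => {
    const newText = formatLine(oldText);
    if (newText === oldText) return;
    // Assumption: the column is the 1-based position of the first changed character.
    const oldChars = Array.from(oldText);
    const newChars = Array.from(newText);
    let col = 0;
    while (col < oldChars.length && oldChars[col] === newChars[col]) col += 1;
    issues.push({ filename, line: index + 1, col: col + 1, oldText, newText });
  });
  return issues;
}

// Render the issues as a diff-style report.
function renderReport(issues) {
  const header = `Error: ${issues.length}, Warning: 0`;
  const body = issues.map(
    (i) => `${i.filename}:${i.line}:${i.col}\n-${i.oldText}\n+${i.newText}`
  );
  return [header, ...body].join("\n");
}

console.log(renderReport(lintText("text.txt", "你好Hello世界")));
```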
Use in NPM
Since: 2.7.0
AutoCorrect is published on NPM with CLI command support. If you want to use it in a frontend or Node.js project, you can install the autocorrect-node package instead of installing the AutoCorrect binary.
cd your-project
yarn add autocorrect-node
Now you can run the yarn autocorrect command in your project. It behaves the same as the autocorrect command.
$ yarn autocorrect -h
More docs: autocorrect-node/README.md
Configuration
Default config: .autocorrect.default
$ autocorrect init
AutoCorrect init config: .autocorrectrc
NOTE: If the download fails, try the autocorrect init --local command instead.
The .autocorrectrc file has now been created.
.autocorrectrc can be written in YAML or JSON format.
Config file example:
# yaml-language-server: $schema=https://huacnlee.github.io/autocorrect/schema.json
# Config rules
rules:
  # Auto add spacing between CJK (Chinese, Japanese, Korean) and English words.
  # 0 - off, 1 - error, 2 - warning
  space-word: 1
  # Add space between some punctuations.
  space-punctuation: 1
  # Add space between brackets (), [] when near the CJK.
  space-bracket: 1
  # Add space between ``, when near the CJK.
  space-backticks: 1
  # Add space between dash `-`
  space-dash: 0
  # Convert to fullwidth.
  fullwidth: 1
  # Remove space near the fullwidth.
  no-space-fullwidth: 1
  # Fullwidth alphanumeric characters to halfwidth.
  halfwidth-word: 1
  # Fullwidth punctuations to halfwidth in English.
  halfwidth-punctuation: 1
  # Spellcheck
  spellcheck: 2
# Enable or disable in a specific context
context:
  # Enable or disable formatting of code blocks in Markdown, AsciiDoc, etc.
  codeblock: 1
textRules:
  # Config special rules for some texts
  # For example, if we want "Hello你好" to only give a warning, and "Hi你好" to be ignored
  # "Hello你好": 2
  # "Hi你好": 0
fileTypes:
  # Config the file associations; your config has higher priority than the default.
  # "rb": ruby
  # "Rakefile": ruby
  # "*.js": javascript
  # ".mdx": markdown
spellcheck:
  # Correct words (case insensitive) used by Spellcheck
  words:
    - GitHub
    - App Store
    # This means "appstore" into "App Store"
    - AppStore = App Store
    - Git
    - Node.js
    - nodejs = Node.js
    - VIM
    - DNS
    - HTTP
    - SSL
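The `spellcheck.words` entries above can be read as a small replacement dictionary. The sketch below shows one possible interpretation, under the assumption that `A = B` maps `A` to `B` and that a bare entry maps any casing of itself to the listed form. The helper names (`buildDictionary`, `correctWords`) are invented for this example and are not part of AutoCorrect.

```javascript
// Illustrative only: one possible reading of the `spellcheck.words` entries.
// Assumption: "A = B" maps A to B; a bare entry maps any casing of itself to the listed form.
function buildDictionary(words) {
  const dict = new Map();
  for (const entry of words) {
    const [from, to] = entry.includes("=")
      ? entry.split("=").map((s) => s.trim())
      : [entry, entry];
    dict.set(from.toLowerCase(), to);
  }
  return dict;
}

// Escape characters with a special meaning in regular expressions (e.g. the dot in "Node.js").
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every dictionary word, matched case-insensitively on word boundaries.
function correctWords(text, dict) {
  const keys = [...dict.keys()].sort((a, b) => b.length - a.length).map(escapeRegExp);
  if (keys.length === 0) return text;
  const pattern = new RegExp(`\\b(${keys.join("|")})\\b`, "gi");
  return text.replace(pattern, (match) => dict.get(match.toLowerCase()));
}

const dict = buildDictionary([
  "GitHub",
  "App Store",
  "AppStore = App Store",
  "Node.js",
  "nodejs = Node.js",
]);
console.log(correctWords("Install nodejs, then push to github.", dict));
// Install Node.js, then push to GitHub.
```

Sorting the keys by length before building the pattern makes longer entries win when two entries could match at the same position.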
Ignore option
Since: 2.2.0
To make AutoCorrect ignore specific words or texts when formatting or linting, use the textRules config.
For example, suppose we want:
- Hello世界 - to just give a warning.
- Hi你好 - to be ignored.
You can configure:
textRules:
  Hello世界: 2
  Hi你好: 0
After that, AutoCorrect will follow your textRules when processing.
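As a rough illustration of how the 0/1/2 levels and the textRules overrides interact, the sketch below resolves a level for a piece of text. The precedence shown (a matching textRules entry wins over the rule's own level) is an assumption made for this example, and `resolveLevel` is a hypothetical helper, not an AutoCorrect API.

```javascript
// Illustrative only: how the 0/1/2 levels and `textRules` overrides might be resolved.
const LEVELS = { 0: "off", 1: "error", 2: "warning" };

// Hypothetical config object mirroring the YAML config format.
const config = {
  rules: { "space-word": 1 },
  textRules: { "Hello世界": 2, "Hi你好": 0 },
};

// Assumption: a textRules entry found in the text overrides the rule's own level.
function resolveLevel(text, ruleName, cfg) {
  for (const [needle, level] of Object.entries(cfg.textRules || {})) {
    if (text.includes(needle)) return LEVELS[level];
  }
  const level = ruleName in cfg.rules ? cfg.rules[ruleName] : 0;
  return LEVELS[level];
}

console.log(resolveLevel("Hello世界", "space-word", config)); // warning
console.log(resolveLevel("Hi你好", "space-word", config)); // off
console.log(resolveLevel("你好Hello", "space-word", config)); // error
```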
Ignore files
Use .autocorrectignore to ignore files
Sometimes you may want to skip files that should not be checked.
By default, files matched by .gitignore rules are ignored.
You can also use .autocorrectignore to ignore other files; its format is the same as .gitignore.
Disable by inline comment
If you just want to disable AutoCorrect for some lines in a file, write a comment containing autocorrect-disable. When AutoCorrect finds that comment, it is disabled temporarily.
You can then use autocorrect-enable to turn it back on.
For example, in JavaScript:
function hello() {
  // autocorrect-disable
  console.log("现在这行开始autocorrect会暂时禁用");
  console.log("这行也是disable的状态");
  // autocorrect-enable
  let a = "现在起autocorrect回到了启用的状态";
}
The output will be:
function hello() {
  // autocorrect-disable
  console.log("现在这行开始autocorrect会暂时禁用");
  console.log("这行也是disable的状态");
  // autocorrect-enable
  let a = "现在起 autocorrect 回到了启用的状态";
}
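The disable and enable comments behave like a switch that is flipped while the file is scanned. The following is a simplified, line-based sketch of that idea. AutoCorrect itself works on parsed strings and comments, so this is not how the tool is implemented, and `formatLine` and `formatWithToggles` are hypothetical names used only here.

```javascript
// Illustrative only: a line-based approximation of the disable/enable comments.
// Hypothetical stand-in formatter: adds spaces between CJK and ASCII letters or digits.
function formatLine(line) {
  const cjk = "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]";
  return line
    .replace(new RegExp(`(${cjk})([A-Za-z0-9])`, "gu"), "$1 $2")
    .replace(new RegExp(`([A-Za-z0-9])(${cjk})`, "gu"), "$1 $2");
}

// Walk the lines in order, flipping a flag whenever a directive comment is seen.
function formatWithToggles(source) {
  let enabled = true;
  return source
    .split("\n")
    .map((line) => {
      if (line.includes("autocorrect-disable")) {
        enabled = false;
        return line; // directive lines are passed through unchanged
      }
      if (line.includes("autocorrect-enable")) {
        enabled = true;
        return line;
      }
      return enabled ? formatLine(line) : line;
    })
    .join("\n");
}

const sample = [
  "// autocorrect-disable",
  'let a = "这行也是disable的状态";',
  "// autocorrect-enable",
  'let b = "现在起autocorrect回到了启用的状态";',
].join("\n");
console.log(formatWithToggles(sample));
```

Only the last line of the sample is changed, because it is the only string that appears after the enable comment.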
Disable some rules
Since: 2.0
You can use autocorrect-disable <rule> in a comment to disable specific rules.
For the rule names, see: Configuration
function hello() {
  // autocorrect-disable space-word
  console.log("现在这行开始autocorrect会暂时禁用.");
  // autocorrect-disable fullwidth, space-word
  console.log("这行也是disable的状态.");
  // autocorrect-enable
  let a = "现在起autocorrect回到了启用的状态.";
}
The result will be:
function hello() {
  // autocorrect-disable space-word
  console.log("现在这行开始autocorrect会暂时禁用。");
  // autocorrect-disable fullwidth, space-word
  console.log("这行也是disable的状态.");
  // autocorrect-enable
  let a = "现在起 autocorrect 回到了启用的状态。";
}
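Rule-specific disabling can be pictured as a set of switched-off rule names that is consulted before each rule runs. The sketch below uses two toy rules and assumes that disabled rules accumulate until autocorrect-enable clears them; the source does not confirm that behavior, and none of these names are AutoCorrect APIs.

```javascript
// Illustrative only: two toy rules keyed by rule names from the configuration.
const CJK = "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]";
const RULES = {
  // Add spaces between CJK characters and ASCII letters or digits.
  "space-word": (line) =>
    line
      .replace(new RegExp(`(${CJK})([A-Za-z0-9])`, "gu"), "$1 $2")
      .replace(new RegExp(`([A-Za-z0-9])(${CJK})`, "gu"), "$1 $2"),
  // Turn a half-width period that follows CJK text into a full-width one.
  fullwidth: (line) => line.replace(new RegExp(`(${CJK})\\.`, "gu"), "$1。"),
};

// Assumption: disabled rules accumulate until `autocorrect-enable` clears them all.
function formatWithRuleToggles(source) {
  const disabled = new Set();
  return source
    .split("\n")
    .map((line) => {
      const directive = line.match(/autocorrect-(disable|enable)\s*([\w\-, ]*)/);
      if (directive) {
        const names = directive[2].split(",").map((s) => s.trim()).filter(Boolean);
        if (directive[1] === "enable") disabled.clear();
        else if (names.length === 0) Object.keys(RULES).forEach((n) => disabled.add(n));
        else names.forEach((n) => disabled.add(n));
        return line; // directive lines are passed through unchanged
      }
      // Apply every rule that is not currently disabled.
      return Object.entries(RULES).reduce(
        (out, [name, rule]) => (disabled.has(name) ? out : rule(out)),
        line
      );
    })
    .join("\n");
}

const sample = [
  "// autocorrect-disable space-word",
  'let a = "现在这行开始autocorrect会暂时禁用.";',
  "// autocorrect-enable",
  'let b = "现在起autocorrect回到了启用的状态.";',
].join("\n");
console.log(formatWithRuleToggles(sample));
```

In the sample, the first string only gets its period converted, while the second string gets both the spacing and the punctuation fix.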
VS Code Extension
https://marketplace.visualstudio.com/items?itemName=huacnlee.autocorrect
Intellij Platform Plugin
https://github.com/huacnlee/autocorrect-idea-plugin
GitHub Action
https://github.com/huacnlee/autocorrect-action
Add to your .github/workflows/ci.yml:
steps:
  - name: Check source code
    uses: actions/checkout@v4
  - name: AutoCorrect
    uses: huacnlee/autocorrect-action@main
GitLab CI
Add the following to your .gitlab-ci.yml to run the check with the huacnlee/autocorrect Docker image.
autocorrect:
  stage: build
  image: huacnlee/autocorrect:latest
  script:
    - autocorrect --lint
  # Enable allow_failure if you want.
  # allow_failure: true
Work with ReviewDog
Since: 2.8.0
AutoCorrect can work with reviewdog, so you can use it in CI/CD. ReviewDog will post a comment on your PR with AutoCorrect's suggested changes, which the PR author can then easily accept.
Use the --format rdjson option to output the lint results in the format reviewdog supports.
autocorrect --lint --format rdjson | reviewdog -f=rdjson -reporter=github-pr-review
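For readers unfamiliar with rdjson, the sketch below builds a minimal document in reviewdog's diagnostic format from a list of lint results. The input structure (`path`, `line`, `severity`, `newText`) is a hypothetical shape chosen for this example and is not AutoCorrect's real output. Columns and suggestions are left out to keep the sketch small.

```javascript
// Illustrative only: builds a document in reviewdog's rdjson shape.
// The `results` structure is an assumption, not AutoCorrect's real output.
function toRdjson(results) {
  return {
    source: { name: "AutoCorrect" },
    diagnostics: results.map((r) => ({
      message: `Suggested: ${r.newText}`,
      location: {
        path: r.path,
        range: { start: { line: r.line } },
      },
      // Reuse the config levels: 2 is a warning, anything else is reported as an error.
      severity: r.severity === 2 ? "WARNING" : "ERROR",
    })),
  };
}

const rdjson = toRdjson([
  { path: "text.txt", line: 1, severity: 1, newText: "你好 Hello 世界" },
]);
console.log(JSON.stringify(rdjson, null, 2));
```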
huacnlee/autocorrect-action can help you set up the GitHub Action.
Use for programming
AutoCorrect can be used from many programming languages.
- Rust - autocorrect
- Ruby - autocorrect-rb
- Go - autocorrect-go
- Python - autocorrect-py
- Node.js - autocorrect-node
- JavaScript (Browser) - autocorrect-wasm
- Java - autocorrect-java
Benchmark
MacBook Pro (13-inch, Apple M3, 2023)
Use make bench to run the benchmark tests.
See autocorrect/src/benches/example.rs for details.
format_050 time: [4.9991 µs 5.0175 µs 5.0382 µs]
format_100 time: [8.7714 µs 8.8236 µs 8.8896 µs]
format_400 time: [23.535 µs 23.591 µs 23.666 µs]
format_html time: [332.87 µs 334.00 µs 335.37 µs]
halfwidth_english time: [1.2051 µs 1.2079 µs 1.2110 µs]
format_json time: [54.019 µs 54.345 µs 54.855 µs]
format_javascript time: [176.61 µs 181.64 µs 187.20 µs]
format_json_2k time: [9.3245 ms 9.3768 ms 9.4390 ms]
format_jupyter time: [200.77 µs 204.93 µs 210.91 µs]
format_markdown time: [1.2216 ms 1.2246 ms 1.2283 ms]
spellcheck_50 time: [1.2098 µs 1.2162 µs 1.2234 µs]
spellcheck_100 time: [2.2592 µs 2.3049 µs 2.3861 µs]
spellcheck_400 time: [7.7480 µs 7.9111 µs 8.1764 µs]
lint_markdown time: [1.2704 ms 1.2883 ms 1.3173 ms]
lint_json time: [58.696 µs 60.847 µs 63.484 µs]
lint_html time: [448.53 µs 486.95 µs 534.01 µs]
lint_javascript time: [177.00 µs 177.88 µs 178.69 µs]
lint_yaml time: [378.35 µs 382.30 µs 387.85 µs]
lint_to_json time: [1.2629 ms 1.2689 ms 1.2769 ms]
lint_to_diff time: [1.3255 ms 1.3288 ms 1.3327 ms]
Real world benchmark
Run against the MDN Translated Content project, which has about 30K files:
~/work/translated-content $ autocorrect --fix
AutoCorrect spend time: 8402.538ms
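To put the run above in perspective, the figures can be converted into a per-file cost and a throughput. Because the file count is only given as "about 30K", the results below are rough estimates rather than measured values.

```javascript
// Figures quoted in the benchmark: about 30K files formatted in 8402.538 ms.
const files = 30000; // "about 30K", so treat the results as estimates
const elapsedMs = 8402.538;

const msPerFile = elapsedMs / files;
const filesPerSecond = files / (elapsedMs / 1000);

console.log(`${msPerFile.toFixed(2)} ms per file`); // 0.28 ms per file
console.log(`${Math.round(filesPerSecond)} files per second`); // 3570 files per second
```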
Other Extensions
The other implementations from the community.
User cases
License
This project is released under the MIT license.