Quick Overview
Codespell is a command-line tool that helps identify and fix common misspellings in text files, including source code, documentation, and other types of text-based content. It can be used to improve the quality and consistency of written materials, particularly in open-source projects and collaborative environments.
Pros
- Comprehensive Dictionary: Codespell comes with a large dictionary of common misspellings, which covers a wide range of technical and non-technical terms.
- Customizable Configuration: Users can easily customize the dictionary and configuration settings to suit their specific needs, such as adding or removing words, or adjusting the sensitivity of the spell-checking process.
- Integrates with Git: Codespell can be integrated with Git repositories, allowing developers to automatically check for and fix misspellings during the commit process.
- Supports Multiple File Types: Codespell can handle a variety of file types, including source code, Markdown, and plain text files, making it a versatile tool for various projects.
Cons
- Limited Language Support: Codespell's dictionaries are derived from English-language misspelling lists, so coverage of text written in other languages is limited.
- Potential False Positives: Depending on the context and the specific configuration, Codespell may sometimes identify valid words or technical terms as misspellings, leading to false positives.
- Requires Manual Review: While Codespell can automatically fix many misspellings, some cases may require manual review and intervention to ensure the correct spelling is used.
- Dependency on a Fixed Misspelling List: Codespell only flags words that appear in its dictionaries of known misspellings, so typos that are missing from those lists go undetected.
Code Examples
N/A (Codespell is not a code library)
Getting Started
To get started with Codespell, follow these steps:
- Install Codespell using pip:
pip install codespell
- Run Codespell on a file or directory:
codespell path/to/file.txt
This will scan the file for misspellings and display the suggested corrections.
- To fix the misspellings automatically, use the -w or --write-changes option:
codespell -w path/to/file.txt
This will update the file with the corrected spellings. The -i (--interactive) option is a separate flag that takes an interactivity level.
- You can also run Codespell on an entire directory:
codespell path/to/directory
This will scan all the files in the directory and its subdirectories for misspellings.
- To customize the dictionary or configuration, create a
.codespellrc
file in the root directory of your project. The file should contain the desired settings, such as:
[codespell]
ignore-words-list = myword1,myword2
This will tell Codespell to ignore the specified words during the spell-checking process.
For more advanced usage and configuration options, refer to the Codespell documentation.
Competitor Comparisons
The most popular spellchecking library.
Pros of Hunspell
- Hunspell is a widely used spell-checking library, with support for over 100 languages.
- It has a robust and well-documented API, making it easy to integrate into various applications.
- Hunspell is actively maintained and has a large community of contributors.
Cons of Hunspell
- Hunspell's codebase is larger and more complex than Codespell, which may make it more difficult to understand and contribute to.
- Hunspell's focus on supporting a wide range of languages may make it less optimized for specific use cases, such as code spell-checking.
Code Comparison
Codespell (illustrative stub):
def check_word(self, word, suggestions=True):
    """
    Check a single word for spelling errors.

    Args:
        word (str): The word to check.
        suggestions (bool, optional): Whether to return suggested corrections.
            Defaults to True.

    Returns:
        tuple: A tuple containing a boolean indicating whether the word is
            misspelled, and a list of suggested corrections (if any).
    """
    # Code to check the word and return suggestions
    pass
Hunspell:
int Hunspell::spell(const char *word)
{
    int captype = get_captype(word, strlen(word));
    struct hentry * he = lookup(word);
    if (he) {
        if (HUNSPELL_STEM(he)) return 1;
        if (HUNSPELL_MORPH(he)) return 1;
        if (captype == NOCAP) return 1;
        if (captype == INITCAP && HUNSPELL_KEEPCASE(he)) return 0;
        if (captype == ALLCAP && HUNSPELL_KEEPCASE(he)) return 0;
        return 1;
    }
    return 0;
}
check code for common misspellings
Pros of codespell-project/codespell
- Actively maintained with regular updates and bug fixes
- Supports a wide range of file types, including code, documentation, and configuration files
- Provides a comprehensive list of common misspellings to check for
Cons of codespell-project/codespell
- May not catch all potential misspellings, especially domain-specific or context-dependent ones
- Can produce false positives in some cases, requiring manual review
- Requires installation and setup, which may not be suitable for all users
Code Comparison
This entry compares codespell-project/codespell with itself, so there is only one snippet to show. It is an illustrative stub rather than a verbatim excerpt from the repository:
codespell-project/codespell:
def check_file(filename, quiet=False, ignore_words_file=None, ignore_words=None,
               check_filenames=False, dictionary=None, regex=None, exclude_file=None,
               exclude_list=None, num_threads=None, use_builtin_dict=False,
               write_changes=False, interactive=False, config_file=None,
               color=True, **kwargs):
    """
    Check the given file for spelling mistakes.
    ...
    """
    # Code implementation
    pass
Static Type Checker for Python
Pros of Pyright
- Pyright is a static type checker for Python, which can help catch type-related errors early in the development process.
- Pyright supports a wide range of Python features, including type annotations, type inference, and support for popular Python libraries.
- Pyright is actively maintained by Microsoft and has a large and growing community of contributors.
Cons of Pyright
- Pyright is a relatively new tool, and may not have the same level of maturity and feature set as some other Python linting and type-checking tools.
- Pyright is primarily focused on type-checking, and may not provide the same level of functionality as more comprehensive linting tools.
Code Comparison
Codespell-project/codespell (simplified sketch of the command-line interface, limited to flags documented in the README further down this page):
import argparse

def main():
    parser = argparse.ArgumentParser(description='Fix common misspellings in text files.')
    parser.add_argument('files', nargs='*', help='Files or directories to check')
    parser.add_argument('-w', '--write-changes', action='store_true', help='Write the suggested changes to the files')
    parser.add_argument('-i', '--interactive', type=int, help='Interactive mode level')
    parser.add_argument('-q', '--quiet-level', type=int, help='Quiet level')
    parser.add_argument('-S', '--skip', action='append', help='Comma-separated files or globs to skip')
    parser.add_argument('-d', '--disable-colors', action='store_true', help='Disable terminal colors')
    parser.add_argument('-L', '--ignore-words-list', action='append', help='Comma-separated words to ignore')
    parser.add_argument('-D', '--dictionary', action='append', help='Use custom dictionary file')
    args = parser.parse_args()
    # ...
Microsoft/Pyright (illustrative Python sketch; Pyright itself is implemented in TypeScript):
def check_type_compatibility(
    source_type: Type[Any],
    target_type: Type[Any],
    *,
    is_assignment: bool = False,
    is_return: bool = False,
    is_parameter: bool = False,
    is_call_argument: bool = False,
    is_subscript_index: bool = False,
    is_await: bool = False,
    is_yield: bool = False,
    is_union_member: bool = False,
    ignore_callable_params: bool = False,
    ignore_inferred_optional: bool = False,
    ignore_protocol_members: bool = False,
    ignore_type_params: bool = False,
    ignore_return_type: bool = False,
    ignore_bases: bool = False,
    ignore_class_type_args: bool = False,
    ignore_unbound_type_params: bool = False,
    ignore_contravariance: bool = False,
    ignore_covariance: bool = False,
    ignore_unsatisfied_type_parameters: bool = False,
    ignore_missing_type_arguments: bool = False,
    ignore_inferred_type_args: bool = False,
    ignore_protocol_type_args: bool = False,
) -> TypeCheckResult:
    # ...
README
codespell
Fix common misspellings in text files. It's designed primarily for checking misspelled words in source code (backslash escapes are skipped), but it can be used with other files as well. It does not check for word membership in a complete dictionary, but instead looks for a set of common misspellings. Therefore it should catch errors like "adn", but it will not catch "adnasdfasdf". This also means it shouldn't generate false-positives when you use a niche term it doesn't know about.
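The lookup-based approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not codespell's implementation: the three-entry `MISSPELLINGS` table, the `WORD_RE` pattern, and the `find_misspellings` helper are all hypothetical names made up for this example.

```python
import re

# Hypothetical three-entry table; the real dictionaries are far larger.
MISSPELLINGS = {"adn": "and", "teh": "the", "recieve": "receive"}

# Assumed tokenizer for this sketch: runs of letters and apostrophes.
WORD_RE = re.compile(r"[A-Za-z']+")


def find_misspellings(text):
    """Return (word, suggestion) pairs for words that appear in the table."""
    hits = []
    for word in WORD_RE.findall(text):
        suggestion = MISSPELLINGS.get(word.lower())
        if suggestion is not None:
            hits.append((word, suggestion))
    return hits


# A known misspelling is reported; an unknown string is simply not in the table.
print(find_misspellings("salt adn pepper"))
print(find_misspellings("salt adnasdfasdf pepper"))
```

Because the check is a table lookup rather than a test against a full word list, an unfamiliar technical term is treated the same way as `adnasdfasdf`: it is left alone.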
Useful links
- `GitHub project <https://github.com/codespell-project/codespell>`_
- `Repository <https://github.com/codespell-project/codespell>`_
- `Releases <https://github.com/codespell-project/codespell/releases>`_
Requirements
Python 3.8 or above.
Installation
You can use ``pip`` to install codespell with e.g.:

.. code-block:: sh

    pip install codespell
Usage
Below are some simple usage examples to demonstrate how the tool works.
For exhaustive usage information, please check the output of ``codespell -h``.

Run codespell in all files of the current directory:

.. code-block:: sh

    codespell

Run codespell in specific files or directories (specified via their names or glob patterns):

.. code-block:: sh

    codespell some_file some_dir/ *.ext

Some noteworthy flags:

.. code-block:: sh

    codespell -w, --write-changes

The ``-w`` flag will actually implement the changes recommended by codespell. Running without the ``-w`` flag is the same as doing a dry run. It is recommended to run this with the ``-i`` or ``--interactive`` flag.

.. code-block:: sh

    codespell -I FILE, --ignore-words=FILE

The ``-I`` flag points to a file of words that are in the codespell dictionaries but should be allowed. The format of the file is one word per line. Invoke using ``codespell -I path/to/file.txt`` to run codespell with that list of allowed words. See `Ignoring Words`_ for more details.

.. code-block:: sh

    codespell -L word1,word2,word3,word4

The ``-L`` flag allows the comma-separated words placed immediately after it. See `Ignoring Words`_ for more details.

.. code-block:: sh

    codespell -x FILE, --exclude-file=FILE

Ignore whole lines that match those in ``FILE``. The lines in ``FILE`` should match the to-be-excluded lines exactly.
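To make "match exactly" concrete, here is a minimal Python sketch of whole-line exclusion. It is an assumption-laden illustration, not codespell's code: `drop_excluded` is a hypothetical helper, and lines are compared as plain strings without their trailing newlines.

```python
def drop_excluded(lines, exclude_lines):
    """Keep only the lines that do not appear verbatim in exclude_lines."""
    excluded = set(exclude_lines)
    return [line for line in lines if line not in excluded]


source = [
    "int teh_counter = 0;",
    "    int teh_counter = 0;",  # same text, but indented
    "return 0;",
]
exclude = ["int teh_counter = 0;"]

# Only the first line is dropped; the indented copy differs, so it stays.
remaining = drop_excluded(source, exclude)
print(remaining)
```

In this sketch a difference in leading whitespace is enough to stop a line from being excluded, which is why copying the line verbatim into the exclude file matters.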
.. code-block:: sh

    codespell -S, --skip=

Comma-separated list of files to skip. It accepts globs as well. Examples:

- to skip .eps & .txt files, invoke ``codespell --skip="*.eps,*.txt"``
- to skip directories, invoke ``codespell --skip="./src/3rd-Party,./src/Test"``
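The glob behaviour of a comma-separated skip list can be approximated with Python's standard `fnmatch` module. This is a rough sketch under stated assumptions: `is_skipped` is a hypothetical helper, and whether codespell matches against the full path, the base name, or both is not shown here.

```python
from fnmatch import fnmatch


def is_skipped(path, skip_option):
    """Return True if path matches any glob in a comma-separated skip list."""
    patterns = [pattern for pattern in skip_option.split(",") if pattern]
    return any(fnmatch(path, pattern) for pattern in patterns)


print(is_skipped("figure.eps", "*.eps,*.txt"))
print(is_skipped("main.c", "*.eps,*.txt"))
print(is_skipped("./src/Test", "./src/3rd-Party,./src/Test"))
```

Quoting the value on the command line, as in the examples above, keeps the shell from expanding the globs before codespell sees them.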
Useful commands:

.. code-block:: sh

    codespell -d -q 3 --skip="*.po,*.ts,./src/3rdParty,./src/Test"

List all typos found except translation files and some directories. Display them without terminal colors and with a quiet level of 3.

.. code-block:: sh

    codespell -i 3 -w

Run interactive mode level 3 and write changes to file.

We ship a collection of dictionaries that are an improved version of the one available
`on Wikipedia <https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines>`_
after applying them in projects like Linux Kernel, EFL, oFono among others.
You can provide your own version of the dictionary, but patches for
new/different entries are very welcome.

Want to know if a word you're proposing exists in codespell already? It is possible to test a word against the current set of dictionaries that exist in ``codespell_lib/data/dictionary*.txt`` via:

.. code-block:: sh

    echo "word" | codespell -
    echo "1stword,2ndword" | codespell -

You can select the optional dictionaries with the ``--builtin`` option.
Ignoring words
When ignoring false positives, note that spelling errors are case-insensitive but words to ignore are case-sensitive. For example, the dictionary entry ``wrod`` will also match the typo ``Wrod``, but to ignore it you must pass ``wrod``.
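One way to read the rule above is sketched below. This is an interpretation for illustration only and may not match every codespell version: `is_reported` is a hypothetical helper, typos are lowercased before lookup, and the ignore list is compared verbatim against the dictionary entry.

```python
def is_reported(word, misspellings, ignore_words):
    """Decide whether a word would be flagged under this reading of the rule."""
    entry = word.lower()  # typos are matched case-insensitively
    if entry not in misspellings:
        return False
    # The ignore list is compared verbatim (case-sensitively) with the entry.
    return entry not in ignore_words


misspellings = {"wrod": "word"}

print(is_reported("Wrod", misspellings, set()))     # flagged
print(is_reported("Wrod", misspellings, {"wrod"}))  # ignored
print(is_reported("Wrod", misspellings, {"Wrod"}))  # still flagged in this sketch
```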
The words to ignore can be passed in two ways:

- ``-I``: A file with a word per line to ignore:

  .. code-block:: sh

      codespell -I FILE, --ignore-words=FILE

- ``-L``: A comma separated list of words to ignore on the command line:

  .. code-block:: sh

      codespell -L word1,word2,word3,word4
Inline ignore
Some situations might require ignoring a specific word in a specific location. This can be achieved by adding a comment in the source code.
You can either ignore a single word or a list of words. The comment should be in the format of ``codespell:ignore <words>``.
Words should be separated by a comma.

- ignore specific word:

  .. code-block:: python

      def wrod():  # codespell:ignore wrod
          pass

- ignore multiple words:

  .. code-block:: python

      def wrod(wrods):  # codespell:ignore
          pass
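The comment format can be illustrated with a small parser. The regular expression below is written for this sketch and is not copied from codespell's source; `IGNORE_RE` and `inline_ignores` are hypothetical names.

```python
import re

# Hypothetical pattern: the marker, then an optional comma-separated word list.
IGNORE_RE = re.compile(r"codespell:ignore\b[ \t]*(?P<words>[\w,]*)")


def inline_ignores(line):
    """Return None without a marker, else the set of words listed after it."""
    match = IGNORE_RE.search(line)
    if match is None:
        return None
    return {word for word in match.group("words").split(",") if word}


print(inline_ignores("def wrod():  # codespell:ignore wrod"))
print(inline_ignores("def wrod(wrods):  # codespell:ignore wrod,wrods"))
print(inline_ignores("def wrod(wrods):  # codespell:ignore"))
print(inline_ignores("x = 1"))
```

A bare marker with no words after it, as in the second README example, comes back from this sketch as an empty set, which lets a caller tell it apart from a line with no marker at all.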
Using a config file
Command line options can also be specified in a config file.

When running ``codespell``, it will check in the current directory for a file
named ``setup.cfg`` or ``.codespellrc`` (or a file specified via ``--config``),
containing an entry named ``[codespell]``. Each command line argument can
be specified in this file (without the preceding dashes), for example:

.. code-block:: ini

    [codespell]
    skip = *.po,*.ts,./src/3rdParty,./src/Test
    count =
    quiet-level = 3

The ``.codespellrc`` file is an `INI file <https://en.wikipedia.org/wiki/INI_file>`_,
which is read using Python's
`configparser <https://docs.python.org/3/library/configparser.html#supported-ini-file-structure>`__.
For example, comments are possible using ``;`` or ``#`` as the first character.
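To see how such a file looks after parsing, the sketch below feeds the example to Python's standard ``configparser``. It only demonstrates the INI parsing step; the ``SAMPLE`` text and the ``read_codespell_section`` helper are made up for illustration, and how codespell interprets the resulting values is not covered.

```python
import configparser

SAMPLE = """\
# comments may start with '#'
; or with ';'
[codespell]
skip = *.po,*.ts,./src/3rdParty,./src/Test
count =
quiet-level = 3
"""


def read_codespell_section(text):
    """Parse INI text and return the [codespell] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["codespell"])


options = read_codespell_section(SAMPLE)
for name, value in options.items():
    print(f"{name!r}: {value!r}")
```

Every value arrives as a string, and a bare ``count =`` yields an empty string, so any conversion to numbers or booleans has to happen after parsing.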
Values in an INI file entry cannot start with a ``-`` character, so if you need to do this,
structure your entries like this:

.. code-block:: ini

    [codespell]
    dictionary = mydict,-
    ignore-words = bar,-foo

instead of these invalid entries:

.. code-block:: ini

    [codespell]
    dictionary = -,mydict
    ignore-words = -foo,bar
Codespell will also check in the current directory for a ``pyproject.toml``
(or a path can be specified via ``--toml <filename>``) file, and the
``[tool.codespell]`` entry will be used, but only if the tomli_ package
is installed for versions of Python prior to 3.11. For example:

.. code-block:: toml

    [tool.codespell]
    skip = '*.po,*.ts,./src/3rdParty,./src/Test'
    count = true
    quiet-level = 3

These are both equivalent to running:

.. code-block:: sh

    codespell --quiet-level 3 --count --skip "*.po,*.ts,./src/3rdParty,./src/Test"
If several config files are present, they are read in the following order:

#. ``pyproject.toml`` (on Python versions prior to 3.11, only if the ``tomli`` library is available)
#. ``setup.cfg``
#. ``.codespellrc``
#. any additional file supplied via ``--config``

If a codespell configuration is supplied in several of these files, the configuration from the most recently read file overwrites previously specified configurations.
Any options specified in the command line will override options from the config files.
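The precedence rules above amount to a last-writer-wins merge. The following sketch models each source as a plain dictionary; the option values and the ``merge_options`` helper are invented for illustration and do not come from codespell.

```python
def merge_options(*sources):
    """Merge option dicts in read order; later sources override earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


# Hypothetical contents of each source, listed in the order they are read.
pyproject = {"quiet-level": "2", "skip": "*.po"}
setup_cfg = {"quiet-level": "3"}
codespellrc = {"skip": "*.po,*.ts"}
extra_config = {}                    # file passed via --config, if any
command_line = {"quiet-level": "0"}  # applied last, so it always wins

effective = merge_options(pyproject, setup_cfg, codespellrc, extra_config, command_line)
print(effective)
```

Here ``skip`` ends up with the value from ``.codespellrc`` and ``quiet-level`` with the value from the command line, because those were the last sources to set each option.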
.. _tomli: https://pypi.org/project/tomli/
pre-commit hook
codespell also works with `pre-commit <https://pre-commit.com/>`_, using

.. code-block:: yaml

    - repo: https://github.com/codespell-project/codespell
      rev: v2.2.4
      hooks:
      - id: codespell

If codespell is configured through the ``pyproject.toml`` file, use this instead:

.. code-block:: yaml

    - repo: https://github.com/codespell-project/codespell
      rev: v2.2.4
      hooks:
      - id: codespell
        additional_dependencies:
          - tomli
Dictionary format
The format of the dictionaries was influenced by the one they originally came from, i.e. from Wikipedia. The difference is how multiple options are treated and that the last argument is an optional reason why a certain entry could not be applied directly, but should instead be manually inspected. E.g.:

- Simple entry: one wrong word / one suggestion::

      calulated->calculated

- Entry with more than one suggested fix::

      fiel->feel, field, file, phial,

  Note the last comma! You need to use it, otherwise the last suggestion will be discarded (see below for why). When there is more than one suggestion, an automatic fix is not possible and the best we can do is to give the user the file and line where the error occurred as well as the suggestions.

- Entry with one word, but with automatic fix disabled::

      clas->class, disabled because of name clash in c++

  Note that there isn't a comma at the end of the line. The last argument is treated as the reason why a suggestion cannot be automatically applied.

  There can also be multiple suggestions but any automatic fix will again be disabled::

      clas->class, clash, disabled because of name clash in c++
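The entry rules above can be expressed as a short parser. This is a sketch written from the description in this section, not codespell's own parsing code; ``Entry`` and ``parse_entry`` are hypothetical names.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    typo: str
    suggestions: list
    auto_fix: bool
    reason: str


def parse_entry(line):
    """Parse one 'typo->suggestions' dictionary line as described above."""
    typo, _, rest = line.strip().partition("->")
    head, comma, tail = rest.rpartition(",")
    if not comma:
        # No comma at all: a single suggestion that can be applied automatically.
        return Entry(typo, [rest.strip()], True, "")
    # Text after the last comma is the reason (empty when the line ends with a comma).
    suggestions = [item.strip() for item in head.split(",")]
    return Entry(typo, suggestions, False, tail.strip())


for sample in (
    "calulated->calculated",
    "fiel->feel, field, file, phial,",
    "clas->class, disabled because of name clash in c++",
    "clas->class, clash, disabled because of name clash in c++",
):
    print(parse_entry(sample))
```

Splitting on the last comma is what makes the trailing comma significant: without it, the final suggestion would be taken as the reason and dropped from the list.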
Development setup
As suggested in the `Python Packaging User Guide`_, ensure ``pip``, ``setuptools``, and ``wheel`` are up to date before installing from source. Specifically you will need recent versions of ``setuptools`` and ``setuptools_scm``:

.. code-block:: sh

    pip install --upgrade pip setuptools setuptools_scm wheel

You can install required dependencies for development by running the following within a checkout of the codespell source:

.. code-block:: sh

    pip install -e ".[dev]"

To run tests against the codebase run:

.. code-block:: sh

    make check

.. _Python Packaging User Guide: https://packaging.python.org/en/latest/tutorials/installing-packages/#requirements-for-installing-packages
Sending pull requests
If you have a suggested typo that you'd like to see merged please follow these steps:

#. Make sure you read the instructions mentioned in the ``Dictionary format`` section above to submit correctly formatted entries.

#. Choose the correct dictionary file to add your typo to. See ``codespell --help`` for explanations of the different dictionaries.

#. Sort the dictionaries. This is done by invoking (in the top level directory of ``codespell/``):

   .. code-block:: sh

       make check-dictionaries

   If the make script finds that you need to sort a dictionary, please then run:

   .. code-block:: sh

       make sort-dictionaries

#. Only after this process is complete do we recommend you submit the PR.

Important Notes:

- If the dictionaries are submitted without being pre-sorted the PR will fail via our various CI tools.
- Not all PRs will be merged. This is pending on the discretion of the devs, maintainers, and the community.
Updating
To stay current with codespell developments it is possible to build codespell from GitHub via:

.. code-block:: sh

    pip install --upgrade git+https://github.com/codespell-project/codespell.git

Important Notes:

- Sometimes installing via ``pip`` will complain about permissions. If this is the case then run with:

  .. code-block:: sh

      pip install --user --upgrade git+https://github.com/codespell-project/codespell.git

- It has been reported that after installing from ``pip``, codespell can't be located. Please check the $PATH variable to see if ``~/.local/bin`` is present. If it isn't then add it to your path.

- If you decide to install via ``pip`` then be sure to remove any previously installed versions of codespell (via your platform's preferred app manager).
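The ``$PATH`` check mentioned in the notes above can be done by hand, or with a few lines of Python such as the sketch below. ``user_bin_on_path`` is a hypothetical helper written for this example; it takes the PATH value and home directory as arguments so that it can be tried without touching the real environment.

```python
import os


def user_bin_on_path(path_value, home):
    """Return True if <home>/.local/bin is one of the entries in path_value."""
    target = os.path.normpath(os.path.join(home, ".local", "bin"))
    entries = [os.path.normpath(entry) for entry in path_value.split(os.pathsep) if entry]
    return target in entries


# Check the current environment.
print(user_bin_on_path(os.environ.get("PATH", ""), os.path.expanduser("~")))
```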
Updating the dictionaries
If you prefer not to follow the development version of codespell but still want the frequently updated dictionary files, run:

.. code-block:: sh

    wget https://raw.githubusercontent.com/codespell-project/codespell/master/codespell_lib/data/dictionary.txt
    codespell -D dictionary.txt

The above downloads the latest ``dictionary.txt`` file and then uses the ``-D`` flag to specify the freshly downloaded ``dictionary.txt`` as the custom dictionary instead of the default one.

You can also do the same thing for the other dictionaries listed here: https://github.com/codespell-project/codespell/tree/master/codespell_lib/data
License
The Python script ``codespell`` with its library ``codespell_lib`` is available
with the following terms:
(tl;dr: `GPL v2`_)

Copyright (C) 2010-2011 Lucas De Marchi <lucas.de.marchi@gmail.com>

Copyright (C) 2011 ProFUSION embedded systems

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; version 2 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, see https://www.gnu.org/licenses/old-licenses/gpl-2.0.html.

.. _GPL v2: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html

``dictionary.txt`` and the other ``dictionary_*.txt`` files are derivative works of English Wikipedia and are released under the `Creative Commons Attribution-Share-Alike License 3.0 <https://creativecommons.org/licenses/by-sa/3.0/>`_.