util/devel/chplspell is a script to assist in spell-checking the
Chapel documentation and source code. It is a wrapper around the
scspell source-code spell-checker.
chplspell provides four main conveniences over simply using scspell directly:
1. It has built-in knowledge of which files and directories in the Chapel repository benefit from being spell-checked.
2. It directs scspell to use the project dictionary file.
3. It recurses through directories given on the command line.
4. It invokes the scspell that’s installed in the Chapel virtualenv.
This document describes the basic chplspell use cases, the dictionary
files, and some less common use cases related to managing the
dictionary files.
chplspell depends on
scspell being installed in the virtualenv. To
install it, use
scspell (and thus
chplspell) has two main modes of invocation:
interactive and non-interactive.
chplspell further provides two ways
of using each mode: spell-checking the whole project or only specific
files or directories.
chplspell maintains a project dictionary for Chapel in
$CHPL_HOME/util/devel/chplspell-dictionary. This dictionary file
contains several types of word lists, as supported by scspell:
- Natural language dictionary
Words that may be found in any file.
- Programming-language dictionaries
Words that may be found in certain types of files, identified by file extension.
- File-specific dictionaries
Words that may be found only in particular files.
chplspell passes nearly all command line options through to
scspell, and reports
scspell’s usage message when invoked with
--help. See the scspell page on python.org for information about
scspell’s command line arguments, approach to spell
checking source code, and user interface.
chplspell adjusts the command line in several ways:
1. chplspell passes options to scspell directing it at the project dictionary file.
2. If no files or directories are given on the command line, chplspell invokes scspell on a default set of files and directories that make sense for the Chapel repository.
3. If directories are given on the command line, chplspell invokes scspell on all files of certain types within those directories, recursively.
4. In case 2 or 3 above, chplspell invokes scspell twice: once for most of the files it finds, and again for any LaTeX files it finds. This is because LaTeX does not use C-style escapes.
This is a minor point, relevant only to understanding why chplspell keeps spell-checking when you hit ^C, and why words ignored with the “Ignore all” interactive command are forgotten when the LaTeX portion begins.
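The two-pass behavior can be pictured with a small sketch. This is not chplspell's actual code: the function name `split_latex` and the choice of `.tex` and `.bib` as the LaTeX extensions are assumptions made for illustration only.

```python
import os

# Assumed for illustration: the extensions treated as LaTeX.
LATEX_EXTENSIONS = {".tex", ".bib"}


def split_latex(paths):
    """Split paths into (non-LaTeX, LaTeX) batches, one batch per scspell run."""
    regular, latex = [], []
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        (latex if ext in LATEX_EXTENSIONS else regular).append(path)
    return regular, latex


regular, latex = split_latex(["README.md", "doc/spec.tex", "src/main.chpl"])
print("first pass: ", regular)
print("second pass:", latex)
```

Because the batches are checked in separate runs, state held by the first run (such as an interrupt or an "Ignore all" choice) would not carry over to the second.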
The configuration of the “default set of files and directories” is at
the top of the
chplspell script, and may be easily altered.
The simplest use is to produce a non-interactive report for the default files and directories of the Chapel repository. As of the date of this writing, there were still some words yet to be added to the project dictionary or corrected, which make a good example:
    $ chplspell --report-only
    doc/rst/developer/bestPractices/README.md:50: 'chplspell' not found in dictionary (from token 'chplspell')
    CHANGES.md:445: 'chpldocumentation' not found in dictionary (from token 'chpldocumentation')
    CHANGES.md:2001: 'pshm' not found in dictionary (from token 'pshm')
    CHANGES.md:2341: 'pshm' not found in dictionary (from token 'pshm')
    CHANGES.md:3360: 'circularities' not found in dictionary (from token 'circularities')
chplspell may also be invoked on only particular files or directories
named on the command line. For example, this file with one tyop:
    $ chplspell --report-only doc/rst/developer/bestPractices/SpellChecking.rst
    doc/rst/developer/bestPractices/SpellChecking.rst:109: 'tyop' not found in dictionary (from token 'tyop')
    $
(The project dictionary now includes the word “tyop” for this file, so the above command no longer produces that result.)
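When a report is long, it can help to group the findings by word before deciding what belongs in the dictionary. The following is a minimal sketch that assumes the report lines have exactly the shape shown in the examples above; the helper name `group_by_word` is hypothetical and not part of chplspell or scspell.

```python
import re
from collections import defaultdict

# Matches report lines such as:
#   CHANGES.md:2001: 'pshm' not found in dictionary (from token 'pshm')
REPORT_LINE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+): '(?P<word>[^']*)' not found in dictionary"
)


def group_by_word(report_text):
    """Map each unrecognized word to the (file, line) locations reporting it."""
    locations = defaultdict(list)
    for raw in report_text.splitlines():
        match = REPORT_LINE.match(raw.strip())
        if match:
            locations[match["word"]].append((match["file"], int(match["line"])))
    return dict(locations)


sample = """\
CHANGES.md:2001: 'pshm' not found in dictionary (from token 'pshm')
CHANGES.md:2341: 'pshm' not found in dictionary (from token 'pshm')
CHANGES.md:3360: 'circularities' not found in dictionary (from token 'circularities')
"""
for word, places in group_by_word(sample).items():
    print(word, places)
```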
scspell provides an interactive mode for making corrections and for
adding words to the various dictionaries. This mode is also available
through chplspell. See the scspell page on python.org for details.
chplspell’s invocation of scspell makes any requested dictionary
changes to the project dictionary file.
Dictionary file details
This section provides a few details about the format of the
dictionary file. Understanding of these details is not necessary to
make use of chplspell as described above. It will be helpful in
making use of the more advanced options in the next section.
The natural language word list contains the words that may appear in
any file being spell checked. It is the last word list in the
dictionary file, under its own heading.
A “programming language” word list is used in addition to the natural
language word list when the file being checked matches one of the file
extensions given for that word list. These word lists appear in the
dictionary file on lines beginning with FILETYPE:, e.g.
FILETYPE: TeX/LaTeX; .tex, .bib
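To make the layout of such a line concrete, here is a minimal sketch that splits it into a name and a list of extensions. It assumes only the shape visible in the example above (a name, a semicolon, then comma-separated extensions); `parse_filetype_heading` is a hypothetical helper, not scspell's parser.

```python
def parse_filetype_heading(line):
    """Parse 'FILETYPE: <name>; <ext>, <ext>' into (name, [extensions])."""
    prefix = "FILETYPE:"
    if not line.startswith(prefix):
        raise ValueError(f"not a FILETYPE heading: {line!r}")
    name, _, extensions = line[len(prefix):].partition(";")
    return name.strip(), [ext.strip() for ext in extensions.split(",") if ext.strip()]


name, extensions = parse_filetype_heading("FILETYPE: TeX/LaTeX; .tex, .bib")
print(name, extensions)
```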
A file-specific word list is used in addition when a file has a
matching “file id”. These are stored in the dictionary file under
FILEID: headings.
There are two ways that a file id’s association with a file may be established:
1. The file contains the string “scspell-id: ” followed by a file id; e.g., in a comment.
2. There is an entry in the “file id mapping file”, $CHPL_HOME/util/devel/chplspell-dictionary.fileids.json, associating the file name to the file id. For example, the following file id is associated with two files in the Chapel repository:
    "63b96a22-1e46-11e6-a3a6-10ddb1d4c3d5": [
        "doc/rst/developer/hdfs_and_chapel/API.tex",
        "doc/rst/developer/hdfs_and_chapel/examples.tex"
    ],
If a file has a file id associated, when scspell offers to add an
unrecognized word to a dictionary, one of the offered dictionaries is
the file-specific dictionary.
If there is no file id associated with the file, scspell will
instead offer the option to create a new file-specific
dictionary. This option will create the new file id, add it to the
dictionary.fileids.json file, and add the unrecognized word to
that file-specific word list in the dictionary file.
If a file with a file-specific word list is moved or copied (e.g., the
shootout benchmarks), and the association is via the file id mapping,
chplspell won’t have the existing word list associated with the
new file. The next section describes several ways to remedy this
situation and similar ones without creating duplicate file-specific word lists.
As of this writing, no files in the Chapel repository contain a file id literal; all file id mappings are done through the file id mapping file.
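The two association mechanisms described in this section can be sketched as a single lookup. This is an illustration under stated assumptions, not scspell's implementation: the mapping is assumed to have the shape `{file_id: [paths]}`, a file id literal is assumed to consist of hex digits and dashes, and giving the in-file literal precedence over the mapping file is a choice made for this sketch. `find_file_id` is a hypothetical name.

```python
import json
import re

# Assumed shape of the mapping file: {file_id: [relative paths, ...]}.
MAPPING_JSON = """
{
  "63b96a22-1e46-11e6-a3a6-10ddb1d4c3d5": [
    "doc/rst/developer/hdfs_and_chapel/API.tex",
    "doc/rst/developer/hdfs_and_chapel/examples.tex"
  ]
}
"""

# Assumption: a file id literal is a run of hex digits and dashes.
ID_LITERAL = re.compile(r"scspell-id: ([0-9a-fA-F-]+)")


def find_file_id(path, contents, mapping):
    """Return the file id for a file, or None if it has no association."""
    literal = ID_LITERAL.search(contents)
    if literal:  # mechanism 1: id embedded in the file itself
        return literal.group(1)
    for file_id, paths in mapping.items():  # mechanism 2: mapping file entry
        if path in paths:
            return file_id
    return None


mapping = json.loads(MAPPING_JSON)
print(find_file_id("doc/rst/developer/hdfs_and_chapel/API.tex", "", mapping))
```

A file that was copied to a new path has no entry under that path, so the lookup returns `None`, which is the situation the next section addresses.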
Dictionary file management options
chplspell makes scspell’s --rename-file option available to
update the file id map after a file has been renamed:
    git mv path/to/old.chpl new/path/and/new.chpl
    chplspell --rename-file path/to/old.chpl new/path/and/new.chpl
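Conceptually, the rename only touches the file id mapping. The sketch below models that effect on an in-memory mapping of the assumed shape `{file_id: [paths]}`; it is not scspell's code, and `rename_in_mapping` is a hypothetical name.

```python
def rename_in_mapping(mapping, old_path, new_path):
    """Return a copy of {file_id: [paths]} with old_path replaced by new_path."""
    return {
        file_id: [new_path if path == old_path else path for path in paths]
        for file_id, paths in mapping.items()
    }


before = {"id-a": ["path/to/old.chpl", "path/to/other.chpl"]}
after = rename_in_mapping(before, "path/to/old.chpl", "new/path/and/new.chpl")
print(after)
```

The file id stays the same, so the file-specific word list remains attached to the file under its new name.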
Unfortunately there is not yet a straightforward equivalent for a file that has been copied rather than renamed.
scspell also provides a --merge-file-ids option for the case that two
files have file-specific word lists, and the word lists are similar enough
that they should be merged. The file ids may be given by the file id
literal string, or by the name of a file associated with the file id:
chplspell --merge-file-ids one/file.chpl a/similar/file.chpl
The only impact of the order is which file id hex string ends up associated with the files.
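The effect of a merge can be modeled on in-memory data. In this sketch both the mapping (`{file_id: [paths]}`) and the word lists (`{file_id: set_of_words}`) are assumed shapes, `merge_file_ids` is a hypothetical helper, and keeping the first id is a choice made here; which id scspell keeps is not stated in this document.

```python
def merge_file_ids(mapping, word_lists, keep_id, drop_id):
    """Fold drop_id's files and words into keep_id; return new (mapping, word_lists)."""
    new_mapping = {k: list(v) for k, v in mapping.items() if k != drop_id}
    new_words = {k: set(v) for k, v in word_lists.items() if k != drop_id}
    # The surviving id now covers the files and the words of both ids.
    new_mapping[keep_id] = sorted(set(mapping[keep_id]) | set(mapping[drop_id]))
    new_words[keep_id] = set(word_lists[keep_id]) | set(word_lists[drop_id])
    return new_mapping, new_words


mapping = {"id-a": ["one/file.chpl"], "id-b": ["a/similar/file.chpl"]}
words = {"id-a": {"foo"}, "id-b": {"bar"}}
merged_mapping, merged_words = merge_file_ids(mapping, words, "id-a", "id-b")
print(merged_mapping)
print(merged_words)
```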
The --delete-files option to
chplspell may be used to remove the
association between a file id and a deleted file from the dictionary
file. If that was the only file associated with that file id,
chplspell will also remove the file id itself and the file-specific
word list:
    git rm doc/obsolete doc/archaic.md
    chplspell --delete-files doc/obsolete doc/archaic.md
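The cleanup rule described above (drop the association, and drop the file id and its word list once no files remain) can be sketched as follows. The data shapes and the name `delete_files` are assumptions for illustration, and the sketch assumes every file id with a word list also appears in the mapping.

```python
def delete_files(mapping, word_lists, deleted_paths):
    """Remove deleted paths; drop ids (and their word lists) left with no files."""
    deleted = set(deleted_paths)
    new_mapping, new_words = {}, {}
    for file_id, paths in mapping.items():
        remaining = [path for path in paths if path not in deleted]
        if remaining:  # the id survives only while some file still uses it
            new_mapping[file_id] = remaining
            if file_id in word_lists:
                new_words[file_id] = word_lists[file_id]
    return new_mapping, new_words


mapping = {"id-a": ["doc/obsolete", "doc/kept.md"], "id-b": ["doc/archaic.md"]}
words = {"id-a": {"x"}, "id-b": {"y"}}
print(delete_files(mapping, words, ["doc/obsolete", "doc/archaic.md"]))
```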
Edit the dictionary.fileids.json file
You can edit the file by hand to add a filename to a file id, or change a filename. The format is straightforward JSON.
One minor detail (likely of interest only to those so hung up on
minutiae as to write a spell checking utility) is that while
scspell emits the file id mapping file with no trailing newline, most text
editors take some convincing to save a file that way. To avoid git
commits fighting over that last byte, it’d be considerate to get rid
of that newline before committing.
pico -L is the simplest way I’ve found. Otherwise, you can make
the change, then invoke
chplspell to get it to re-write the file. The
file will be rewritten only if there are changes to make to it, so
you’ll likely need to make two changes that add up to no effect, such
as the sequence
    chplspell --rename-file CONTRIBUTORS.md SCHMONTRIBUTORS.md
    chplspell --rename-file SCHMONTRIBUTORS.md CONTRIBUTORS.md
(No files are renamed by this – these operations manipulate only the file id mapping.)
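Another way to drop the final newline after a hand edit is a few lines of Python. This is a sketch of one possible approach, not something chplspell provides; `strip_trailing_newlines` is a hypothetical helper, and it removes every trailing newline byte rather than just one.

```python
def strip_trailing_newlines(path):
    """Rewrite the file without trailing newline bytes; return True if changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    trimmed = data.rstrip(b"\r\n")
    if trimmed == data:
        return False  # nothing to do, leave the file untouched
    with open(path, "wb") as handle:
        handle.write(trimmed)
    return True
```

It could be run on the file id mapping file just before committing, for example `strip_trailing_newlines("util/devel/chplspell-dictionary.fileids.json")` from the top of the repository.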