Mnemonist

SymSpell


The Symmetric delete spelling correction index is a data-structure used for spelling correction & fuzzy search.

It has been designed by Wolfe Garbe and aims at being more performant that previous solutions such as the Burkhard-Keller.

However, it really shines when you want the max edit distance between your queries & items stored in the index to be quite small - 2 or 3 for instance. Else it begins to need a lot of space and its efficiency will drop.

For more information, you can head toward this seminal blog post.

const SymSpell = require('mnemonist/symspell');

Use case

Let’s say we want to build an autocomplete system.

When the user inputs a string, we are going to search for every term we know being at most at a Levenshtein distance of 2 of the user’s query.

The naive method would be to “brute-force” the list of terms likewise:

const suggestions = terms.filter(term => {
  return levenshtein(term, query) <= 2;
});

But, even if this works with few terms, it will soon become hard to compute if the list of terms grows too much.

SymSpell indexes solves this problem by indexing the list of terms such as it becomes efficient to query them in a fuzzy way.

const index = SymSpell.from(terms);

// We can now query the index easily:
const suggestions = index.search(query);

Constructor

The SymSpell takes a single optional argument being some options.

const index = new SymSpell(options);

Example with a higher max distance

const index = new SymSpell({
  maxDistance: 3
});

Options

  • maxDistance [number=2] - maximum edit distance between stored terms and query.
  • verbosity [number=2] - verbosity of the index:
    • 0: Returns only the top suggestion.
    • 1: Returns suggestions with the smallest edit distance.
    • 2: Returns every found suggestion.

Static #.from

Alternatively, one can build a SymSpell from an arbitrary JavaScript iterable likewise:

const index = SymSpell.from(['hello', 'mello'], options);

Members

Methods

Mutation

Read

#.size

Total number of items stored in the index.

const index = new SymSpell();

index.add('hello');

index.size
>>> 1

#.add

Adds a single item to the index.

const index = new SymSpell();

index.add('hello');

#.clear

Completely clears the index of its items.

const index = new SymSpell();

index.add('hello');
index.clear();

index.size
>>> 0

Returns every item relevant to the given query.

const index = new SymSpell();

index.add('hello');
index.add('mello');
index.add('roman');

index.search('vello');
>>> [
  {item: 'hello', distance: 1, count: 1}
  {item: 'mello', distance: 1, count: 1}
]