Ngrams

Reference:
https://en.wikipedia.org/wiki/N-gram

The tokenizers/ngrams module gathers functions used to compute n-grams from a given sequence.

An n-gram is a contiguous subsequence of n items taken from a sequence. A sequence of m items therefore yields m - n + 1 n-grams when m ≥ n.

import ngrams from 'talisman/tokenizers/ngrams';
// Alternatively, you can use these shortcuts for n = 2, 3 and 4
import {
  bigrams,
  trigrams,
  quadrigrams
} from 'talisman/tokenizers/ngrams';

ngrams(2, ['The', 'cat', 'is', 'happy']);
>>> [
  ['The', 'cat'],
  ['cat', 'is'],
  ['is', 'happy']
]

trigrams(['The', 'cat', 'is', 'happy']);
>>> [
  ['The', 'cat', 'is'],
  ['cat', 'is', 'happy']
]

Arguments

n number - size of the subsequences.
sequence array - the target sequence.
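
To make the definition concrete, here is a minimal sketch of how n-grams can be computed with a sliding window. It is not Talisman's actual implementation: the name `naiveNgrams` is hypothetical, and both the argument check and the empty result for sequences shorter than n are assumptions of this sketch rather than documented behavior of the module.

```javascript
// Hypothetical helper for illustration only, not part of Talisman.
function naiveNgrams(n, sequence) {
  // Assumption of this sketch: reject sizes that are not positive integers.
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError('n must be a positive integer');
  }

  const result = [];

  // Slide a window of width n over the sequence.
  // The last valid start index is sequence.length - n.
  for (let i = 0; i + n <= sequence.length; i++) {
    result.push(sequence.slice(i, i + n));
  }

  // Assumption of this sketch: a sequence shorter than n yields an empty array.
  return result;
}

const tokens = ['The', 'cat', 'is', 'happy'];

console.log(naiveNgrams(2, tokens));
console.log(naiveNgrams(3, tokens));
```

With the four-token sequence above, the sketch produces three bigrams and two trigrams, matching the examples shown earlier on this page. Each n-gram is copied with `slice`, so the input array is left unchanged.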