Hyphenation Tokenizers
The tokenizers/hyphenation module gathers the library's various hyphenation algorithms.
Hyphenation algorithms take raw words and split them into parts that can be separated by hyphens when justifying text.
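For instance, once a word has been split, its parts can be joined with soft hyphens (U+00AD), which stay invisible unless a rendering engine needs to break the word at one of those points. Below is a minimal sketch of this usage, relying on the liang tokenizer documented further down:

import liang from 'talisman/tokenizers/hyphenation/liang';

// Soft hyphens remain invisible unless the renderer
// needs to break the word when justifying a line.
const breakable = liang('university').join('\u00AD');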
Summary
liang
Reference: https://tug.org/docs/liang/
Liang, Franklin Mark. “Word Hy-phen-a-tion by Com-pu-ter”. PhD dissertation, Stanford University Department of Computer Science. Report number STAN-CS-83-977, August 1983.
JavaScript implementation of Liang's hyphenation algorithm.
Note that this version stores patterns targeting the English language, so you should avoid applying it to other languages.
import liang from 'talisman/tokenizers/hyphenation/liang';
liang('university');
>>> ['uni', 'ver', 'si', 'ty']
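To give an idea of what happens under the hood, here is a rough sketch of the pattern-matching core of Liang's algorithm. The patterns below are toy values invented here only to reproduce the example above; they are not the module's actual English pattern set, and real implementations also enforce minimum fragment lengths at both ends of the word:

// Toy patterns: digits between letters score the adjacent
// inter-letter position. Odd = allow a break, even = inhibit it.
const RAW_PATTERNS = ['n2i', 'i1v', 'r1s', 'i1t'];

// Parse a pattern like 'i1v' into its letters ('iv') and the
// digit standing before each inter-letter position.
function parsePattern(raw) {
  const letters = raw.replace(/\d/g, '');
  const scores = new Array(letters.length + 1).fill(0);
  let i = 0;
  for (const char of raw) {
    if (/\d/.test(char)) scores[i] = +char;
    else i++;
  }
  return {letters, scores};
}

function hyphenate(word) {
  const patterns = RAW_PATTERNS.map(parsePattern);

  // Liang wraps the word in dots so patterns can anchor on word boundaries.
  const text = '.' + word.toLowerCase() + '.';
  const points = new Array(text.length + 1).fill(0);

  // Slide every pattern over the dotted word, keeping the maximum
  // score observed at each inter-letter position.
  for (const {letters, scores} of patterns) {
    for (let start = 0; start + letters.length <= text.length; start++) {
      if (text.slice(start, start + letters.length) === letters) {
        for (let j = 0; j < scores.length; j++)
          points[start + j] = Math.max(points[start + j], scores[j]);
      }
    }
  }

  // Odd scores mark valid hyphenation points.
  const parts = [];
  let current = '';
  for (let i = 0; i < word.length; i++) {
    current += word[i];
    if (i < word.length - 1 && points[i + 2] % 2 === 1) {
      parts.push(current);
      current = '';
    }
  }
  parts.push(current);
  return parts;
}

hyphenate('university');
>>> ['uni', 'ver', 'si', 'ty']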