French stemmers

The stemmers/french module gathers stemmers for the French language.

Summary

Modules under the talisman/stemmers/french namespace: carry, eda, porter, unine.

carry

Reference: http://www.otlet-institute.org/docs/Carry.pdf

Carry, un algorithme de désuffixation pour le français. M. Paternostre, P. Francq, J. Lamoral, D. Wartel and M. Saerens, 2002.

Carry is a French stemmer designed to outperform the French version of the Porter algorithm. Its name is a pun: the verb “porter” means “to carry” in French.

Note that this implementation slightly modifies the original algorithm to better handle some obvious cases.

import carry from 'talisman/stemmers/french/carry';

carry('Tissaient');
>>> 'tis'

eda

Reference: https://cedric.cnam.fr/fichiers/RC1314.pdf

Extraction automatique des diagnostics à partir des comptes rendus médicaux textuels. Didier Nakache, 2007.

The EDA French stemmer was specially designed to handle words from the medical field.

import eda from 'talisman/stemmers/french/eda';

eda('intestinales');
>>> 'intestin'

porter

Reference: http://snowball.tartarus.org/algorithms/french/stemmer.html

An implementation of the French Porter stemmer, ported from the Snowball version.

import porter from 'talisman/stemmers/french/porter';

porter('abaissait');
>>> 'abaiss'

unine

Reference: http://members.unine.ch/jacques.savoy/clef/

Implementation of both UniNE (University of Neuchâtel) stemmers by Jacques Savoy.

There is a “minimal” one, very simple but not very accurate, and a “complex” one handling more cases.

import unine, {minimal, complex} from 'talisman/stemmers/french/unine';

// Default export is the minimal version
unine === minimal;
>>> true

minimal('chanter');
>>> 'chant'

complex('pratiquement');
>>> 'pratiqu'
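
As the examples on this page suggest, each stemmer is a plain function that takes a word and returns its stem, so any of them can be passed to generic helpers. The following is a minimal sketch of grouping word variants by stem. Both `groupByStem` and `toyStemmer` are hypothetical names introduced for illustration and are not part of the library. `toyStemmer` is only a stand-in so the snippet runs without the library installed; in real use you would presumably pass one of the functions documented above instead.

```javascript
// Hypothetical stand-in so the sketch runs without talisman installed.
// It is NOT one of the stemmers documented here: it only lowercases the
// word and strips a single trailing "s".
function toyStemmer(word) {
  return word.toLowerCase().replace(/s$/, '');
}

// Group words sharing the same stem, using any (word) => stem function.
// Returns a Map from stem to the list of original words, in input order.
function groupByStem(stemmer, words) {
  const groups = new Map();

  for (const word of words) {
    const stem = stemmer(word);

    if (!groups.has(stem)) groups.set(stem, []);
    groups.get(stem).push(word);
  }

  return groups;
}

const groups = groupByStem(toyStemmer, ['Chats', 'chat', 'chiens', 'chien']);

console.log(groups);
```

Taking the stemmer as a parameter keeps the helper independent of any particular algorithm, which makes it easy to compare how different stemmers cluster the same vocabulary.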
