Tweets Tokenizers
The tokenizers/tweets module gathers the library’s tweet tokenizers.
The aim of these functions is to split tweets into tokens that are relevant for further analysis.
Summary
casual
Reference: http://www.nltk.org/api/nltk.tokenize.html#module-nltk.tokenize.casual
Authors:
Christopher Potts
Ewan Klein
Pierpaolo Pantone
JavaScript implementation of NLTK’s tweet tokenizer.
This tokenizer is aware of URLs, handles, hashtags, some emoticons, etc.
import casual from 'talisman/tokenizers/tweets/casual';
casual('This is a cooool #dummysmiley: :-)');
>>> [
'This',
'is',
'a',
'cooool',
'#dummysmiley',
':',
':-)'
]
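To give an intuition of what being "aware" of URLs, handles, hashtags and emoticons means, here is a minimal sketch of an ordered-alternation regex tokenizer. It is not the library's nor NLTK's actual implementation: the names PATTERNS, TOKEN_REGEX and sketchTokenize are hypothetical, and the patterns are deliberately much simpler than the real ones (only a handful of emoticons, naive URL matching).

```javascript
// Illustrative sketch only: NOT talisman's or NLTK's real implementation.
// All names below are hypothetical and exist only for this example.

// Patterns are tried in order at each position, so the most specific
// ones (URLs, handles, hashtags, emoticons) come before generic words.
const PATTERNS = [
  /https?:\/\/\S+/,         // URLs (naive: everything up to the next space)
  /@\w+/,                   // handles such as @user
  /#\w+/,                   // hashtags such as #topic
  /[:;=][-']?[)(\]\[DPp]/,  // a few western emoticons such as :-) or ;P
  /\w+(?:['-]\w+)*/,        // words, allowing inner apostrophes and hyphens
  /\S/                      // fallback: any other single non-space character
];

// Join the individual sources into one global regex of alternatives.
const TOKEN_REGEX = new RegExp(PATTERNS.map(p => p.source).join('|'), 'g');

function sketchTokenize(text) {
  // With the `g` flag, match() returns every match, or null if there is none.
  return text.match(TOKEN_REGEX) || [];
}

console.log(sketchTokenize('This is a cooool #dummysmiley: :-)'));
```

Because the emoticon pattern is tried before the single-character fallback, ':-)' stays in one piece, whereas the lone ':' after the hashtag is emitted as its own token. The actual tokenizer handles many more cases than this sketch does.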