Mnemonist

Vantage Point Tree


A VPTree is a data structure used to index items in an arbitrary metric space so one can then perform efficient nearest neighbors queries.

It works by choosing a point in the dataset and making it the “vantage point” of the node then splitting the remaining points into two children nodes one storing the nearest points and the other the farthest ones. It continues recursively until all points have been stored in the tree.

However, one should keep in mind that a VPTree has worst cases - mostly due to median ambiguity - and will often build trees that are not perfectly balanced. This is frequently the case, for instance, with datasets where everything stands close together or that are too small.

For more information about the VPTree, you can head here.

const VPTree = require('mnemonist/vp-tree');

Constructor

Since you need to know the whole list of items to be indexed beforehand, you cannot construct the VPTree. Instead, you can only build one from an iterable using the static #.from method.

Static #.from

You need two things to be able to build a VPTree: a list of items to index and an arbitraty distance metric to use (the Levenshtein distance, for instance).

Note that the provided metric distance must be a true metric else the tree cannot work properly.

const tree = VPTree.from(['hello', 'mello'], distance);

The tree is built in O(n log n) time.

Members

Methods

Read

#.size

Total number of items stored in the tree.

const tree = new VPTree(distance, ['hello']);

tree.size
>>> 1

#.nearestNeighbors

Returns the k nearest neighbors of the given query. Note that the resulting array will be ordered with by ascending distance.

O(log n + m), m being the number of matches.

const words = [
  'book',
  'back',
  'bock',
  'lock',
  'mack',
  'shock',
  'ephemeral'
];

const tree = new VPTree(levenshtein, words);

// Retrieving the 2 nearest neighbors of "look"
tree.nearestNeighbors(2, 'look');
>>> [
  {distance: 1, item: 'lock'},
  {distance: 1, item: 'book'} 
]

#.neighbors

Return all the neighbors of the given query for the given radius. Note that the resulting array will not be ordered.

O(log n + m), m being the number of matches.

const words = [
  'book',
  'back',
  'bock',
  'lock',
  'mack',
  'shock',
  'ephemeral'
];

const tree = new VPTree(levenshtein, words);

// Retrieving all the neighbors of "look" at a maximum distance of 2
tree.neighbors(2, 'look');
>>> [
  {distance: 1, item: 'lock'},
  {distance: 1, item: 'book'},
  {distance: 2, item: 'bock'}
]