docsim / UkrVectōrēs – an NLU-powered tool for knowledge discovery, classification, diagnostics and prediction. Entities similarity tool.
I would like to present one of my pet projects: docsim / UkrVectōrēs, an NLU-powered tool for knowledge discovery, classification, diagnostics and prediction. It is an entity similarity tool.
docsim / UkrVectōrēs is open source and available on GitHub: https://github.com/malakhovks/docsim.
Caution/Disclaimer
The project and its documentation are in active development! For technical clarifications and questions, contact us by email at malakhovks@nas.gov.ua or via Issues.
Features
You can think of docsim / UkrVectōrēs as a kind of “cognitive-semantic calculator”. The online toolkit covers the following elements of distributional analysis:
- calculate semantic similarity between pairs of words;
- find words semantically closest to the query word;
- apply simple algebraic operations to word vectors (addition, subtraction, finding the average vector for a group of words and the distances to that average);
- draw semantic maps of relations between input words (useful for exploring clusters and oppositions, or for testing your hypotheses about them);
- get the raw vectors (arrays of real values) and their visualizations for words in the chosen model;
- download default models;
- use other freely distributed predictive models of distributional semantics by adjusting the configuration file.
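
To make the first three operations in the list more concrete, here is a minimal sketch in plain Python. It is not the docsim / UkrVectōrēs API: the function names and the tiny 3-dimensional vectors are invented for illustration, whereas real models hold vectors with hundreds of dimensions learned from a corpus. It assumes cosine similarity as the measure, which is the usual choice for word vectors.

```python
import math

# Toy vectors invented for this example; they do not come from any real model.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(vector, exclude=(), topn=3):
    """Rank vocabulary words by cosine similarity to the given vector."""
    scored = [
        (word, cosine_similarity(vector, vec))
        for word, vec in VECTORS.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


def combine(positive, negative=()):
    """Add the vectors of `positive` words and subtract those of `negative`."""
    dim = len(next(iter(VECTORS.values())))
    result = [0.0] * dim
    for word in positive:
        result = [r + x for r, x in zip(result, VECTORS[word])]
    for word in negative:
        result = [r - x for r, x in zip(result, VECTORS[word])]
    return result


def mean_vector(words):
    """Component-wise average of the vectors for a group of words."""
    total = combine(words)
    return [x / len(words) for x in total]


if __name__ == "__main__":
    # Similarity between a pair of words.
    print(cosine_similarity(VECTORS["king"], VECTORS["queen"]))
    # Words closest to a query word (the query itself is excluded).
    print(most_similar(VECTORS["king"], exclude={"king"}))
    # Vector algebra: king - man + woman, then look up the nearest words.
    target = combine(["king", "woman"], ["man"])
    print(most_similar(target, exclude={"king", "man", "woman"}))
    # Distance of each word in a group to the group's average vector.
    group = ["king", "queen"]
    center = mean_vector(group)
    print({w: 1.0 - cosine_similarity(VECTORS[w], center) for w in group})
```

With these toy values, the word ranked first for king - man + woman is "queen". The input words are excluded from the ranking because they would otherwise tend to dominate the result.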