Computational linguistics – tools
Here you will find natural language processing tools developed by our Computational linguistics staff.
To see our corpora and other resources, please turn to this page:
Computational linguistics – corpora and resources
Tools
The tools we currently distribute include the word aligner eflomal and the PoS tagger and named entity recognizer efselab. Older tools can be found further down the page.
Efficient Low-Memory Aligner (eflomal)
eflomal is a highly efficient word alignment tool. It is freely available via GitHub.
Publications
Technical details can be found in the following article:
Efficient Word Alignment with Markov Chain Monte Carlo (ufal.mff.cuni.cz/pbml)
(Robert Östling and Jörg Tiedemann, 2016)
Efficient Sequence Labeling (efselab)
efselab is a compiler for sequence labeling tools, aimed at producing accurate and very fast part-of-speech (PoS) taggers and named entity recognizers (NER).
It is freely available via GitHub.
To efselab (github.com)
Publications
A detailed description of the algorithms used along with evaluations can be found in the following paper:
Part of speech tagging: Shallow or deep learning? (nejlt.ep.liu.se)
(Robert Östling, 2018)
Legacy NLP tools
Some of our older tools are still in use. You will find them below.
Stockholm TreeAligner
Stockholm TreeAligner is a tool for aligning and searching parallel treebanks. This tool allows you to create alignment links between corresponding nodes (or words) in two treebanks in different languages.
Stockholm TreeAligner was a collaborative project between the Computational Linguistics groups of Stockholm University and the University of Zurich.
Explore Stockholm TreeAligner at University of Zurich (cl.uzh.ch)
The Stockholm Tagger (Stagger)
Stagger is a Swedish part-of-speech tagger. It has been replaced by efselab (see above) but is still available on GitHub.
Last updated: January 21, 2025
Source: Department of Linguistics