An editor for dependency treebanks

I was pleased to meet Johannes Heinecke at the International Congress of Celtic Studies in Bangor last week. As well as producing a dependency treebank for Welsh, he has written a rather smart editor for CoNLL-U files, which are pretty much the standard these days for dependency trees.

Screengrab of Johannes Heinecke's CoNLL-U editor. The tree is for the sentence "Cuir d' ainm ri seo."

I managed to get it working this morning on a Mac running macOS Mojave 10.14.6 with a minimum of hassle. You will need Java and Apache Maven, plus Homebrew in order to install wget. One small surprise: if the file you are editing is in a git repository then, by default, the new file is committed every time you edit the tree, which makes the commit history look a bit busy.

The second best bit is that you can see non-projective relations at a glance, which I certainly can’t do in Emacs.
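For anyone stuck without the editor, here is a minimal sketch of how non-projective relations can be found programmatically. It is not how the editor does it (I haven't looked); it simply applies the usual definition, that an arc from a head to a dependent is non-projective if some word lying between them is not dominated by that head. The function names and the sample sentence (a stock English example, with everything except the ID, FORM and HEAD columns left blank) are my own for illustration.

```python
def read_heads(conllu_text):
    """Map token ID -> HEAD for one CoNLL-U sentence."""
    heads = {}
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        heads[int(cols[0])] = int(cols[6])  # HEAD is the 7th column
    return heads


def dominates(heads, ancestor, node):
    """True if `ancestor` is reachable by following HEAD links up from `node`."""
    seen = set()
    while node != 0 and node not in seen:  # `seen` guards against cycles
        seen.add(node)
        node = heads[node]
        if node == ancestor:
            return True
    return False


def non_projective_arcs(heads):
    """Return (head, dependent) pairs that span a word the head does not dominate."""
    bad = []
    for dep, head in sorted(heads.items()):
        lo, hi = sorted((head, dep))
        if any(not dominates(heads, head, k) for k in range(lo + 1, hi)):
            bad.append((head, dep))
    return bad


# Illustrative analysis only: "on the issue" hangs off "hearing",
# reaching over the verb "scheduled".
WORDS = [("A", 2), ("hearing", 4), ("is", 4), ("scheduled", 0), ("on", 7),
         ("the", 7), ("issue", 2), ("today", 4), (".", 4)]
SAMPLE = "\n".join(
    "\t".join([str(i), form, "_", "_", "_", "_", str(head), "_", "_", "_"])
    for i, (form, head) in enumerate(WORDS, start=1)
)

print(non_projective_arcs(read_heads(SAMPLE)))  # → [(2, 7)]
```

The one arc reported is the one from "hearing" (2) to "issue" (7), which is exactly the arc you would see crossing the others in a drawn tree.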

The best bit, as someone who recently wrote a paper where all the arrows in the dependency diagrams pointed the wrong way and didn’t notice until the referees pointed it out, is that there is a wee button you can click on to get a TikZ version of the tree for pasting into LaTeX.
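To give an idea of what such LaTeX can look like, here is a rough sketch that turns a list of words, heads and relation labels into code for the tikz-dependency package. I am assuming that package purely for illustration; the editor's actual output may well be different. The function name and the toy sentence are my own. The point worth noticing is the argument order: `\depedge` takes the head first and the dependent second, which is precisely the thing that is easy to get backwards by hand.

```python
def to_tikz_dependency(tokens):
    """Build tikz-dependency code.

    tokens: list of (form, head, deprel), positions counted from 1,
    head 0 meaning the root. Forms are not escaped for LaTeX.
    """
    forms = r" \& ".join(form for form, _, _ in tokens)
    lines = [
        r"\begin{dependency}",
        r"  \begin{deptext}",
        f"    {forms} \\\\",  # deptext rows end with a LaTeX line break
        r"  \end{deptext}",
    ]
    for position, (_, head, deprel) in enumerate(tokens, start=1):
        if head == 0:
            lines.append(rf"  \deproot{{{position}}}{{{deprel}}}")
        else:
            # Arrow runs from the head to the dependent.
            lines.append(rf"  \depedge{{{head}}}{{{position}}}{{{deprel}}}")
    lines.append(r"\end{dependency}")
    return "\n".join(lines)


# Toy sentence with an illustrative analysis.
print(to_tikz_dependency([
    ("She", 2, "nsubj"),
    ("reads", 0, "root"),
    ("books", 2, "obj"),
]))
```

The result needs `\usepackage{tikz-dependency}` in the preamble of the LaTeX document.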

Useful site full of worked English-language examples

Much of the literature on categorial grammar focuses on things that are difficult to handle in other frameworks and isn’t necessarily helpful if you want to find something simple. However, there are lots and lots of worked examples on the Groningen Meaning Bank Explorer. More about how it works here.