Behind this blog is a happy half hour or so I spent on Friday evening writing a bit of code to do forward composition in a really simple-minded way. Forward composition, in categorial grammar, is applying the following rule:
X/Y Y → X
So I, being simple minded, wrote a bit of code that
- Tokenized a sentence on spaces, and separated out punctuation marks (actually just the question mark).
- Converted that tokenlist into a long string of types.
- Applied forward composition to remove any like types next to each other and the slash that preceded them.
- Printed out what remains.
My sentences are:
- Dè tha dol? What goes? What’s happening?
- An robh fhios agaibh? Did you know?
Let’s try the first one. The output of the program looks like this:
S[int]/?/VPVP/VNVN? S[int]/?/VPVP? S[int]/?? S[int]
based on the following types:
- Dè → S[int]/?/VP
- S[int] indicates that the sentence is interrogative.
- /? is there to digest the question mark at the end.
- tha → VP/VN
- give me a verbal noun, of which more soon, and I will give you a VP
- dol → VN
- ? → ?
I’m not happy with this grammar, which I wrote on Friday evening. But I leave it here for demonstration purposes.
- This is a bit of a cheat, because the code doesn’t actually give you a proper bracketed parse.
- Also, it only allows for each lexical item to have a single type, which is wrong.
- But, despite all that and it being almost the simplest piece of code I’ve ever written, it identifies that those sentences are well-formed. If you try feeding it a sentence with the same words in a different order, it doesn’t give you S[int] out at the end.
- Something’s gone horribly wrong with the character encoding. You will just have to imagine the letter è in Dè tha dol? But I’m not going to investigate at this time of night.
- Code in haste, blog at leisure. It’s taken longer to write these 300-odd words than it did to produce the code in the first place.
Available here: ForwardComposition.java
Coming soon: To be and to be (2).