An interesting case of coordination

A few weeks ago I spotted this from @BBCAimsir (the weather in Gaelic) on Twitter:


which said (just in case the embedding stops working at some future date):

Tha i blàth agus sinn air 20C a ruighinn an Glaschu agus na Criochan.

Literally “It is warm and we have reached 20 degrees Celsius in Glasgow and the Borders”. What is interesting about it is that it’s coordinating two non-constituents, in English “it… warm” and “we… reached”. This is the sort of thing that CCG is good at handling.

I wonder how common non-constituent coordination like this is in Gaelic, though?

Posted in grammar

Celtic Language Technology Workshop at COLING 2014

There hasn’t been an enormous amount of work done on the Celtic languages in the fields of computational linguistics or natural language processing, so I was very pleased to see that COLING this year has a workshop on them: http://fionlive2.dcu.ie/cltw2014/

I was even more pleased, and really rather surprised, to have a short paper accepted. More on this soon, but for the full story you’ll have to come to Dublin in August.

Many thanks to one of the organizers, Teresa Lynn, who drew my attention to this meeting in the first place.

Posted in publications

What particles do

Most words in categorial grammar are functions. In English, a transitive verb such as “eats” is a function that takes two NP arguments and gives you a clause, S, back. The notation for this is (S\NP)/NP: it first takes the object NP to its right and then the subject NP to its left. (Aside: This is rather like defining a function in a programming language, except that void isn’t a type.)

What does this functional approach tell us about particles like a and chan? To answer this I’ll need to set out the different sorts of clauses I’ve seen in Scottish Gaelic. The notation here is based on that of CCGbank, which itself is based on that of the Penn Treebank, and I’ve marked the new clause types as such.

  • S[adj]: predicative adjective. Example: snog in Tha i snog.
  • S[dcl]: ordinary declarative sentence. Tha i snog.
  • S[q]: polar question. A bheil i snog?
  • (new) S[neg]: negative declarative. Chan eil i snog.
  • (new) S[negq]: negative polar question. Nach eil i snog?
  • S[wh]: wh-question. Ciamar a tha thu?
  • (new) S[n]: verbal noun-headed small clause. iarraidh cofaidh in Tha mi ag iarraidh cofaidh.
  • S[em]: embedded declarative. a tha thu in Ciamar a tha thu?
  • (new) S[dep]: dependent verb-headed clause. bheil i snog in A bheil i snog?
  • (new) S[a]: a-infinitive. a bhith a’ dannsadh

The five new ones need some explanation. S[neg] and S[negq] are motivated by the clear fourfold division of ordinary sentences into positive, interrogative, negative and interrogative negative. S[n], relating as it does to a verbal noun, replaces S[ed], S[pss] and S[ng] in the CCGbank scheme for English. S[a] is somewhat like S[to] in the CCGbank scheme, but not exactly the same, as it contains a verbal noun somewhere. Lastly, S[dep] presents a phenomenon we simply don’t get in English.

So what do particles do here? Let’s take a few examples from last week’s An Litir Bheag:

Cha|S[neg]/S[dep]
robh|S[dep]/NP/S[adj]
an|NP/N
riaghaltas|N
toilichte|S[adj]
.|.

Here cha is a function mapping a dependent clause to a negative sentence.
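The idea that a particle is a function from a dependent clause to some other clause type can be sketched in a few lines of Python. This is only an illustration: the names PARTICLES and apply_particle are made up for the example, only the entry for cha comes from the derivation above, and the entries for chan, a and nach are assumptions made by analogy with the clause types listed earlier.

```python
# Hypothetical sketch: each particle has category RESULT/S[dep], i.e. it
# looks for a dependent clause to its right and returns a clause of type RESULT.
PARTICLES = {
    "cha": "S[neg]",    # from the derivation above: Cha|S[neg]/S[dep]
    "chan": "S[neg]",   # assumption: form of cha seen in "Chan eil i snog"
    "a": "S[q]",        # assumption, by analogy: "A bheil i snog?"
    "nach": "S[negq]",  # assumption, by analogy: "Nach eil i snog?"
}


def apply_particle(particle: str, clause: str) -> str:
    """Forward application: RESULT/S[dep] followed by S[dep] gives RESULT."""
    if clause != "S[dep]":
        raise ValueError(f"{particle!r} expects S[dep], got {clause}")
    return PARTICLES[particle.lower()]


# "robh an riaghaltas toilichte" is taken to have already reduced to S[dep].
print(apply_particle("Cha", "S[dep]"))
```

The point of keeping the particles in one table is that they all consume the same argument type and differ only in what they return.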

Ach|S[dcl]/S[dcl]
tha|S[dcl]/NP/PP
Dòmhnall|NP/NP
MacRath|NP
air|PP/S[n]
a|NP
chuimhneachadh|S[n]\NP
ann|PP/NP
an|NP/NP
Leòdhas|NP
fhathast|S[dcl]\S[dcl]
,|,
agus|conj
gu|(S[dcl]\S[dcl])/S[adj]
dearbh|S[adj]
air|PP/NP
feadh|NP
na|(NP\NP)/(N\N)
Gàidhealtachd|N\N
,|,
airson|PP/NP
na|NP/S[dcl]
rinn|S[dcl]/NP
e|NP
às|PP/NP
leth|NP
nan|(NP\NP)/(N\N)
daoine|N\N
.|.

There is a lot going on there. I’ve thought of adverbs as taking a sentence in and giving you a sentence back. Hence gu, when it serves to make an adverb out of an adjective, takes S[adj] as its argument and gives you a function that takes an S[dcl] and gives you back an S[dcl]. na, as in “that which”, is a function that takes an S[dcl] and gives you an NP back. I’m using shorthands for conjunctions and PPs, but these are both described in the literature.
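As a rough illustration of the adverb and na analyses, here is a minimal Python sketch of forward and backward application. The class names Fwd and Bwd and the two helper functions are invented for this example; the categories themselves are the ones used in the derivation above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Fwd:
    # X/Y: takes a Y on its right and gives back X.
    result: "Cat"
    arg: "Cat"


@dataclass(frozen=True)
class Bwd:
    # X\Y: takes a Y on its left and gives back X.
    result: "Cat"
    arg: "Cat"


Cat = Union[str, Fwd, Bwd]


def forward(left: Cat, right: Cat) -> Optional[Cat]:
    # X/Y  Y  =>  X
    if isinstance(left, Fwd) and left.arg == right:
        return left.result
    return None


def backward(left: Cat, right: Cat) -> Optional[Cat]:
    # Y  X\Y  =>  X
    if isinstance(right, Bwd) and right.arg == left:
        return right.result
    return None


ADVERB = Bwd("S[dcl]", "S[dcl]")  # S[dcl]\S[dcl], e.g. fhathast
GU = Fwd(ADVERB, "S[adj]")        # (S[dcl]\S[dcl])/S[adj]
NA = Fwd("NP", "S[dcl]")          # NP/S[dcl], "that which"
RINN = Fwd("S[dcl]", "NP")        # S[dcl]/NP

gu_dearbh = forward(GU, "S[adj]")             # gu + dearbh: an adverb
modified = backward("S[dcl]", gu_dearbh)      # sentence + adverb: still S[dcl]
na_rinn_e = forward(NA, forward(RINN, "NP"))  # na + (rinn + e): an NP
```

Both helpers return None when the rule does not apply, so a failed combination can be told apart from a successful one.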

Potential point for discussion: I’ve treated ag, a’, air, gu and ri when they introduce verbal nouns as PP/S[n]. But maybe they should be a clause type of their own. Needs more thought.

Posted in grammar

What the meaning of “is” is

This is the Scottish Gaelic is, often pronounced and written ‘s, not the English “is”. It’s a copula, and you can say things like Is mise Càilean, or ‘S math sin, but usually the constructions are more complicated than that, and those two examples are rare.

We have the clefted construction Is + e + NP + (for example) a tha + PP[ann] to equate the NP and the innards of the PP, where e is pretty much an expletive like a lot of uses of “it” and “there” in English.

There are “quirky” constructions where the object looks like a subject, and the subject is expressed with a PP. Is toil leam biadh innseanach and Is toil leam a bhith a’ dannsadh are examples, meaning “I like Indian food” and “I like dancing” respectively (examples from Teach Yourself Gaelic, 2nd edn). My list so far of the words that can go in the toil slot, and what sort of PP they take, is this:

  • PP[le]: toil (n), toigh (adj), caomh (adj), fhèarr (adj), mhath (adj)
  • PP[air]: beag (adj), lugha (adj)
  • PP[do]: fhiach (adj), urrainn (n), chòir (n), aithne (n), àbhaist (n), mhiann (n)

I expect there are more! To the best of my (admittedly very limited) knowledge, a difference between Scottish and Irish Gaelic is that Irish Gaelic only takes adjectives in the toil slot. The words in this slot vary somewhat in what sort of clausal complements they take, which is a matter for another blog post.

The other important construction with is is where it’s followed by ann in order to emphasize something that doesn’t normally go in that position, a bit like 把 in Chinese. This is very often a PP, for example from here: ‘s ann às an Fhraing is Ameireagaidh a tha ise “It is from France and America she is from”. I think ann here is really the fused PP for ann + e.

In summary:

  • Is + NP + NP (rare)
  • Is + ADJ + NP (also rare)
  • Is + N[toil]/ADJ[toil] + PP + SUBJ
  • Is + e + NP + a BI + PP[ann]
  • Is + ann + PP/ADJ/ADV/NP[temporal] + a BI + PP[ann]

Have I missed any?

Posted in grammar

Hope, expectation, responsibility

Even though bi is the verb for “to be”, you can’t usually use it with two noun phrases, certainly not to say that one of them is the other. But there is a class of nouns that go quite happily with another noun as arguments of bi. I think what might be going on is that they’re being used adverbially, like an diugh (today) or an làthair (present). Let’s take this phrase from the Scotsman (source) a few years ago (slightly edited because Johnston Press have mislaid their diacritics):

Thuirt am Ministear a tha an urra ris a’ Ghàidhlig, Peter Peacock:

“Said the minister responsible for Gaelic, Peter Peacock:” is what this means. It’s a clefted construction, as is so often the case in Gaelic and Irish. Tha am Ministear an urra ris a’ Ghàidhlig “The minister is responsible for Gaelic” would be the unclefted version.

Another example from the same piece:

Tha mi an dòchas gum bi duine làidir ann a sheasas suas riutha, a sheasas airson na Gàidhlig, airson nan Gàidheal ‘s an aghaidh an riaghaltais ma tha sin a dhìth.

“I hope that there will be strong people who will stand up for them, stand for Gaelic, for the Gaels and against the government if need be.” This is unclefted and clearer than the previous sentence. At the very beginning we have tha, mi, and an dòchas gu… as the verb and two noun phrases.

And one from the BBC:

Chuir ministear eile aig Eaglais na h-Alba fios chun na h-eaglaise gu bheil e an dùil fàgail air sgàth cùis nam ministearan gèidh.

“Another minister in the Church of Scotland has sent word to the church that he expects to leave on account of the matter of gay clergy.” Here we have bheil, the dependent form of bi, followed by e, “he” and an dùil fàgail, “the expectation to leave”.

So that means that bi fits the following patterns (out of my head and double-checked with William Lamb’s Scottish Gaelic):

  1. bi + NP + PP: for expressing locations, for possession, for many verbal constructions if we take ag/a’ and friends to be prepositions (otherwise there is a 1b: bi + NP + AspP), and for linking two nouns: tha mi nam oileanach and ‘s e oileanach a th’annam
  2. bi + NP + ADJ: tha sinn toilichte, tha i brèagha and so on
  3. bi + NP + ADV[loc]: tha an cat a-staigh
  4. bi + NP + NP[dòchas]: the examples we’ve seen above and a few more. Wilson McLeod on Twitter has helpfully pointed out that dùil and urra (as shown above), eisimeil and crochadh are in this set of nouns.

I wonder whether there are any more? I will keep looking.

Posted in grammar

horsey compounds

For example the adjectival suffix -each was interpreted as the noun each (an old word for ‘horse’), resulting a large number of unusual horsey compounds.

(from Elaine Uí Dhonnchadha’s PhD thesis, http://doras.dcu.ie/2349/)

Each is the normal Scottish Gaelic word for horse; Irish prefers capall, which in Scotland means “mare” and has been borrowed into English: capercaillie comes from capall-coille, or “mare of the woods”.

Posted in irish gaelic

Dependency structures in Irish Gaelic

Quick note to say that Teresa Lynn at DCU has been working on a project based on dependency treebanks for Irish. This is relevant to this blog because Irish Gaelic is very closely related to Scottish Gaelic and much of the grammar is similar, and there has also been work in the past (Clark and Curran 2007, Table 2, for example) on deriving dependency structures from CCG lexical structures.

Here are two papers I’ve had a quick look at:

Posted in irish gaelic, other people's code

Resources present and future

Excitingly, William Lamb at the University of Edinburgh tells me (in the comments on this earlier post) that he has been funded by Bòrd na Gàidhlig to work on a tagset and corpus for Scottish Gaelic.

I have been delighted to be pointed to his 2003 Scottish Gaelic (2nd edn, Lincom Europa, Munich), which is exactly the sort of book I have been looking for. Worth careful study.

Posted in grammar

Ambiguity everywhere

Much of the basic grammatical machinery of Gaelic consists of overloaded words. This is nothing unusual, of course; in English, for example, to is both a preposition and the marker of the infinitive, but there seems to be an awful lot of it going on in Gaelic. One of the more striking examples is an. This can be:

  • the definite article: an t-eilean
  • an interrogative particle: An do chòrd e riut?
  • the interrogative form of is: An toil leat ball-coise?
  • a possessive pronoun (their): an càr

Do has several meanings too:

  • a possessive pronoun (your): do bhaidhseagal 
  • a preposition: do Ghlaschu
  •  a past-tense marking particle: An do chòrd e riut?

A has at least the following meanings, and there may well be some I’ve missed:

  • numerical particle: a h-aon
  • vocative particle: a Mhàiri
  • the infinitive particle: an uinneag a dhùnadh
  • an interrogative particle: A bheil thu a’ dannsadh?
  • two possessive pronouns (her and his): a chàr, a h-athair
  • relative particle: Dè an t-ainm a tha ort?

not to mention its homophonous friend a’:

  • definite article: anns a’ chidsin
  • the participle particle: Tha mi a’ dol

If I want to start part-of-speech-tagging Gaelic text, as a preliminary to building a grammar, I’m going to need to write some guidelines as to when each of these words is what.
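A first step towards such guidelines might be a plain lookup table of the candidate readings for each overloaded word, leaving the choice between them to later rules. The sketch below is only that: the names CANDIDATES and candidate_tags are hypothetical, the labels are informal descriptions copied from the lists above rather than a real tagset, and nothing here attempts disambiguation.

```python
from typing import Dict, List

# Candidate readings copied from the lists above (informal labels, not a tagset).
CANDIDATES: Dict[str, List[str]] = {
    "an": [
        "definite article",
        "interrogative particle",
        "interrogative form of is",
        "possessive pronoun (their)",
    ],
    "do": [
        "possessive pronoun (your)",
        "preposition",
        "past-tense particle",
    ],
    "a": [
        "numerical particle",
        "vocative particle",
        "infinitive particle",
        "interrogative particle",
        "possessive pronoun (her or his)",
        "relative particle",
    ],
    "a'": [
        "definite article",
        "participle particle",
    ],
}


def candidate_tags(token: str) -> List[str]:
    # Normalise case and the typographic apostrophe before looking the word up.
    key = token.lower().replace("\u2019", "'")
    return CANDIDATES.get(key, [])


for word in ["An", "do", "a", "a\u2019", "eilean"]:
    print(word, len(candidate_tags(word)))
```

A word that is not in the table simply gets an empty list, which marks it as not being one of the overloaded function words.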

 

Posted in grammar

It’s fine

This confused me, so I mention it in case it confuses anyone else.

If predicative adjectives have type S[adj]\NP (because they come after the noun), NPs have type NP and the predicative copula has type (S[dcl]/(S[adj]\NP))/NP, then how do we cope with sentences that only have one NP? Where I went astray was assuming that if you have a word of type X/Y, then there has to be a Y somewhere in that sentence.

Not true! Tha i brèagha  “it’s fine” (talking about the weather) is a good and simple example.

Tha i brèagha
V NP ADJ
(S[dcl]/(S[adj]\NP))/NP NP S[adj]\NP
S[dcl]/(S[adj]\NP) S[adj]\NP
S[dcl]

In this case, tha is of type (X/Y)/Z: it applies forward to the Z on its right, and the result then applies forward to the Y to the right of that. It just so happens that Y is a non-atomic type.
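The derivation above can be replayed mechanically. The sketch below uses a made-up tuple encoding of categories and a hypothetical forward_apply helper; what it shows is that the argument S[adj]\NP is matched as a whole, so the NP inside it is never looked for separately.

```python
# Categories as nested tuples (slash, result, argument); atoms are strings.
ADJ = ("\\", "S[adj]", "NP")             # S[adj]\NP
THA = ("/", ("/", "S[dcl]", ADJ), "NP")  # (S[dcl]/(S[adj]\NP))/NP


def forward_apply(fn, arg):
    # X/Y  Y  =>  X, where Y may itself be a complex category.
    if isinstance(fn, tuple) and fn[0] == "/" and fn[2] == arg:
        return fn[1]
    raise ValueError("forward application does not apply")


tha_i = forward_apply(THA, "NP")      # tha + i
sentence = forward_apply(tha_i, ADJ)  # (tha i) + brèagha
print(sentence)
```

Trying forward_apply(tha_i, "NP") raises an error, which is the mistaken expectation described above: the remaining argument is the whole adjective category, not a second NP.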

Now I’ve understood this I can worry about more complicated things.

Posted in grammar