Unless I indicate otherwise, all these examples are taken from Gareth King’s Intermediate Welsh (London: Routledge, 1996). The analyses are mine, as are the errors.

I don’t think I ever mastered the word mai, and reading up on it, I think it’s because I never mastered changes of word order. The verb doesn’t have to go first in the sentence. Take the title of Menna Elfyn’s Ibsen translation Y Fenyw Ddaeth o’r Môr, where the NP, ‘the woman’, comes before the dependent form of the verb, ddaeth not daeth. The opening stage directions have lots of PPs before the independent form of the verb, like this:

  • Ar y chwith mae feranda dan do llydan. ‘On the left there is a veranda under a broad roof.’
  • Yn y tu blaen, ac o gwmpas y tŷ, mae gardd. ‘In front, and around the house, is a garden.’
  • Islaw’r feranda, mae polyn baner. ‘Below the veranda, there is a flagpole.’

and so on. Now ordinary subordinate clauses, which I did get the hang of, look like this:

Dw i'n meddwl    fod                   Ron   yn dod yfory
--------------   -------------------   ---   ------------
S[n]/NP/S[sub]   S[sub]/S[asp]/NP/NP   NP    S[asp]/NP

which is the same as a declarative clause, except it can be an argument to meddwl or credu or another verb of thinking, feeling and so on. But what if we’re emphasizing Ron? Then we have the word mai before Ron before the dependent form of mae, which in this case is sy. So how do we handle this? There is a back door in CCG which is the unary type-changing rule. It’s not the done thing, but if I gather examples of them hopefully someone who understands these things better can refactor the grammar into a cleverer shape. Here are three type-changing rules, which add a feature FRONTED:

  • S[dep]/NP → S[dcl, +FRONTED]\NP (blocked for mae)
  • S[dep]/NP → S[dcl, +FRONTED]\S[n]/NP (not blocked for mae). Example: Gwaethygu mae’r sefyllfa yn Ne Ewrop.
  • PP → S[+FRONTED]/S. Example: Menna Elfyn’s scene setting above.

The idea here is that mai (and its South Walian counterpart taw) has the type S[sub]/S[dcl, +FRONTED], which is to say that it only takes a declarative clause if there’s something in front of the verb.
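As a toy illustration of that last point, here is a minimal Python sketch in which a clause is reduced to its kind plus the FRONTED feature. The names Clause, front_pp and mai are hypothetical, and the assumption that fronting leaves the kind of clause unchanged is mine; it models only the selection behaviour, not a real CCG parser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    kind: str              # "dcl", "dep" or "sub"
    fronted: bool = False  # the FRONTED feature added by the type-changing rules


def front_pp(clause: Clause) -> Clause:
    # The third rule, PP -> S[+FRONTED]/S, once the type-changed PP has
    # consumed its clause. Assumption: the result keeps the clause's kind.
    return Clause(clause.kind, fronted=True)


def mai(clause: Clause) -> Optional[Clause]:
    # mai : S[sub]/S[dcl, +FRONTED]
    # Only a declarative clause with something in front of the verb is accepted.
    if clause.kind == "dcl" and clause.fronted:
        return Clause("sub")
    return None


verb_first = Clause("dcl")       # an ordinary verb-initial declarative
pp_first = front_pp(verb_first)  # the same clause with a PP fronted

print(mai(verb_first))  # None
print(mai(pp_first))    # Clause(kind='sub', fronted=False)
```

The verb-initial clause is rejected and the fronted one is accepted, which is all the feature is meant to buy.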

That feels as if I’ve learnt something.

Posted in welsh | Leave a comment

Every one’s a clitic: a general treatment of one family of fused words in Welsh

I’ve been starting to look at Welsh through the lens of CCG, largely because, if I ever did manage to learn how to use words like mai, sydd, sef and bod (as a conjunction) correctly in my youth, I have forgotten now.

I have to know what’s going on in the simpler clauses that these words join together first, though. So far the analysis from Scottish Gaelic carries through (word order, verbal nouns being clauses of type S[n]/NP/NP or S[n]/NP, and particles like yn or wedi being type-changers), partly because I made sure to read up beforehand on how people have treated the verbal noun in Welsh. However, the example sentences I’ve been looking at have pronouns attached to clitic particles (hi’n), to articles (e’r) and to possessive pronouns (fe’ch).

This needn’t be a problem for dependency grammars, where you can have as many edges coming out of a single node as you like, but it looks tricky for constituency parsers, where you expect the sentence to be of the form VP NP but part of the fused word is in the VP and part of it is in the NP. At this stage it would be very easy to decide to change the tokenization rules so that e and ‘r are separate words, but one thing CCG is good at is assigning categories, possibly baroque and frightening ones, to words that reflect what the words do in a sentence.

Let’s take Rydyn nhw’n dod ‘they are coming’. dod is an intransitive verbal noun, which I take to be S[n]/NP. Rydyn is the independent verb ‘to be’, present tense, third person, and expects an NP for the subject and either an adjectival phrase or an aspectual phrase. I’ve written this as S[dcl]/S[asp]/NP/NP. On their own, nhw ‘they’ and yn (aspect marker) are NP and S[asp]/NP/S[n]/NP respectively. But what are they when combined? The way to answer this is to treat parsing the sentence as a mathematical puzzle. We know the solution is S[dcl], and at each stage of the proof we may make one of the moves CCG allows (application, substitution, type-raising or composition), so we can solve for Q in the derivation below. I had a hunch that backwards crossed composition combined with type-raising would be the way to go here. Let’s try type-raising dod first. We want a backslash so that we can try backwards crossed composition, Y/Z X\Y → X/Z.

Rydyn                nhw'n  dod
S[dcl]/S[asp]/NP/NP  Q      S[n]/NP
(try D = S[asp]/NP)

So, X = S[asp]/NP and Y = S[asp]/NP/S[n]/NP. Q = Y/Z. We know that X/Z = S[asp]/NP/NP, so…

S[asp]/NP/S[n]/NP/NP S[asp]/NP\S[asp]/NP/S[n]/NP
S[dcl] ∎
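The composition step can be checked mechanically. The following is a minimal Python sketch, not a full CCG parser: the classes Atom and Slash and the helper names are hypothetical, and the bracketing of the flat categories, for example reading S[asp]/NP/S[n]/NP as (S[asp]/NP)/(S[n]/NP), is my assumption. It covers only type-raising dod and composing it with the candidate Q, not the final combination with Rydyn.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Slash:
    result: "Cat"
    direction: str  # "/" looks right for its argument, "\\" looks left
    arg: "Cat"

    def __str__(self) -> str:
        def wrap(c: "Cat") -> str:
            return f"({c})" if isinstance(c, Slash) else str(c)
        return f"{wrap(self.result)}{self.direction}{wrap(self.arg)}"


Cat = Union[Atom, Slash]


def fwd(result: Cat, arg: Cat) -> Slash:
    return Slash(result, "/", arg)


def bwd(result: Cat, arg: Cat) -> Slash:
    return Slash(result, "\\", arg)


def type_raise_backward(x: Cat, t: Cat) -> Slash:
    # Backward type-raising: X becomes T\(T/X)
    return bwd(t, fwd(t, x))


def backward_crossed_compose(left: Cat, right: Cat) -> Optional[Cat]:
    # Backward crossed composition: Y/Z  X\Y  gives  X/Z
    if (isinstance(left, Slash) and left.direction == "/"
            and isinstance(right, Slash) and right.direction == "\\"
            and right.arg == left.result):
        return fwd(right.result, left.arg)
    return None


NP, S_ASP, S_N = Atom("NP"), Atom("S[asp]"), Atom("S[n]")
asp_phrase = fwd(S_ASP, NP)   # S[asp]/NP
dod = fwd(S_N, NP)            # S[n]/NP
yn = fwd(asp_phrase, dod)     # (S[asp]/NP)/(S[n]/NP)
nhwn = fwd(yn, NP)            # candidate Q = Y/Z with Z = NP

raised_dod = type_raise_backward(dod, asp_phrase)
print(backward_crossed_compose(nhwn, raised_dod))  # (S[asp]/NP)/NP
```

With Q set to the category of yn followed by /NP, the composition yields S[asp]/NP/NP, which is the X/Z the puzzle asks for.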

The first thing I want to observe is that this would be clearer if everything were coloured in. The second thing is that the type of nhw’n, if you take the type of nhw to be A and ‘n to be B, is B/A. This feels like the sort of result that is obvious to someone who is more proficient than me. But is it generalizable? Let’s try the simpler construction in Gwerthodd e’r oergell – ‘he sold the fridge’. Here e is NP, ‘r, the article, is NP/N, and oergell, an indefinite fridge, is N, so A = NP, B = NP/N and B/A = NP/N/NP.

Gwerthodd     e'r      oergell
S[dcl]/NP/NP  NP/N/NP  N
try type-raising with NP
Y = NP/N, X = NP, Z = NP
S[dcl] ∎
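Spelled out step by step, and assuming the flat category NP/N/NP is bracketed as (NP/N)/NP (my reading, not something the notation forces), the middle of this derivation would run roughly as follows:

```latex
\begin{align*}
\textit{oergell} &: N \;\Rightarrow\; NP \backslash (NP/N)
  && \text{backward type-raising with } T = NP \\
\textit{e'r} &: (NP/N)/NP
  && Y/Z \text{ with } Y = NP/N,\ Z = NP \\
\textit{e'r oergell} &: NP/NP
  && Y/Z \quad X \backslash Y \;\to\; X/Z \text{ with } X = NP
\end{align*}
```

This shows only how e'r and oergell combine by backwards crossed composition; the last step with Gwerthodd is left as in the derivation above.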

I think that’s a result. Next up: look into Lambek’s product operator and sort out what’s going on with the eich… chi construction.

Posted in welsh | Leave a comment

Geàrr Ghràmar na Gàidhlig

Tha mi air a bhith a’ leughadh Geàrr Ghràmar na Gàidhlig le Richard A. V. Cox. Tha e glè dhlùth, mhionaideach is 492 duilleagan a dh’fhaide is e anns a’ Ghàidhlig air fad. Mar sin tha sanas bhriathar ann is tha na teirmichean teicnigeach nas soilleire na anns a’ Bheurla. Dè tha apocope, syncope is aphaeresis a’ ciallachadh? Teasgadh deiridh, teasgadh meadhain is teasgadh toisich.

I have been reading Richard A. V. Cox’s Geàrr Ghràmar na Gàidhlig (‘Short Grammar of Gaelic’). It’s very dense, very detailed and 492 pages long, not to mention entirely in Gaelic. So there is a glossary of the technical vocabulary, which is generally easier to work out than the corresponding vocabulary in English: apocope, syncope and aphaeresis are teasgadh deiridh, teasgadh meadhain and teasgadh toisich.

Posted in grammar | Leave a comment


The immediate family members in Scottish Gaelic are màthair, athair, bràthair, all of which are clearly related to other familiar European languages, and piuthar, “sister”, which looks odd. Irish is yet odder at first glance, with deartháir meaning “brother” and deirfiúr meaning “sister”.

I’ve been reading David Stifter’s Sengoidelc, a readable and reassuring text about Old Irish, the written Irish of the 8th and 9th centuries, which contains at least part of the explanation. It turns out that in Old Irish there were two kinds of s. One of them lenited by turning into an h, much as s in Gaelic becomes sh, pronounced /h/, but the other one turned into an f, and the main word that began with that sort of s was siur, meaning “sister”.

What seems to have happened in Scotland is that the nominative case form was back-derived from the lenited form phiur and assumed to be piur. Conversely in Ireland the nominative form won out, and they say siúr, but mainly, I think, for non-biological sisters, like nurses and nuns. A further difference here: Scotland retains the disyllabic form, whereas in Ireland it’s been simplified to a long vowel.

But why in Ireland do they say deartháir and deirfiúr for your biological siblings? Enter eDIL, the Electronic Dictionary of the Irish Language, which has entries for derbráthair and derbsiur, “true brother” and “true sister” respectively.

Posted in Uncategorized | Leave a comment

Second Celtic Language Technology Workshop revised deadline April the 20th

… which is next Wednesday rather than this Friday. Or if you’re in the UK or Ireland it’s very early next Thursday, but clearly nobody reading this would leave submission till the last moment. No.

Posted in conferences | Leave a comment


(1) Dìreach aona mìos deug roimhe sin…

“Just eleven months before that”. In my annotation guidelines I have blithely stated “Attributive numbers are N/N”, which is fine for aona, but less so for deug, which I am going to treat as N\N. And yet in trì deug mìle it seems fair enough.

(2) Bha Gàidhlig ga bruidhinn air feadh Alba anns an aona linn deug.

(3) Bha sin ann an naoi ceud deug, fichead ‘s a ceithir.

Years and centuries are interesting. In (2), anns an aona … deug means “in the eleventh”, an ordinal, as opposed to the other examples where deug means “ten”. In (3) the heads look like ceud, fichead and ceithir, so each of these can be N too.

Different rules apply, however, for the personal numbers (aonar, dithis, triùir and so on), because if they are not standing on their own they are followed by a noun in the genitive, for example dithis chloinne (“two children”), where dithis is N and chloinne is N\N.

Posted in grammar | Leave a comment

Second Celtic Language Technology Workshop deadline April the 15th

I have partly been quiet here because I have been hard at work putting together something for this:
and clearly I should not prejudice the double-blindness of the refereeing too much. Ahem.

Posted in conferences | Leave a comment

Resumptivity resumed

I said (four years ago) that Gaelic doesn’t have resumptive pronouns. However, while scouring William Lamb’s Scottish Gaelic for unusual uses of agus, I found these examples, in which the resumptive bits are air in the first and the possessive a (before mhàthair) in the second.

  • sin an gille a shuidh Cèit air (that is the boy who Kate sat upon) (do not try this at home)
  • sin an gille a tha a mhàthair bochd (that is the boy whose mother is ill)

Now, in dictionaries air in the first example is indeed treated as a pronoun, though for subcategorization purposes I prefer to treat it as a PP. The second case, a as possessive pronoun, I’ve been treating as a pronoun, so on my own account what I said about Gaelic was wrong. It may of course be a determiner. The evidence for this, off the top of my head, is that unlike the small class of prenominal adjectives (deagh, droch, sàr and so on), the possessives mo, do, a and so on can’t co-occur with the article an or with gach, and that unlike nouns in the genitive they go before the possessed noun rather than after it. Pronoun or determiner, they have type N/N in categorial grammar.

Apparently there are resumptive pronouns in Irish, but I don’t have enough Irish to make sense of the literature I’ve seen on the subject, so I shall stop here.

Posted in grammar | 2 Comments

Interrogative frequencies in DASG

One aspect of Gaelic I want to look at more closely is interrogatives. Just as all the wh- words in English (who, when, why, what, how) go to the front of the sentence, so do all the c- words in Gaelic and the word order in the rest of the sentence changes as well. This is not universal, however. In Chinese, one simply substitutes the word for ‘what’ in the ordinary sentence order, just as when we’re particularly surprised in English we might say “You ate what?”.

In order to see how they work exactly, we need example sentences, so I’ve been looking in DASG. One easy first step is to look at frequencies in this table:

Interrogative  Count  English  Observations
cò             9122   who      noisy; lots of prefixes and parts of words
ciod           4587   what
cia            2363   how      also cia mar in older texts, cia fhad ‘how long’, cia mhòr ‘how big’
dè             403    what     also ‘God’
ciamar         273    how
càit           182    where    also genitive of cat meaning ‘cat’
carson         133    why
càite          90     where
cuin           59     when
cuine          15     when

These are the results of accent-insensitive searches, as the older texts haven’t had their spelling modernized or made consistent. The results surprised me a great deal for a number of reasons. Firstly, ciod ‘what’, which I don’t recall seeing terribly often in the present day, is the most numerous interrogative, mostly occurring in a single document, a history of Scotland. One of the very first words you learn in Gaelic is its modern counterpart dè, which only has about 200 (judged by eye) instances as an interrogative in DASG. This is a similar number to càit(e), carson, cuin(e) and ciamar: ‘where’, ‘why’, ‘when’ and ‘how’. Secondly, there is the enormous number of hits for cia ‘how’, which on a cursory inspection are often exclamations (‘how swift’, ‘how long’, ‘how horrible’) or an old spelling of ciamar, in addition to the more familiar cia mheud ‘how many’. Thirdly, nearly all of the instances of  meaning ‘what’ are from a single work, Saoghal Bana-mharaiche, describing the Gaelic from the coast of Easter Ross.
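For anyone wanting to try the same kind of count on their own text, here is a minimal Python sketch of an accent-insensitive frequency count. It is not how DASG’s search is implemented, which I don’t know; the function names and the sample sentence are made up for illustration.

```python
import re
import unicodedata
from collections import Counter


def strip_accents(text: str) -> str:
    # Decompose accented letters (NFD) and drop the combining marks,
    # so that "càit" and "cait" compare equal.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def count_forms(text: str, forms: list[str]) -> dict[str, int]:
    # Strip accents before tokenizing so that whole words stay in one piece.
    tokens = re.findall(r"\w+", strip_accents(text.lower()))
    counts = Counter(tokens)
    return {form: counts[strip_accents(form.lower())] for form in forms}


# Made-up sample mixing accented and unaccented spellings.
sample = "Càit a bheil thu? Cait a bheil an cat? Carson nach eil? Cuin a thig e?"
print(count_forms(sample, ["càit", "carson", "cuin", "ciamar"]))
# {'càit': 2, 'carson': 1, 'cuin': 1, 'ciamar': 0}
```

Matching whole tokens rather than substrings avoids counting prefixes and parts of words, though it would still conflate homographs such as the genitive of cat.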

I’ll leave you with a new meaning I’d never seen before for gu. This can be gu the preposition, gu the subordinator (as in gu bheil), gu the aspect marker or gu the adverbializer, but Gu dè tha thu? from DASG31, Ugam agus bhuam, is clearly none of these. As explained here, what is going on is this: the Gaelic for ‘what’ used to be ciod e, like the Irish cad é, and over time this became dè. Gu dè is a variant of this. It’s another one of those pesky multiword expressions.

[Edit 2015-01-03 to clarify reason for looking at interrogatives and add another meaning of gu.]

Posted in grammar, preliminaries | Leave a comment

DASG and the second comparative

If you haven’t come across Dachaigh airson Stòras na Gàidhlig/Digital Archive of Scottish Gaelic, you should stop reading this and go straight there.

Welcome back. It contains eight and a half million words and is a resource I keep coming back to. In my first investigation, I’m looking for the second comparative, which I had never seen before last weekend. Here’s an example:

Is feairrde na stamagan srubag dheth

(The stomachs are better for a wee drink in them.) It’s explained in Gillies’ Elements of Scottish Gaelic Grammar as differing from the normal comparative (“Xer”) in that it means “Xer by that” or “Xer because of that”. If you search for a word, DASG gives you a concordance so you can look at the local context of words.

Some second comparatives in DASG: feairrd, feairrde, misd, bigid, lughaid. An ambiguous word that might be a second comparative: mòid. I look forward to a POS-tagged version of DASG.

Posted in grammar | 2 Comments