The immediate family members in Scottish Gaelic are màthair, athair, bràthair, all of which are clearly related to their counterparts in other familiar European languages, and piuthar, “sister”, which looks odd. Irish is yet odder at first glance, with deartháir meaning “brother” and deirfiúr meaning “sister”.

I’ve been reading David Stifter’s Sengoidelc, a readable and reassuring text about Old Irish, the written Irish of the 8th and 9th centuries, which contains at least part of the explanation. It turns out that in Old Irish there were two kinds of s. One of them lenited by turning into an h, a bit like s in Gaelic becoming sh pronounced /h/, but the other one turned into an f, and the main word that began with that sort of s was siur, meaning “sister”.

What seems to have happened in Scotland is that the nominative case form was back-derived from the lenited form phiur and assumed to be piur. Conversely in Ireland the nominative form won out, and they say siúr, but mainly, I think, for non-biological sisters, like nurses and nuns. A further difference here: Scotland retains the disyllabic form, whereas in Ireland it’s been simplified to a long vowel.

But why in Ireland do they say deartháir and deirfiúr for your biological siblings? Enter eDIL, the Electronic Dictionary of the Irish Language, which has entries for derbráthair and derbsiur, “true brother” and “true sister” respectively.


(1) Dìreach aona mìos deug roimhe sin…

“Just eleven months before that”. In my annotation guidelines I have blithely stated “Attributive numbers are N/N”, which is fine for aona, but less so for deug, which I am going to treat as N\N. And yet in trì deug mìle it seems fair enough.

(2) Bha Gàidhlig ga bruidhinn air feadh Alba anns an aona linn deug.

(3) Bha sin ann an naoi ceud deug, fichead ’s a ceithir.

Years and centuries are interesting. In (2), anns an aona … deug means “in the eleventh”, as opposed to the other examples where deug means “ten”. In (3) the heads look like ceud, fichead and ceithir, so each of these can be N too.

Different rules apply, however, for the personal numbers aonar, dithis, triùir and so on, because if they are not standing on their own they are followed by a noun in the genitive, for example dithis chloinne (“two children”), where dithis is N and chloinne is N\N.
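To make the type assignments above concrete, here is a minimal sketch of forward and backward application over just the categories N, N/N and N\N. The tuple encoding, the function names and the strictly left-to-right reduction are my own simplifications for illustration, not part of any gdbank tooling.

```python
# Categories: the atom "N", or a tuple (result, slash, argument).
N = "N"
N_FWD = ("N", "/", "N")   # N/N: looks for an N to its right
N_BWD = ("N", "\\", "N")  # N\N: looks for an N to its left


def combine(left, right):
    """Combine two adjacent categories, or return None if they do not fit."""
    # Forward application: X/Y followed by Y gives X.
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    # Backward application: Y followed by X\Y gives X.
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


def derive(categories):
    """Reduce a sequence of categories strictly left to right."""
    result = categories[0]
    for category in categories[1:]:
        result = combine(result, category)
        if result is None:
            return None
    return result


# aona (N/N) mìos (N) deug (N\N)
print(derive([N_FWD, N, N_BWD]))  # N
# dithis (N) chloinne (N\N)
print(derive([N, N_BWD]))         # N
```

Left-to-right reduction is only one possible bracketing; a real categorial parser would try the alternatives as well.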

Resumptivity resumed

I said (four years ago) that Gaelic doesn’t have resumptive pronouns. However, while scouring William Lamb’s Scottish Gaelic for unusual uses of agus, I found these examples, in which the resumptive bits are air and a respectively.

  • sin an gille a shuidh Cèit air (that is the boy who Kate sat upon) (do not try this at home)
  • sin an gille a tha a mhàthair bochd (that is the boy whose mother is ill)

Now, in dictionaries air in the first example is indeed treated as a pronoun, though for subcategorization purposes I prefer to treat it as a PP. The second case, a as possessive pronoun, I’ve been treating as a pronoun, so on my own account what I said about Gaelic was wrong. It may of course be a determiner. The evidence for this off the top of my head is that, unlike the small class of prenominal adjectives deagh, droch, sàr and so on, the possessives mo, do, a and so on can’t co-occur with the article an or with gach, and that unlike nouns in the genitive they go before the possessed noun rather than after it. Pronoun or determiner, they have type N/N in categorial grammar.

Apparently there are resumptive pronouns in Irish, but I don’t have enough Irish to make sense of the literature I’ve seen on the subject, so I shall stop here.

Interrogative frequencies in DASG

One aspect of Gaelic I want to look at more closely is interrogatives. Just as all the wh- words in English (who, when, why, what, how) go to the front of the sentence, so do all the c- words in Gaelic, and the word order in the rest of the sentence changes as well. This is not universal, however. In Chinese, one simply substitutes the word for ‘what’ in the ordinary sentence order, just as when we’re particularly surprised in English we might say “You ate what?”.

In order to see how they work exactly, we need example sentences, so I’ve been looking in DASG. One easy first step is to look at frequencies in this table:

Interrogative  Count  English  Observations
cò             9122   who      noisy; lots of prefixes and parts of words
ciod           4587   what
cia            2363   how      also cia mar in older texts, cia fhad ‘how long’, cia mhòr ‘how big’
dè             403    what     also ‘God’
ciamar         273    how
càit           182    where    also genitive of cat meaning ‘cat’
carson         133    why
càite          90     where
cuin           59     when
cuine          15     when

These are the results of accent-insensitive searches, as the older texts haven’t had their spelling modernized or made consistent. The results surprised me a great deal for a number of reasons. Firstly, ciod ‘what’, which I don’t recall seeing terribly often in the present day, is the most numerous interrogative apart from the noisy cò, mostly occurring in a single document, a history of Scotland. One of the very first words you learn in Gaelic is its modern counterpart dè, which only has about 200 (judged by eye) instances as an interrogative in DASG. This is a similar number to càit(e), carson, cuin(e) and ciamar, ‘where’, ‘why’, ‘when’ and ‘how’. Secondly, the enormous number of hits for cia ‘how’, which on a cursory inspection are often exclamations, ‘how swift’, ‘how long’, ‘how horrible’, or an old spelling of ciamar, in addition to the more familiar cia mheud ‘how many’. Thirdly, nearly all of the instances of dè meaning ‘what’ are from a single work, Saoghal Bana-mharaiche, describing the Gaelic from the coast of Easter Ross.
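To show why an accent-insensitive search picks up so much noise, here is a minimal sketch of accent folding with Python’s standard unicodedata module. I don’t know how DASG implements its search, so the folding rule and the token list below are assumptions made up purely for illustration.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after decomposition, e.g. 'dè' -> 'de'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold(text):
    """Fold away both accents and case so that spelling variants compare equal."""
    return strip_accents(text).casefold()


def count_matches(query, tokens):
    """Count tokens equal to the query once accents and case are ignored."""
    target = fold(query)
    return sum(1 for token in tokens if fold(token) == target)


# Hypothetical tokens, not taken from DASG.
tokens = ["Dè", "Dé", "de", "càit", "cait", "cò", "ciod"]

print(count_matches("dè", tokens))    # 3: the interrogative, 'God' and plain 'de'
print(count_matches("càit", tokens))  # 2: 'where' and the genitive of cat
```

The same folding that rescues inconsistent older spellings also merges unrelated words, which is why the counts in the table have to be checked by eye.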

I’ll leave you with a new meaning I’d never seen before for gu. This can be gu the preposition, gu the subordinator (as in gu bheil), gu the aspect marker or gu the adverbializer, but Gu dè tha thu? from DASG31, Ugam agus bhuam, is clearly none of these. As explained here, what is going on is this: the Gaelic for ‘what’ used to be ciod e, like the Irish cad é, and over time this became dè. Gu dè is a variant of this. It’s another one of those pesky multiword expressions.

[Edit 2015-01-03 to clarify reason for looking at interrogatives and add another meaning of gu.]

DASG and the second comparative

If you haven’t come across Dachaigh airson Stòras na Gàidhlig/Digital Archive of Scottish Gaelic you should stop reading this and go straight there.

Welcome back. It contains eight and a half million words and is a resource I keep coming back to. In my first investigation, I’m looking for the second comparative, which I had never seen before last weekend. Here’s an example:

Is feairrde na stamagan srubag dheth

(The stomachs are better for a wee drink in them.) It’s explained in Gillies’ Elements of Scottish Gaelic Grammar as differing from the normal comparative (“Xer”) in that it means “Xer by that” or “Xer because of that”. If you search for a word, DASG gives you a concordance so you can look at the local context of words.

Some second comparatives in DASG: feairrd, feairrde, misd, bigid, lughaid. An ambiguous word that might be a second comparative: mòid. I look forward to a POS-tagged version of DASG.

Training a dependency parser on gdbank

A very quick note to say that I’ve trained MaltParser, a dependency parser, with the current gdbank sentences (a mere 1223 tokens spread across 70-odd sentences), the Universal POS tagging scheme and the current Universal-ish gdbank dependency annotation scheme, and then seen how it performed on an unseen test set of 8 sentences containing 276 tokens taken from an article in The Scotsman from a few years ago.

It got 196 (71%) of the heads right, 207 (75%) of the dependency types right, and both the head and the dependency were right in 187 (68%) of cases. My initial impression is that the main problems are subordinators and my having mis-POS-tagged a few words, but there will be a confusion matrix soon.
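For anyone who wants to check the arithmetic, here is a small sketch that turns the counts above into percentages. The labels UAS and LAS (unlabelled and labelled attachment score) are the usual names for the first and third figures; the function name is just mine.

```python
def attachment_scores(total, heads_right, labels_right, both_right):
    """Turn raw counts over the test tokens into accuracy fractions."""
    return {
        "UAS (head right)": heads_right / total,
        "label accuracy (dependency type right)": labels_right / total,
        "LAS (head and type right)": both_right / total,
    }


# Counts reported for the 276-token test set.
scores = attachment_scores(total=276, heads_right=196,
                           labels_right=207, both_right=187)

for name, value in scores.items():
    print(f"{name}: {value:.1%}")
# UAS (head right): 71.0%
# label accuracy (dependency type right): 75.0%
# LAS (head and type right): 67.8%
```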

MaltParser cheat mode

If you train MaltParser using the learnwo flowchart in place of learn, it does all the same things, except that it writes out the sentences as it reads them in.

This means that if you have, ahem, misformatted any of your input, you can see exactly which misformatting MaltParser is complaining about, because it will be in the first sentence that hasn’t been written to stdout.
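A complementary trick is to check the input yourself before MaltParser ever sees it. The sketch below assumes the training data is in tab-separated CoNLL-X format with ten columns per token line and blank lines between sentences; that format, the function name and the sample annotation are my assumptions for illustration, so adjust expected_columns if your data differs.

```python
def find_bad_lines(lines, expected_columns=10):
    """Return (line_number, column_count) for every malformed token line."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\n")
        if not stripped.strip():
            continue  # a blank line just separates sentences
        columns = stripped.split("\t")
        if len(columns) != expected_columns:
            problems.append((number, len(columns)))
    return problems


# Hypothetical sample: the last token line has lost three columns.
sample = [
    "1\tBha\tbi\tVERB\tVERB\t_\t0\troot\t_\t_\n",
    "2\tsin\tsin\tPRON\tPRON\t_\t1\tnsubj\t_\t_\n",
    "\n",
    "1\tTha\tbi\tVERB\t_\t0\troot\n",
]

print(find_bad_lines(sample))  # [(4, 7)]
```

To run it over a real file, pass the open file object in place of sample, since iterating over a file yields its lines.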

Installing MaltParser on Mac OS X 10.6.8

MaltParser is a dependency parser and it’s available here: http://www.maltparser.org/download.html

If you try to run the ready-built jar under Mac OS X 10.6.8 and you haven’t updated to Java 1.7, you’ll get a major.minor version number error. However, if you simply edit references in the build.xml file to read 1.6, and type

ant dist

to build with ant, then it will whirr away for a bit and build fine.