Here they are, along with their counts in Wikipedia (as of the 25th of May 2011), their parts of speech (this is slightly fluid as I work out the tagset for the difficult ones), and a gloss in English where that’s doable.
01 an 42853 DT, COP, PRON, PARTICLE "the", "Is the?", "their", "?"
02 a 27516 REL, INFINITIVIZER, PRON "that", "to", "his", "her"
03 na 21328 DT "the", "of the"
04 ann 17545 P "in"
05 tha 16505 COP "am", "are", "is"
06 e 16141 COP "he", "is"
07 a' 15155 AG (does not translate)
08 agus 14273 CONJ "and"
09 air 13113 P3S/P "on"
10 am 7038 DT "the"
11 's 6717 COP
12 anns 6091 P "in"
13 bha 5385 COP "was"
14 is 5022 COP/CONJ "and"
15 gu 4715 P/COMPLEMENTIZER "to"
16 aig 4317 P "at"
17 le 4209 P "with"
18 de 4121 P "of"
19 mar 3597 CONJ "as", "like"
20 seo 2999 PRON "this"
21 sin 2848 PRON "that"
22 ri 2834 P "to"
23 nan 2716 DT "of the"
24 as 2682 P or COMPARATIVE MARKER "from"
25 baile 2648 N "town"
26 chaidh 2456 V "went"
27 ach 2310 CONJ "but"
28 iad 2242 PRON "they"
29 airson 2158 P "for"
30 do 2005 P/PRON/PAST TENSE PARTICLE "to"
31 bho 1939 P "from"
32 i 1795 PRON "she"
33 a-mach 1792 ADV "out"
34 san 1791 P+DT "in the"
35 daoine 1781 N "people"
36 eadar 1772 P "between"
37 bàsaichean 1741 N "deaths"
38 neo 1636 CONJ "or"
39 tachartasan 1634 N "events"
40 h-alba 1604 N "Scotland"
41 brèithean 1582 N "judgements"
42 mu 1550 P "about"
43 linn 1549 N "century"
44 leis 1538 P3S or P "with it" or "with" before article
45 bhaile 1484 N "town"
46 no 1472 CONJ "nor"
47 ceanglaichean 1380 N "links"
48 den 1375 P+DT "of the"
49 eile 1371 JJ "other"
50 dhe 1355 P "off"
51 bheil 1339 COP "is", "are"
52 suidhichte 1303 JJ "arranged"
53 sa 1297 P+DT "in the"
54 gun 1255 P "without"
55 ris 1249 PP3S or P "to him" or "to" before article
56 aige 1237 PP3S "at him"
57 cuideachd 1231 ADV "also"
58 robh 1219 COP "was"
59 iomraidhean 1194 N "references"
60 tuath 1192 N "north"
61 fuireach 1187 N "stay"
62 dùthcha 1149 N "of the country"
63 aonaichte 1099 JJ "united"
64 taobh 1072 N "side"
65 duais 1048 N "prize"
66 nam 1029 DT "of the"
67 motha 1022 JJR "larger"
68 roinn 1016 N "region"
69 às 1006 P "from"
70 nuair 1004 CONJ "when"
71 iar 997 N "west"
72 far 968 CONJ "where"
73 tachartan 967 N "events"
74 eil 961 COP "is", "are"
75 aon 948 NUM "one"
76 duine 946 N "person"
77 bhith 936 COP
78 eilean 921 N "island"
79 fhèin 920 PRON "oneself"
80 alba 917 N "Scotland"
81 stàitean 905 N "states"
82 breithean 887 N "births", "judgements"
83 deas 885 N "south"
84 bhliadhna 880 N "year"
85 chan 872 NEGP/COP "not"
86 mòr 871 JJ "big"
87 dùthaich 869 N "country"
88 ainm 820 N "name"
89 th' 813 COP "am", "are", "is"
90 dè 805 WH "what"
91 gach 805 JJ "every"
92 prìomh-bhaile 802 N "capital"
93 ag 785 AG
94 nach 771 COP? "is not"
95 è 764 PRON "he"
96 ainmeil 764 JJ "famous"
97 bhon 761 P+DT "from the"
98 b' 751 COP "was"
99 nas 741 COMPARATIVE MARKER "more"
100 cho 732 P "as"
Source for the bits I didn’t know: Stòr-dàta. Source for the mistakes: Colin Batchelor.
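For anyone who wants to build a similar list, here is a rough sketch of how token counts like these could be produced from plain text. It is not necessarily how the figures above were obtained. The token pattern and the function names `tokenize` and `top_tokens` are my own assumptions: they lowercase everything, keep internal hyphens (a-mach, h-alba) and keep a leading or trailing apostrophe ('s, a', th'), which matches the shape of the entries in the list. The sample sentence is made up for illustration.

```python
import re
import unicodedata
from collections import Counter

# Assumed token shape: runs of letters, optionally joined by hyphens,
# with an optional apostrophe at the start or the end.
TOKEN_RE = re.compile(r"['’]?[^\W\d_]+(?:-[^\W\d_]+)*['’]?")


def tokenize(text):
    """Return lowercase word tokens, keeping hyphens and edge apostrophes."""
    # NFC keeps accented vowels such as "à" as single characters, so the
    # letter class in TOKEN_RE does not split a word at a combining accent.
    text = unicodedata.normalize("NFC", text).lower()
    # Fold typographic apostrophes so that a’ and a' are counted together.
    return [tok.replace("’", "'") for tok in TOKEN_RE.findall(text)]


def top_tokens(text, n=100):
    """Return the n most frequent tokens as (token, count) pairs."""
    return Counter(tokenize(text)).most_common(n)


if __name__ == "__main__":
    # Made-up sample text; a real run would read a Wikipedia text dump.
    sample = "Tha an cù a' ruith a-mach às a' bhaile agus tha e ann an Alba."
    for token, count in top_tokens(sample, 5):
        print(token, count)
```

One limitation of this pattern is that a word in single quotation marks keeps its quote marks as if they were apostrophes, so a real run would probably need to strip quotation marks first.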