From Fedora Project Wiki
Latest revision as of 21:13, 19 September 2016
Linguistic Components
1. Language Support Matrix (glibc upwards)
Language Code | Language | hunspell | hyphen | mythes | notes |
aa | Afar | afarfriends.org hosted ALSEC report. | |||
af | Afrikaans | hunspell-af | hyphen-af | ||
am | Amharic | hunspell-am | |||
an | Aragonese | www.iea.es, see Spain: Lexicography In Iberian Languages | |||
ar | Arabic | hunspell-ar | experimental thesaurus | ||
as | Assamese | hunspell-as | hyphen-as | xobdo is another potential source, possibly even for a thesaurus, but this isn't an option apparently at the moment. | |
ast | Asturian | hunspell-ast | dictionary announcement | ||
az | Azeri (Latin) | hunspell-az | |||
be | Belarusian | hunspell-be | hyphen-be | ||
ber | Amazigh (Tifinagh) | hunspell-ber | |||
ber | Amazigh (Latin) | ||||
bg | Bulgarian | hunspell-bg | hyphen-bg | mythes-bg | |
bn | Bengali | hunspell-bn | hyphen-bn | ||
bo | Tibetan | bo.openoffice.org. Latest language support update. | |||
br | Breton | hunspell-br | |||
bs | Bosnian | hunspell-bs | hyphen-bs | ||
byn | Blin | Blin Orthography: A History and an Assessment | |||
ca | Catalan | hunspell-ca | hyphen-ca | mythes-ca | |
crh | Crimean Tatar | A corpus | translation team | ||
cs | Czech | hunspell-cs | hyphen-cs | mythes-cs | |
csb | Kashubian | hunspell-csb | |||
cv | Chuvash | hunspell-cv | |||
cy | Welsh | hunspell-cy | hyphen-cy | ||
da | Danish | hunspell-da | hyphen-da | mythes-da | |
de | German | hunspell-de | hyphen-de | mythes-de | |
dv | Dhivehi | ||||
dz | Dzongkha | crubadan corpus building | Some requests for help/info. | ||
el | Greek | hunspell-el | hyphen-el | mythes-el | |
en | English | hunspell-en | hyphen-en | mythes-en | |
es | Spanish | hunspell-es | hyphen-es | mythes-es | |
et | Estonian | hunspell-ee | hyphen-et | ||
eu | Basque | hunspell-eu | hyphen-eu | ||
fa | Farsi | hunspell-fa | hyphen-fa | ||
fi | Finnish | The Finnish community has a parallel Voikko solution, with an enchant backend, an OpenOffice.org extension, and a Firefox extension. | |||
fil | Filipino | hunspell-tl | Filipino is effectively an official Tagalog-based language | ||
fo | Faeroese | hunspell-fo | hyphen-fo | ||
fr | French | hunspell-fr | hyphen-fr | mythes-fr | |
fur | Friulian | hunspell-fur | |||
fy | Frisian | hunspell-fy | |||
ga | Irish | hunspell-ga | hyphen-ga | mythes-ga | |
gd | Scots Gaelic | hunspell-gd | |||
gez | Ge'ez | Ge'ez Frontier Foundation | |||
gl | Galician | hunspell-gl | hyphen-gl | ||
gu | Gujarati | hunspell-gu | hyphen-gu | ||
gv | Manx | hunspell-gv | |||
ha | Hausa | available, but no license mentioned. In private communication: "We will specify licenses for the next release of the spell checkers. In the meantime, assume both Hausa and Eʋegbe have the GNU GPLv3 license as well." | |||
he | Hebrew | hunspell-he | info on hyphenation | ||
hi | Hindi | hunspell-hi | hyphen-hi | Hindi Wordnet is likely convertible, claims to have similar format as English Wordnet, which is the basis of mythes-en | |
hne | Chhattisgarhi | corpus building | |||
hr | Croatian | hunspell-hr | hyphen-hr | This hasn't been updated in a number of years; on a purely orthographical basis, dict-sr might provide a better option | |
hsb | Upper Sorbian | hunspell-hsb | hyphen-hsb | ||
ht | Haitian Creole | hunspell-ht | |||
hu | Hungarian | hunspell-hu | hyphen-hu | mythes-hu | |
hy | Armenian | hunspell-hy | |||
id | Indonesian | hunspell-id | hyphen-id | ||
ig | Igbo | crubadan corpus building | www.dictionary.kasahorow.com | ||
ik | Inupiaq | Broken download link to MSWord dictionary | Iñupiaq parser project, Finite-State Morphology for Iñupiaq | ||
is | Icelandic | hunspell-is | hyphen-is | ||
it | Italian | hunspell-it | hyphen-it | mythes-it | |
iu | Inuktitut | www.livingdictionary.com | |||
ja | Japanese | ||||
ka | Georgian | Crubadan is aware of 29023 words | ka.openoffice.org Some info on spellchecking the language. | ||
kk | Kazakh | hunspell-kk | |||
kl | Kalaallisut | Greenlandic parser project. MSWord checker. | |||
km | Khmer | hunspell-km | |||
kn | Kannada | hunspell-kn | hyphen-kn | ||
ko | Korean | hunspell-ko | |||
kok | Konkani | online dictionary (http://www.savemylanguage.org/) | |||
ks | Kashmiri | online dictionary | |||
ku | Kurdish (Latin) | hunspell-ku | hyphen-ku | ||
ku | Kurdish (Arabic) | some info | |||
kw | Cornish | crubadan corpus building | Fedora Cornish Language Translation Project | ||
ky | Kyrgyz | hunspell-ky | OOo localization beginnings. Orthography news | ||
lg | Luganda | crubadan corpus building | A general translation effort. An online dictionary | ||
li | Limburgish | crubadan corpus building | |||
lo | Lao | Lao OOo localization | |||
lt | Lithuanian | hunspell-lt | hyphen-lt | ||
lv | Latvian | hunspell-lv | hyphen-lv | mythes-lv | |
mai | Maithili | hunspell-mai | maithiliacademy.org | ||
mg | Malagasy | hunspell-mg | mg is equivalent to mlg, which is a macrolanguage; see plt for "Standard Malagasy" | ||
mi | Maori | hunspell-mi | hyphen-mi | mythes-mi | |
mk | Macedonian | hunspell-mk | convertible | ||
ml | Malayalam | hunspell-ml | hyphen-ml | ||
mn | Mongolian | hunspell-mn | hyphen-mn | ||
mr | Marathi | hunspell-mr | hyphen-mr | ||
ms | Malay | hunspell-ms | no content, but a project announcement for Malaysian thesaurus etc. | ||
mt | Maltese | hunspell-mt | |||
my | Burmese | online dictionary | |||
nan | Min Nan | online dictionary? Debian wiki notes | |||
nb | Bokmaal | hunspell-nb | hyphen-nb | mythes-nb | |
nds | Lowlands Saxon | hunspell-nds | |||
ne | Nepali | hunspell-ne | mythes-ne | ||
nl | Dutch | hunspell-nl | hyphen-nl | mythes-nl | |
nn | Nynorsk | hunspell-nn | hyphen-nn | mythes-nn | |
nr | Ndebele (Southern) | hunspell-nr | |||
nso | Sotho (Northern) | hunspell-nso | |||
oc | Occitan | hunspell-oc | |||
om | Oromo | hunspell-om | Oromo details | ||
or | Oriya | hunspell-or | hyphen-or | ||
pa | Punjabi | hunspell-pa | hyphen-pa | ||
pap | Papiamentu/Papiamento | Papiamentu work in progress | The supported glibc locale is pap_AN. Spelling rules differ between the Papiamentu and Papiamento groupings. Papiamentu: Curaçao and Bonaire, current members of the Netherlands Antilles, territory code AN. Papiamento: Aruba (former member of the Netherlands Antilles), territory code AW; crubadan Papiamento corpus building. | ||
pl | Polish | hunspell-pl | hyphen-pl | mythes-pl | |
ps | Pashto | possible contact | |||
pt | Portuguese | hunspell-pt | hyphen-pt | mythes-pt | |
ro | Romanian | hunspell-ro | hyphen-ro | mythes-ro | |
ru | Russian | hunspell-ru | hyphen-ru | mythes-ru | |
rw | Kinyarwanda | hunspell-rw | |||
sa | Sanskrit | An apparent effort to create a Sanskrit hunspell dictionary | hyphen-sa | ||
sc | Sardinian | hunspell-sc | intended dictionaries, launchpad page (https://launchpad.net/ditzionariusardu) | ||
sd | Sindhi | available | |||
se | Sami, Northern | hunspell-se | watch this space | ||
shs | Secwepemctsin | hunspell-shs | Secwepemctsín word bank work in progress. Note that it's trivial to create a simple wordlist-based hunspell dictionary, e.g. with wordlist2hunspell | ||
si | Sinhala | hunspell-si | Another very small wordlist | ||
sid | Sidamo | Some info | |||
sk | Slovak | hunspell-sk | hyphen-sk | mythes-sk | |
sl | Slovenian | hunspell-sl | hyphen-sl | mythes-sl | |
so | Somali | hunspell-so | |||
sq | Albanian | hunspell-sq | |||
sr | Serbian | hunspell-sr | hyphen-sr | ||
ss | Swati | hunspell-ss | |||
st | Sotho (Southern) | hunspell-st | |||
sv | Swedish | hunspell-sv | hyphen-sv | mythes-sv | |
ta | Tamil | hunspell-ta | hyphen-ta | ||
te | Telugu | hunspell-te | hyphen-te | ||
tg | Tajik | An apparent effort to create a Tajik hunspell dictionary | |||
th | Thai | hunspell-th | |||
ti | Tigrigna | hunspell-ti | |||
tig | Tigre | crubadan corpus building | |||
tk | Turkmen | hunspell-tk | hyphen-tk | ||
tl | Tagalog | hunspell-tl | |||
tn | Tswana | hunspell-tn | |||
tr | Turkish | available, but, like Finnish with Voikko, the typical solution for Turkish has been the Zemberek library, with an enchant backend, an OpenOffice.org extension, and a Firefox extension | |||
ts | Tsonga | hunspell-ts | |||
tt | Tatar | available, but it is difficult to see where this came from originally and exactly what license applies, GPLv2+ (?). Perhaps it is an original work of ALT Linux, which would make that the canonical upstream? | available, but it is difficult to see where this came from originally and exactly what license applies, GPLv2+ (?). Perhaps it is an original work of ALT Linux, which would make that the canonical upstream? | ||
ug | Uyghur | www.uyghurdictionary.org www.uighur.jp | |||
uk | Ukrainian | hunspell-uk | hyphen-uk | mythes-uk | |
ur | Urdu | hunspell-ur | |||
uz | Uzbek | hunspell-uz | |||
ve | Venda | hunspell-ve | |||
vi | Vietnamese | hunspell-vi | |||
wa | Walloon | hunspell-wa | |||
wo | Wolof | www.alfanet.anafa.org make Wolof localizations of Firefox and Abiword. www.dictionary.kasahorow.com | |||
xh | Xhosa | hunspell-xh | |||
yi | Yiddish | hunspell-yi | |||
yo | Yoruba | Some apparent efforts (and older info) to create a Yoruba hunspell dictionary | www.dictionary.kasahorow.com | ||
zh | Chinese | Would these (convertible) TeX rules be universally meaningful for Chinese text? | |||
zu | Zulu | hunspell-zu | hyphen-zu |
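The note in the shs row above observes that a simple wordlist-based hunspell dictionary is easy to create. As a minimal sketch of that idea (this is not the wordlist2hunspell script itself, whose behaviour is not described on this page), the following Python writes a .dic file, whose first line is the entry count followed by one word per line, and a bare .aff file that only declares the encoding. The function name and the sample words in the usage note are hypothetical.

```python
from pathlib import Path


def wordlist_to_hunspell(words, basename, outdir="."):
    """Write <basename>.dic and <basename>.aff from an iterable of words."""
    # Deduplicate and sort, dropping blank entries.
    unique = sorted({w.strip() for w in words if w.strip()})
    out = Path(outdir)
    dic_path = out / f"{basename}.dic"
    aff_path = out / f"{basename}.aff"
    # The first line of a .dic file is the (approximate) number of entries.
    dic_path.write_text(
        "\n".join([str(len(unique)), *unique]) + "\n", encoding="utf-8"
    )
    # A minimal affix file: no affix rules, only the character encoding.
    aff_path.write_text("SET UTF-8\n", encoding="utf-8")
    return dic_path, aff_path
```

Calling `wordlist_to_hunspell(["alpha", "beta"], "xx")` would produce `xx.dic` and `xx.aff` in the current directory. A dictionary built this way has no affix rules, so every inflected form has to appear in the wordlist itself.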
2. Language Support Matrix (extra languages recognized by OOo but not in glibc)
Language Code | Language | hunspell | hyphen | mythes | notes | |
ak | Akan | hunspell-ak | www.dictionary.kasahorow.com | |||
az | Azeri (Cyrillic) | transliteration table | ||||
bm | Bambara | Online Dictionary | ||||
buc | Bushi | |||||
brx | Bodo | xobdo is a potential source, but this isn't an option apparently at the moment. Another Online Dictionary | ||||
cop | Coptic | hunspell-cop | experimental convertible TeX rules | |||
dgo | Dogri | Central Institute for Indian Languages | ||||
dsb | Lower Sorbian | hunspell-dsb | ||||
ee | Ewe | available, but no license mentioned. In private communication: "We will specify licenses for the next release of the spell checkers. In the meantime, assume both Hausa and Eʋegbe have the GNU GPLv3 license as well." | online dictionary | |||
eo | Esperanto | hunspell-eo | needs more love to be convertible | |||
fj | Fijian | hunspell-fj | ||||
grc | Ancient Greek | hunspell-grc | hyphen-grc | |||
gsc | Gascon | Non-Commercial BY-NC-ND license | ||||
gug | Guarani | crubadan corpus building | ||||
haw | Hawaiian | hunspell-haw | ||||
hil | Hiligaynon | hunspell-hil | ||||
ia | Interlingua | hunspell-ia | hyphen-ia | |||
ki | Gikuyu | available | ||||
ksf | Bafia | work in progress empty dictionary page | ||||
la | Latin | hunspell-la | hyphen-la | |||
lb | Luxembourgish | hunspell-lb | mythes-lb | |||
ln | Lingala | hunspell-ln | ||||
ltg | Latgalian | available | Latgalian resources | |||
mos | Mossi | hunspell-mos | info. dictionary effort (hunspell has no problem with utf-8 .dic files FWIW) | |||
mni | Manipuri | some info | ||||
ny | Nyanja | hunspell-ny | ||||
plt | Malagasy, Plateau | hunspell-mg | Standard Malagasy | |||
qu | Quechua Ecuador | hunspell-qu | ||||
quh | Quechua South Bolivia | hunspell-quh | ||||
qul | Quechua North Bolivia | current effort | ||||
rm | Raeto-Romance/Romansh | Things are a bit messy, as there's a group of R[h]aeto-Romance languages, but SIL maps the ISO 639-1 code rm to ISO 639-3 roh, and Ethnologue documents the Swiss official orthography for roh as Rumantsch Grischun, so that's the probable best fit for this. Dicziunari Rumantsch Grischun | ||||
rue | Rusyn | |||||
sat | Santali | English<->Santali dictionaries online dictionary | ||||
sdc | Sardinian, Sassarese | intended dictionaries, launchpad page (https://launchpad.net/ditzionariusardu) | ||||
sdn | Sardinian, Gallurese | intended dictionaries, launchpad page (https://launchpad.net/ditzionariusardu) | ||||
sg | Sango | www.dictionary.kasahorow.com | ||||
sjd | Sami, Kildin | Northern Sami | ||||
sma | Sami, Southern | Northern Sami | ||||
smj | Sami, Lule | hunspell-smj | watch this space | Northern Sami | ||
smn | Sami, Inari | Northern Sami | ||||
sms | Sami, Skolt | Northern Sami | ||||
src | Sardinian, Logudorese | intended dictionaries, launchpad page (https://launchpad.net/ditzionariusardu) | ||||
sro | Sardinian, Campidanese | intended dictionaries, launchpad page (https://launchpad.net/ditzionariusardu) | ||||
sw | Swahili | hunspell-sw | ||||
swb | Maore | swb information | ||||
tet | Tetum | hunspell-tet | ||||
tpi | Tok Pisin | hunspell-tpi | ||||
ty | Tahitian | crubadan corpus building |
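The package names in the two matrices follow a regular pattern (hunspell-, hyphen-, or mythes- plus the language code), with a few rows that list a package named after a different code: fil lists hunspell-tl, plt lists hunspell-mg, and et lists hunspell-ee. A minimal sketch of that lookup follows; the helper name is hypothetical, and it only builds candidate names without checking that a package actually exists.

```python
# Rows in the matrices whose hunspell package is named after another code.
HUNSPELL_OVERRIDES = {
    "fil": "hunspell-tl",  # the Filipino row lists the Tagalog dictionary
    "plt": "hunspell-mg",  # the Plateau Malagasy row lists hunspell-mg
    "et": "hunspell-ee",   # as listed in the Estonian row
}


def candidate_packages(code):
    """Return candidate package names for a language code (hypothetical helper)."""
    return {
        "hunspell": HUNSPELL_OVERRIDES.get(code, f"hunspell-{code}"),
        "hyphen": f"hyphen-{code}",
        "mythes": f"mythes-{code}",
    }
```

Many rows have no hyphen or mythes entry at all, so the result should be treated as a list of names to look for rather than a list of packages known to exist.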
3. Obsolete/Useless codes (glibc)
Language Code | Language | notes |
iw | Hebrew | Obsoleted by he |
no | Norwegian | Effectively obsoleted by nb |
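The obsolete codes in this table can be normalized before looking up a language in the matrices. A minimal sketch, using only the two mappings listed here and a hypothetical function name:

```python
# Obsolete glibc codes and their replacements, taken from the table above.
OBSOLETE_CODES = {"iw": "he", "no": "nb"}


def normalize_code(code):
    """Map an obsolete language code to its replacement; others pass through."""
    return OBSOLETE_CODES.get(code, code)
```

For example, `normalize_code("iw")` gives `"he"`, while a current code such as `"cy"` is returned unchanged.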