Internationalization support for additional Indian languages
Summary
The Government of India has given the 22 "languages of the 8th Schedule" the status of official language. These languages are Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Manipuri/Meithei, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu and Urdu. Fedora already supports most of these languages, but some, such as Manipuri, Dogri, Bodo and Santali, are still missing. This feature provides internationalization support (fonts, input methods, locales) for all of these languages.
Owner
- Name: User:pnemade
- Name: Pravin Satpute
- Email: pnemade@redhat.com
- Email: psatpute@redhat.com
Current status
- Targeted release: Fedora 17
- Last updated: 2012-02-17
- Percentage of completion: 80%
Detailed Description
Of the 22 official Indian languages, Fedora presently supports Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu and Maithili. The remaining languages and their numbers of speakers as per the 2001 census are as follows:
Language (number of speakers, in millions) - Script
1) Bodo (1.4 million) - Devanagari
2) Dogri (2.3 million) - Devanagari
3) Konkani (2.5 million) - Devanagari
4) Manipuri (1.5 million) - Bengali
5) Nepali (2.9 million) - Devanagari
6) Sanskrit (0.01 million) - Devanagari
7) Santali (6.5 million) - Devanagari
8) Urdu (52 million) - Perso Arabic
For the last couple of years the Fedora internationalization team has been working towards complete support for all 22 official Indian languages, and a lot of work has already been done. See:
https://fedoraproject.org/wiki/Features/Sindhi
https://fedoraproject.org/wiki/Features/Kashmiri
https://fedoraproject.org/wiki/Features/Maithili
https://fedoraproject.org/wiki/Features/Konkani
In the Fedora 17 release, the Fedora i18n team aims to reach this milestone and provide font, input method and locale support for all 22 official Indian languages.
Benefit to Fedora
1) We will know what is still missing in Fedora for supporting these languages.
2) I18n support for the 22 official Indian languages.
3) With i18n support in place, users can start localization for the newly supported languages.
4) The large communities speaking these languages can use Fedora to work in their own language.
Supporting all 22 official Indian languages is a major milestone for Fedora. Over the last 5 years a lot of work has been done for the missing languages,
for example:
- Getting information from linguists about what is missing for complete language support.
- Proposals to Unicode for adding missing characters for these languages.
- Standardization of input methods.
- Preparing CLDR data.
This is the first time this milestone is being achieved in open source.
Scope
Developers need to gather locale information for these languages from the community, collect feedback on missing ligatures in fonts, and develop input methods. This will not affect other components.
This will need changes in the following packages:
1) Glibc localedata (locales are needed for the Dogri, Santali and Manipuri languages)
- These locales are now in F17 and Rawhide.
2) Bug fixes in Lohit and other fonts to add language-specific ligatures.
- Upstream release with bug fixes by 21st Feb.
3) Fixes in the fontconfig .orth files, so that the proper fonts get selected for each language.
- Bug reporting is in progress.
4) This feature is needed, as it brings input methods for the missing languages: https://fedoraproject.org/wiki/Features/Inscript2_Keymaps
- This is 100% complete and approved for Fedora 17.
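The fontconfig orthography (`.orth`) files mentioned in the package list describe which Unicode code points a font is expected to cover before fontconfig treats it as usable for a language. The sketch below only illustrates that idea. It assumes a simplified orth-like text made of hexadecimal code points, `start-end` ranges and `#` comments, skips `include` lines, and uses a made-up sample rather than the contents of any real fontconfig file. The names `parse_orth` and `missing_code_points` are hypothetical helpers, not fontconfig APIs.

```python
def parse_orth(text):
    """Return the set of code points listed in simplified orth-style text."""
    required = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or line.startswith("include"):
            continue  # 'include' directives are ignored in this sketch
        start, _, end = line.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first
        required.update(range(first, last + 1))
    return required


def missing_code_points(required, font_coverage):
    """List required code points that the font does not cover."""
    return sorted(required - set(font_coverage))


# Illustrative sample only, not copied from a real orth file.
SAMPLE = """
0905-0914   # Devanagari independent vowels
0915-0939   # Devanagari consonants
094d        # virama
"""

required = parse_orth(SAMPLE)
pretend_font = set(range(0x0900, 0x0940))  # hypothetical font lacking U+094D
print([hex(cp) for cp in missing_code_points(required, pretend_font)])
```

A font that leaves required code points uncovered is the kind of case the ligature and coverage bug reports above are meant to catch.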
How To Test
The following tests should work:
1. yum groupinstall <lang>-support
2. Select the regional language from System->Settings, then log out and log in again.
3. $ fc-match :lang="" (put the language code between the quotes; should return some font)
4. From the ibus menu, select the input method for the language.
5. Open a web page with content in these languages in Firefox.
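The font check in step 3 can be repeated for every language with a short script. This is a minimal sketch, assuming that the fontconfig language tag is the part of the locale code before the underscore (for example `brx` for `brx_IN`). The helper names `lang_tag`, `fc_match_command` and `check_fonts` are hypothetical and not part of any Fedora tool.

```python
import shutil
import subprocess

# Locale codes taken from the Documentation table on this page.
LOCALES = ["brx_IN", "doi_IN", "kok_IN", "mni_IN",
           "ne_NP", "sa_IN", "sat_IN", "ur_IN"]


def lang_tag(locale_code):
    """Return the language part of a locale code, e.g. 'brx' for 'brx_IN'."""
    return locale_code.split("_", 1)[0]


def fc_match_command(locale_code):
    """Build the fc-match invocation from test step 3 for one language."""
    return ["fc-match", ":lang=" + lang_tag(locale_code)]


def check_fonts(locale_codes):
    """Run fc-match per language and return {locale code: output line}."""
    results = {}
    for code in locale_codes:
        proc = subprocess.run(fc_match_command(code),
                              capture_output=True, text=True)
        results[code] = proc.stdout.strip()
    return results


if __name__ == "__main__":
    if shutil.which("fc-match") is None:
        print("fc-match not found; install fontconfig first")
    else:
        for code, font in check_fonts(LOCALES).items():
            print(code, "->", font or "(no font returned)")
```

Because fontconfig falls back to a default font when nothing better matches, a non-empty result alone does not prove language support. Comparing the output with the default font packages listed in the Documentation section gives a stronger check.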
User Experience
1. End users can log in to the system using their respective locales.
2. End users will get input methods and fonts for their respective languages.
Dependencies
1. glibc (a commit by the upstream developers is required)
2. fontconfig (a commit by the upstream developers is required)
3. inscript2
Contingency Plan
In case we do not get all the required components for an additional language, we can ship whatever is available.
Documentation
Language | Locale code | Default Font package | Input method | fontconfig support |
Bodo | brx_IN | lohit-devanagari-fonts | brx-inscript2-deva.mim | no |
Dogri | doi_IN | lohit-devanagari-fonts | doi-inscript2-deva.mim | no |
Konkani | kok_IN | lohit-devanagari-fonts | kok-inscript2-deva.mim | yes |
Manipuri | mni_IN, mni_IN@bengali | lohit-bengali-fonts, tabish-eeyek-fonts | mni-inscript2-beng.mim, mni-inscript2-mtei.mim | no |
Nepali | ne_NP | madan-fonts | ne-inscript2-deva.mim | yes |
Sanskrit | sa_IN | lohit-devanagari-fonts | sa-inscript2.mim | yes |
Santali | sat_IN@devanagari, sat_IN | | sat-inscript2-deva.mim, sat-inscript2-olck.mim | no |
Urdu | ur_IN | paktype-naqsh-fonts | ur-phonetic.mim | yes |
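The locale codes in the table follow the glibc naming pattern `language_TERRITORY@modifier`, where the optional modifier here selects a script variant (as in `sat_IN@devanagari`). The following sketch shows how such a code splits into its parts. `parse_locale` and `LocaleParts` are hypothetical names written for this page, not a glibc or Python API, and the optional codeset suffix (such as `.UTF-8`) is not handled.

```python
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    language: str
    territory: Optional[str]
    modifier: Optional[str]


def parse_locale(code: str) -> LocaleParts:
    """Split 'language[_TERRITORY][@modifier]' into its components."""
    base, _, modifier = code.partition("@")
    language, _, territory = base.partition("_")
    return LocaleParts(language, territory or None, modifier or None)


# Codes taken from the table above.
for code in ["brx_IN", "mni_IN@bengali", "sat_IN@devanagari", "ne_NP"]:
    print(code, "->", parse_locale(code))
```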
Release Notes
- Provides internationalization support (fonts, input methods, locales) for the additional languages Bodo, Dogri, Konkani, Manipuri, Nepali, Sanskrit, Santali and Urdu.