From Fedora Project Wiki


Internationalization support for additional Indian languages

Summary

The government of India has given 22 "languages of the 8th Schedule" the status of official language. These languages are Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Manipuri/Meithei, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu and Urdu. Fedora already supports most of these languages, but some are still missing, such as Manipuri, Dogri, Bodo and Santali. This feature provides internationalization support (fonts, input method, locale) for all of these languages.

Owner


  • Email: pnemade@redhat.com
  • Email: psatpute@redhat.com

Current status

  • Targeted release: Fedora 17
  • Last updated: 2012-03-05
  • Percentage of completion: 100%

Detailed Description

Of the 22 official Indian languages, Fedora presently supports Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu and Maithili. The remaining languages, with their number of speakers as per the 2001 census, are as follows:

Language (number of speakers, in millions) - Script

1) Bodo (1.4 million) - Devanagari

2) Dogri (2.3 million) - Devanagari

3) Konkani (2.5 million) - Devanagari

4) Manipuri (1.5 million) - Bengali

5) Nepali (2.9 million) - Devanagari

6) Sanskrit (0.01 million) - Devanagari

7) Santali (6.5 million) - Devanagari

8) Urdu (52 million) - Perso-Arabic

Over the last couple of years, the Fedora internationalization team has been working toward complete support for all 22 official Indian languages, and much of this work has already been done. See:

https://fedoraproject.org/wiki/Features/Sindhi

https://fedoraproject.org/wiki/Features/Kashmiri

https://fedoraproject.org/wiki/Features/Maithili

https://fedoraproject.org/wiki/Features/Konkani

In the Fedora 17 release, the Fedora i18n team aims to reach this milestone and provide font, input method and locale support for all 22 official Indian languages.


Benefit to Fedora

1) We will learn what is still missing in Fedora for supporting these languages.

2) I18n support for all 22 official Indian languages.

3) With i18n support in place, users can start localization work for the newly supported languages.

4) These large language communities can use Fedora to work in their own languages.


Supporting all 22 official Indian languages is a major milestone for Fedora. Over the last 5 years, a lot of work has been done for the missing languages, for example:

  • Getting information from linguists about what is missing for complete language support.
  • Proposals to Unicode for adding missing characters for these languages.
  • Standardization of input methods.
  • Preparing CLDR data.


This is the first time this milestone is being achieved in open source.

Scope

Developers need to gather locale information for these languages from the community, collect feedback on missing ligatures in fonts, and develop input methods. Other components will not be affected.

This will need changes in the following packages:

1) glibc localedata (locales were needed for the Dogri, Santali and Manipuri languages)

  • These locales are now in F17 and Rawhide.

2) Bug fixes in Lohit and other fonts to add language-specific ligatures.

  • An upstream release with the bug fixes was made on 29th Feb; the build is now available in Fedora 17.

3) Fixes in the fontconfig .orth files, so that the proper fonts get selected for each language.

  • Bugs are fixed upstream; waiting for a release.

4) The feature https://fedoraproject.org/wiki/Features/Inscript2_Keymaps is needed, as it brings input methods for the missing languages.

  • This is 100% complete, approved, and in Fedora 17 now.

How To Test

The following tests should work:

1. yum groupinstall <lang>-support

2. Select the regional language from System->Setting, then log out and log in.

3. $ fc-match :lang="" (with the language code as the value; should return some font)

4. From the ibus menu, select the input method for the language.

5. Open a web page in Firefox that has content in these languages.
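The locale and font checks can be partly scripted. The following is a minimal sketch, not an official test script: it takes the locale codes from the Documentation table on this page, derives the fontconfig language tag by stripping the territory (an assumption that holds for these codes, for example `brx_IN` gives `brx`), and then queries `locale -a` and `fc-match`. The function names `lang_tag` and `check_language` are made up for this example.

```shell
#!/bin/sh
# Sketch: check locale and font availability for the additional languages.
# Locale codes come from the Documentation table on this page.
LOCALES="brx_IN doi_IN kok_IN mni_IN ne_NP sa_IN sat_IN ur_IN"

# Derive a fontconfig language tag from a locale code,
# e.g. mni_IN@bengali -> mni (hypothetical helper, not part of any tool).
lang_tag() {
    printf '%s\n' "${1%%_*}"
}

check_language() {
    loc=$1
    tag=$(lang_tag "$loc")

    # Is the locale known to glibc on this system?
    if locale -a 2>/dev/null | grep -q "^${loc}"; then
        echo "$loc: locale present"
    else
        echo "$loc: locale missing"
    fi

    # Ask fontconfig which font it would pick for this language.
    if command -v fc-match >/dev/null 2>&1; then
        echo "$loc: font -> $(fc-match ":lang=${tag}")"
    else
        echo "$loc: fc-match not installed"
    fi
}

for loc in $LOCALES; do
    check_language "$loc"
done
```

Note that `fc-match` prints its best match even when no installed font covers the language, so the reported font still has to be compared by hand with the default font package expected for that language.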

User Experience

1. End users can log in to the system using their respective locales.

2. End users will get an input method and fonts for their respective language.

Dependencies

1. glibc (requires a commit by an upstream developer)

2. fontconfig (requires a commit by an upstream developer)

3. inscript2

Contingency Plan

If not all required components are ready for an additional language, we can ship with whatever is available.


Documentation


Language | Locale code                | Default font package                    | Input method                                    | fontconfig support
Bodo     | brx_IN                     | lohit-devanagari-fonts                  | brx-inscript2-deva.mim                          | no
Dogri    | doi_IN                     | lohit-devanagari-fonts                  | doi-inscript2-deva.mim                          | no
Konkani  | kok_IN                     | lohit-devanagari-fonts                  | kok-inscript2-deva.mim                          | yes
Manipuri | mni_IN, mni_IN@bengali     | lohit-bengali-fonts, tabish-eeyek-fonts | mni-inscript2-beng.mim, mni-inscript2-mtei.mim  | no
Nepali   | ne_NP                      | madan-fonts                             | ne-inscript2-deva.mim                           | yes
Sanskrit | sa_IN                      | lohit-devanagari-fonts                  | sa-inscript2.mim                                | yes
Santali  | sat_IN@devanagari, sat_IN  |                                         | sat-inscript2-deva.mim, sat-inscript2-olck.mim  | no
Urdu     | ur_IN                      | paktype-naqsh-fonts                     | ur-phonetic.mim                                 | yes
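As a rough illustration of how the table above could be used, the sketch below maps each locale code to the default font package(s) listed for it and asks `rpm -q` whether they are installed. The function names `font_packages` and `report_fonts` are invented for this example. No package is printed for Santali, because the table lists none.

```shell
#!/bin/sh
# Sketch: report whether the default font packages from the table are installed.

# Map a locale code to the font package(s) listed for it in the table.
# Any @modifier is ignored; nothing is printed when the table lists
# no package (as for sat_IN).
font_packages() {
    case "${1%%@*}" in
        brx_IN|doi_IN|kok_IN|sa_IN) echo "lohit-devanagari-fonts" ;;
        mni_IN) echo "lohit-bengali-fonts tabish-eeyek-fonts" ;;
        ne_NP) echo "madan-fonts" ;;
        ur_IN) echo "paktype-naqsh-fonts" ;;
        *) ;;
    esac
}

report_fonts() {
    loc=$1
    pkgs=$(font_packages "$loc")
    if [ -z "$pkgs" ]; then
        echo "$loc: no default font package listed"
        return 0
    fi
    for pkg in $pkgs; do
        # rpm -q exits non-zero when the package is not installed.
        if command -v rpm >/dev/null 2>&1 && rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$loc: $pkg installed"
        else
            echo "$loc: $pkg not installed (or rpm unavailable)"
        fi
    done
}

for loc in brx_IN doi_IN kok_IN mni_IN ne_NP sa_IN sat_IN ur_IN; do
    report_fonts "$loc"
done
```

Keeping the mapping in one `case` statement makes it easy to update if the default font for a language changes.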

Release Notes

  • Provides internationalization support (fonts, input method, locale) for the additional languages Bodo, Dogri, Konkani, Manipuri, Nepali, Sanskrit, Santali and Urdu.

Comments and Discussion