
Unicode:
Globalization solution for Indic script languages

Collation for any language has certain structural needs that cannot be met by an encoding, and Unicode as an encoding is no exception. However, it should also be clear that character encodings were not developed to work as acceptable sorting orders; encodings are for the sole purpose of providing a relationship between a linguistic character and a number.
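As a small illustration of the gap between encoding order and expected order, the sketch below sorts three Hindi words by raw code point. The word list is chosen only for this example. The second sort uses canonical decomposition as a stand-in key; that is an assumption made to keep the sketch short, not a full linguistic collation.

```python
import unicodedata

# Three Hindi words; the first begins with the precomposed letter U+0958 (qa).
words = ["\u0958\u0932\u092E", "\u0916\u0917", "\u0939\u092E"]  # क़लम, खग, हम

# Plain sorted() compares code points, so U+0958 lands after U+0939 (ha),
# far away from U+0915 (ka), the letter it is derived from.
by_code_point = sorted(words)

# Decomposing U+0958 into U+0915 + U+093C (ka + nukta) keeps the word
# next to the other ka-words. This is only a stand-in for real collation data.
by_decomposed = sorted(words, key=lambda w: unicodedata.normalize("NFD", w))

print(by_code_point)
print(by_decomposed)
```

The two lists differ even though the input is identical, which is the point: the numeric order of an encoding says nothing reliable about where a reader expects a word to appear.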
So if developers cannot count on encoding order to be a viable sorting order for Indic scripts and languages, how can they get culturally correct results? In addition, what else needs to be considered to properly globalize software for India (or any part of the world) if Unicode is only part of a globalization solution?
There are commercial products available today built on Unicode that have implemented culturally correct sorting for Indic languages. For example, Windows 2000 shipped with full language support for Hindi and Tamil, including linguistic collation. (Windows XP will ship with additional support for other Indic languages, including collation for Telugu, Kannada, Punjabi and Gujarati.)
The development team did not expect encoding order to be sufficient for the Indic languages, since it had not been sufficient for any of the other languages with linguistic collation support on Windows. It was therefore apparent that collation would need to be researched, and the results added to the data tables that support such functions as LCMapString (for sortkey generation) and CompareString (for sorting).
For this reason, part of the Windows International development team has been tasked with researching and implementing linguistic collation for all locales (cultural/regional combinations) supported in the product. Unicode is considered just an encoding, not saddled with any expectations for collation; collation is to be handled elsewhere (i.e., the above-named functions). As a result, it is possible to get linguistic collation (and other functionality) for supported Indic languages on any version of Windows 2000 or Windows XP, including the English version. Unicode was the underlying encoding used in both products, but collation was built in elsewhere.
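The division of labor described above, a data table plus one function that builds sort keys and another that compares strings, can be sketched in a few lines. Everything here is hypothetical: the `WEIGHTS` table, `sort_key` and `compare` are illustrative names, the four weights are hand-written for this example, and none of it reflects the actual Windows data tables or the signatures of LCMapString and CompareString.

```python
from functools import cmp_to_key

# Hypothetical weight table for four Devanagari letters: ka, qa, kha, ha.
# Real collation tables are researched per locale and are far larger.
WEIGHTS = {"\u0915": 1, "\u0958": 2, "\u0916": 3, "\u0939": 4}

def sort_key(text):
    """Build a sort key from the table (the role a sortkey function plays).

    Characters missing from the table sort after all known ones,
    ordered by code point.
    """
    return tuple(
        (0, WEIGHTS[ch]) if ch in WEIGHTS else (1, ord(ch)) for ch in text
    )

def compare(a, b):
    """Return -1, 0 or 1 (the role a comparison function plays)."""
    ka, kb = sort_key(a), sort_key(b)
    return (ka > kb) - (ka < kb)

letters = ["\u0939", "\u0958", "\u0916", "\u0915"]  # ha, qa, kha, ka
via_keys = sorted(letters, key=sort_key)
via_compare = sorted(letters, key=cmp_to_key(compare))
```

Both routes give the same order because they read the same table. The encoding of the letters never changes; only the table decides where each one sorts.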
What else needs to be considered for full Indic support in software, if one implements Unicode? A character encoding (specifically Unicode) cannot carry the burden of collation; it also cannot be responsible for all glyph representations of a character (as has also been proposed by certain implementers in the Indian development community).
In order to correctly support proper character display and layout of a language, input methods, font support and rendering engines must be properly leveraged or implemented. In addition, full NLS (National Language Support) data for formatting of time, dates, numbers, currencies and other locale elements needs to be considered.
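Number formatting is one concrete case of such locale data. In the digit grouping conventionally used in India, the last three digits form one group and the digits before them are grouped in pairs. The sketch below hard-codes that rule in a hypothetical helper named `group_indian`; a real product would read the grouping pattern from its locale data rather than from code.

```python
def group_indian(n: int) -> str:
    """Format an integer with Indian-style digit grouping (3, then 2s)."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits

    # The rightmost group always has three digits.
    head, tail = digits[:-3], digits[-3:]

    # Everything to the left is split into pairs, working right to left.
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)

    return sign + ",".join(groups + [tail])

print(group_indian(1234567))  # 12,34,567
```

The same value would be written 1,234,567 under the grouping used for US English, and nothing in the character encoding distinguishes the two; the difference lives entirely in the locale data.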
Again, it should be emphasized that if Unicode is treated as an encoding, without the burden of being a catchall for globalization (collation, glyph representation, input, etc.), it is the best encoding solution for software that runs worldwide. After having developed software for many international markets with Unicode, Microsoft as a vendor is even more convinced of the benefits of Unicode for both individual and worldwide markets, and is committed to continuing global development with Unicode into the future.
Software vendors are still in the early stages of developing software that is fully satisfactory for the Indian market on all levels of globalization. It will likely take a few more iterations of the products and ongoing refinement of the implementations before the Indic language development community is completely happy with the available products; vendors are working with this community to ensure that feedback from the community and from end users is integrated into their products. However, as the Windows development team has found over the last ten years, implementing Unicode makes development for the worldwide market (including the Indic languages) not more complicated, but considerably easier; Unicode is the future of worldwide software, including software for the Indic language market.

This site uses Unicode for non-English characters and uses Open Type fonts.
©2003-2007 Microsoft Corporation. All rights reserved.