
Modifiers

By Cathy Wissink & Michael S. Kaplan - Windows Globalization, Microsoft Corporation

Modifiers are exactly what the name suggests: elements within a writing system that modify linguistic characters. They usually appear somewhere around the character they modify (e.g., on top, underneath, to the side). Some examples include:

The acute sign in Latin E acute (É);
The tonos sign in Greek epsilon tonos (έ);
The dagesh in Hebrew Pay (or Fay + dagesh) (פּ);
The nukta in Devanagari Qa (or Ka + nukta) (क़).
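The relationship between a base character and its modifier can be made visible with Unicode canonical decomposition. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `split_modifiers` is hypothetical and not part of any library. It assumes the modifier is encoded (or decomposes to) a combining mark, which holds for the four examples above but not for every modifier in every script.

```python
import unicodedata

def split_modifiers(text):
    """Split text into its base characters and the names of its combining marks."""
    # NFD separates precomposed characters into base + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    marks = [unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)]
    return base, marks

# The four examples: Latin E acute, Greek epsilon tonos,
# Hebrew Pay with dagesh, Devanagari Ka with nukta.
for sample in ["\u00C9", "\u03AD", "\u05E4\u05BC", "\u0915\u093C"]:
    print(sample, split_modifiers(sample))
```

Note that Unicode treats the Greek tonos as canonically equivalent to the combining acute accent, so the second example reports the same mark as the first.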

As you have perhaps ascertained by now, modifiers also affect ordering in languages. Often they carry a lesser weight than the base character, not unlike case. For example, a Latin-based language might sort its diacritics in the following order: acute < grave < diaeresis.

This language would, as a result, sort characters with diacritics in the following order:
á (Lower Case A Acute)
Á (Upper Case A Acute)
à (Lower Case A Grave)
À (Upper Case A Grave)
ä (Lower Case A Diaeresis)
Ä (Upper Case A Diaeresis)

Most of the time, this modifier information is considered more important than case but less important than the base character. The reason is simple: in most instances, people expect the "accented" versions of a letter to be sorted in the same way regardless of their case.
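One way to picture this ranking is as a three-level sort key: base letters are compared first, diacritics second, and case last. The sketch below is an illustration only, not how Windows or any particular collation library implements it. The `DIACRITIC_WEIGHT` table and the `sort_key` function are hypothetical names, and the weights simply encode the example order acute < grave < diaeresis.

```python
import unicodedata

# Hypothetical weights for the example order: acute < grave < diaeresis.
DIACRITIC_WEIGHT = {"\u0301": 1, "\u0300": 2, "\u0308": 3}

def sort_key(word):
    """Build a (base letters, diacritics, case) key; earlier levels win."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # A combining mark modifies the base letter that precedes it.
            if secondary:
                secondary[-1] = DIACRITIC_WEIGHT.get(ch, 99)
        else:
            primary.append(ch.lower())                 # level 1: base letter
            secondary.append(0)                        # level 2: no diacritic yet
            tertiary.append(1 if ch.isupper() else 0)  # level 3: lower before upper
    return (primary, secondary, tertiary)

letters = ["Ä", "à", "Á", "ä", "À", "á"]
print(sorted(letters, key=sort_key))
# ['á', 'Á', 'à', 'À', 'ä', 'Ä']

# The base letter outranks the diacritic: "ab" sorts before "áz".
print(sorted(["áz", "ab"], key=sort_key))
# ['ab', 'áz']
```

Because Python compares tuples element by element, the diacritic level is consulted only when the base letters are identical, and the case level only when the diacritics are identical too.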

Another way in which you can see the less distinct role of case is in the frequent use of case-insensitive searching and sorting, compared to modifier-insensitive ("diacritic-insensitive") searching and sorting. Two characters that differ only by case are considered by most users to be the "same" character, whereas that cannot always be said for two characters that differ only by modifier. (This does, of course, depend on the language.)
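The two kinds of insensitivity can be contrasted in a short sketch. The function names below are hypothetical, and the approach is a simplification: it uses Unicode case folding for case-insensitivity and drops all combining marks for modifier-insensitivity, whereas a real collation library would apply language-specific rules.

```python
import unicodedata

def strip_modifiers(text):
    """Remove combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def equal_ignoring_case(a, b):
    """Compare while ignoring case but keeping modifiers."""
    fold = lambda s: unicodedata.normalize("NFC", s).casefold()
    return fold(a) == fold(b)

def equal_ignoring_modifiers(a, b):
    """Compare while ignoring modifiers but keeping case."""
    return strip_modifiers(a) == strip_modifiers(b)

print(equal_ignoring_case("Résumé", "résumé"))       # True
print(equal_ignoring_modifiers("résumé", "resume"))  # True
print(equal_ignoring_case("résumé", "resume"))       # False
```

Whether the second comparison should ever be treated as a match is exactly the language-dependent question raised above; the sketch only shows that the two options are separate choices.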

Now that you have a basic idea of how some linguistic elements might impact collation, let's look at some language examples to put these concepts into practice.

©2003-2007 Microsoft Corporation. All rights reserved.