Collating multi-byte characters

Sorting multi-byte characters raises a question that sorting single-byte characters does not: in what order should the different types of characters sort? For example, should all one-byte characters sort before all two-byte characters, and all two-byte characters before all three-byte characters? And how should the user-defined characters of the SHIFT-JIS code page sort?

The default collation table that OpenEdge provides for the double-byte Asian languages (Chinese, Japanese, and Korean) sorts all single-byte characters before all double-byte characters. The following table lists the Japanese character types in the order in which OpenEdge sorts them.

Table 18. Japanese collation order by character type
Character type	Range of byte values (decimal)
Single byte (ASCII)	0-127
Single byte (half-width Katakana)	160-223
Lead byte (range 1)	129-159
Lead byte (range 2)	224-239
User-defined (Gaiji)	240-252
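
The ordering in the table above can be sketched in a few lines of Python. This is an illustration only, not how OpenEdge implements collation. The names type_rank, split_chars, and sort_key are hypothetical. The sketch assumes that bytes outside the listed ranges sort last, that values 240-252 act as lead bytes of two-byte characters, and that trail bytes compare by numeric value.

```python
# Illustrative sketch only: ranks SHIFT-JIS encoded strings by the
# character-type order in the table. Not OpenEdge's implementation.

# (low, high) first-byte ranges, listed in the table's collation order.
TYPE_ORDER = [
    (0, 127),    # single byte (ASCII)
    (160, 223),  # single byte (half-width Katakana)
    (129, 159),  # lead byte (range 1)
    (224, 239),  # lead byte (range 2)
    (240, 252),  # user-defined (Gaiji)
]


def type_rank(first_byte: int) -> int:
    """Return the position of a character type in the collation order."""
    for rank, (low, high) in enumerate(TYPE_ORDER):
        if low <= first_byte <= high:
            return rank
    return len(TYPE_ORDER)  # assumption: unlisted byte values sort last


def is_lead_byte(value: int) -> bool:
    # Assumption: the user-defined range 240-252 also starts a two-byte character.
    return 129 <= value <= 159 or 224 <= value <= 252


def split_chars(data: bytes) -> list:
    """Split SHIFT-JIS encoded bytes into one-byte and two-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        width = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        chars.append(data[i:i + width])
        i += width
    return chars


def sort_key(data: bytes) -> list:
    """Build a key: character type first, then raw byte values.

    Assumption: within one type, lead and trail bytes compare numerically.
    """
    return [(type_rank(ch[0]), ch) for ch in split_chars(data)]


words = ["あ", "A", "ｱ"]
ordered = sorted(words, key=lambda w: sort_key(w.encode("shift_jis")))
print(ordered)  # ['A', 'ｱ', 'あ']
```

A plain byte-value sort would place the double-byte character あ (lead byte 130) before the half-width Katakana character ｱ (byte 177). Ranking by character type first puts both single-byte characters ahead of the double-byte one, as the table describes.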

Note: You can modify the sort order of lead bytes, though not the sort order of trail bytes. For more information on modifying the sort order of lead bytes, see the comments in the BASIC collation table for the SHIFT-JIS code page in the japanese.dat file in the OpenEdge/prolang/convmap directory.

In this section:

Sort order of trail bytes