Internationalizing Applications
Testing for a lead-byte value
The next technique involves testing a byte for a lead-byte value. Lead bytes (and trail bytes) often have special values to distinguish them. The following table lists the lead-byte and trail-byte values for the multi-byte code pages OpenEdge supports.
Table 15. Lead byte and trail byte values

| Code page | Language or standard | Lead-byte values | Trail-byte values |
| --- | --- | --- | --- |
| BIG-5 | Traditional Chinese | 161 through 254 | 64 through 126, 161 through 254 |
| CP949 | Korean | 129 through 254 | 65 through 90, 97 through 122, 129 through 254 |
| CP950 | Traditional Chinese | 129 through 254 | 64 through 126, 128 through 254 |
| CP936 | Simplified Chinese | 129 through 254 | 64 through 126, 128 through 254 |
| CP1361 | Korean | 132 through 211, 216 through 222, 224 through 249 | 65 through 127, 129 through 254 |
| EUCJIS | Japanese | 142, 164 through 254 | 161 through 254 |
| GB2312 | Simplified Chinese | 161 through 254 | 161 through 254 |
| GB18030 [1] | Extended Chinese | - | - |
| KSC5601 | Korean | 161 through 254 | 161 through 254 |
| SHIFT-JIS | Japanese | 129 through 159, 224 through 252 | 64 through 126, 128 through 252 |
| UTF-8 | Unicode | 193 through 239 | 128 through 191 |

[1] The GB18030 code page is a multi-byte code page, consisting of one-, two-, and four-byte characters, that extends the GB2312 code page and includes all characters defined in Unicode. Unlike most multi-byte code pages that OpenEdge supports, you cannot use the lead byte of multi-byte characters in the GB18030 code page to determine the character's length. OpenEdge uses the International Components for Unicode (ICU) library to convert characters between the GB18030 code page and Unicode within the OpenEdge GUI client.

You cannot always assume a byte with a lead-byte value is a lead byte, or a byte with a trail-byte value is a trail byte. This is because the possible values for trail bytes overlap those of lead bytes and single bytes. For example, the value 164 can correspond to a lead byte or a trail byte. To determine which it is, you must inspect the string.
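The overlap can be checked directly against the published ranges. The following is a minimal sketch, assuming the EUCJIS values from the table (lead bytes 142 and 164 through 254, trail bytes 161 through 254). The function names hasEucjisLeadValue and hasEucjisTrailValue are hypothetical helpers written for this illustration, not OpenEdge built-ins:

```abl
/* Hypothetical helper: does the byte value fall in the EUCJIS
   lead-byte ranges listed in the table (142, 164 through 254)? */
FUNCTION hasEucjisLeadValue RETURNS LOGICAL (INPUT piByte AS INTEGER):
    RETURN piByte = 142 OR (piByte >= 164 AND piByte <= 254).
END FUNCTION.

/* Hypothetical helper: does the byte value fall in the EUCJIS
   trail-byte range listed in the table (161 through 254)? */
FUNCTION hasEucjisTrailValue RETURNS LOGICAL (INPUT piByte AS INTEGER):
    RETURN piByte >= 161 AND piByte <= 254.
END FUNCTION.

/* 164 lies inside both ranges. */
DISPLAY hasEucjisLeadValue(164)  LABEL "Lead value"
        hasEucjisTrailValue(164) LABEL "Trail value"
        WITH 1 COLUMN.
```

Both checks succeed for 164, so the value by itself does not reveal the byte's role. Only its position relative to the preceding bytes in the string does.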
To determine if a byte has a lead-byte value, use the IS-LEAD-BYTE function, which evaluates a character expression and returns YES if the first byte of the first character of the character string has a value within the range permitted for lead bytes. Otherwise, IS-LEAD-BYTE returns NO. IS-LEAD-BYTE has the following syntax:
IS-LEAD-BYTE ( string )
string
A character expression (a constant, field name, variable name, or any combination of these) whose value is a character.
In the following example, IS-LEAD-BYTE examines a string whose first character is single byte. Because the value of that character's first byte is not within the range permitted for lead bytes, IS-LEAD-BYTE returns NO, and the example displays Lead: no:
DEFINE VARIABLE lLead AS LOGICAL NO-UNDO.
/* The first character, "x", is a single-byte character. */
lLead = IS-LEAD-BYTE("xy ").
/* The explicit label makes the output read "Lead: no". */
DISPLAY lLead LABEL "Lead" WITH 1 COLUMN.
The following example is identical to the preceding example except that the first character of the string is double byte. Because the value of that character's first byte falls within the range permitted for lead bytes, IS-LEAD-BYTE returns YES, and the example displays Lead: yes:
DEFINE VARIABLE lLead AS LOGICAL NO-UNDO.
/* The literal must begin with a double-byte character in the session's
   code page; the leading blank below stands in for that character. */
lLead = IS-LEAD-BYTE(" xy").
/* The explicit label makes the output read "Lead: yes". */
DISPLAY lLead LABEL "Lead" WITH 1 COLUMN.
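The same test can be applied to every character of a string, not only the first. The following is a rough sketch, assuming a session that runs with one of the multi-byte code pages from the table. It combines IS-LEAD-BYTE with the SUBSTRING and LENGTH functions; the variable names and the sample value "xy" are placeholders for illustration:

```abl
/* Sketch: for each character of a string, report its byte count and
   whether its first byte has a lead-byte value. */
DEFINE VARIABLE cText   AS CHARACTER NO-UNDO.
DEFINE VARIABLE cChar   AS CHARACTER NO-UNDO.
DEFINE VARIABLE cReport AS CHARACTER NO-UNDO.
DEFINE VARIABLE iPos    AS INTEGER   NO-UNDO.

/* Placeholder sample value; substitute text that contains
   multi-byte characters in your session's code page. */
cText = "xy".

DO iPos = 1 TO LENGTH(cText, "CHARACTER"):
    /* Extract one whole character, however many bytes it occupies. */
    cChar = SUBSTRING(cText, iPos, 1, "CHARACTER").
    cReport = cReport
            + "Character " + STRING(iPos) + ": "
            + STRING(LENGTH(cChar, "RAW")) + " byte(s), lead-byte value: "
            + STRING(IS-LEAD-BYTE(cChar)) + "~n".
END.

MESSAGE cReport VIEW-AS ALERT-BOX INFORMATION.
```

Because SUBSTRING is asked for character units, each extracted value starts on a character boundary, so IS-LEAD-BYTE never sees a trail byte in the first position. This sidesteps the overlap between lead-byte and trail-byte values described earlier.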