Valid and invalid code-page conversions

Converting data from one code page to another sometimes garbles the data. This section states rules for determining whether a particular code-page conversion garbles the data. In these rules, source code page means the code page you are converting from, target code page means the code page you are converting to, and a valid code-page conversion is one that does not garble the data. The rules are:

A code-page conversion is valid if each character that appears in the source code page appears in the target code page.

For example, it is valid to convert from the IBM850 code page (used for Latin-alphabet languages of western Europe and the Americas) to the ISO8859-1 code page (also used for Latin-alphabet languages of western Europe and the Americas) because each character that appears in IBM850 appears in ISO8859-1. By contrast, it is not valid to convert from the CP949 code page (used for Korean) to the KSC5601 code page (also used for Korean) because CP949 contains characters that KSC5601 does not.
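As an illustration only, the first rule can be sketched in Python for single-byte code pages. The function names missing_characters and conversion_is_valid are hypothetical, and the sketch relies on Python's built-in codec tables, which are an assumption here: they are not necessarily identical to the conversion tables the product uses, so results for a given pair of code pages can differ.

```python
def missing_characters(source: str, target: str) -> list[str]:
    """Return the characters of a single-byte source code page that the
    target code page cannot represent (hypothetical helper)."""
    missing = []
    for byte_value in range(256):
        try:
            char = bytes([byte_value]).decode(source)
        except UnicodeDecodeError:
            continue  # this byte is undefined in the source code page
        try:
            char.encode(target)
        except UnicodeEncodeError:
            missing.append(char)  # character has no slot in the target
    return missing


def conversion_is_valid(source: str, target: str) -> bool:
    """Rule 1: valid if every source character appears in the target."""
    return not missing_characters(source, target)


# Codec names below are Python's, chosen only for illustration.
print(conversion_is_valid("ascii", "iso8859-1"))             # True
print("€" in missing_characters("iso8859-15", "iso8859-1"))  # True
```

The sketch handles only single-byte code pages. A multi-byte code page such as CP949 would need the same test applied to every defined byte sequence rather than to the 256 single-byte values.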

A code-page conversion is valid if every character that appears in the data appears in the target code page.

Sometimes, although one or more characters in the source code page do not appear in the target code page, none of these troublesome characters appears in the data. For example, suppose that you want to convert data from one code page to another, and that the source and target code pages are identical except that the source code page contains the Euro currency symbol while the target code page does not. If the Euro currency symbol does not appear in the data, this code-page conversion is valid.
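The data-level rule can be sketched the same way in Python. The function name unconvertible_characters is hypothetical. ISO8859-15 and ISO8859-1 stand in for the source and target code pages only because, in Python's codec tables, the first contains the Euro currency symbol and the second does not. The two code pages differ in a few other positions as well, so they are not an exact match for the scenario described above.

```python
def unconvertible_characters(data: bytes, source: str, target: str) -> set[str]:
    """Return the characters that occur in the data but are absent from
    the target code page (hypothetical helper)."""
    text = data.decode(source)
    bad = set()
    for char in text:
        try:
            char.encode(target)
        except UnicodeEncodeError:
            bad.add(char)  # present in the data, absent from the target
    return bad


# Sample data, invented for illustration.
with_euro = "Price: 10 €".encode("iso8859-15")
without_euro = "Price: 10 EUR".encode("iso8859-15")

# The conversion is valid for a given piece of data when the set is empty.
print(unconvertible_characters(with_euro, "iso8859-15", "iso8859-1"))
print(unconvertible_characters(without_euro, "iso8859-15", "iso8859-1"))
```

An empty result means that the conversion is valid for that particular data, even though it is not valid for the source code page as a whole.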

To check a text file for the presence of a particular character, use the PROUTIL utility with the CONVCHAR CHARSCAN qualifier. For more information on this technique, see Using Databases.