Unicode overview

Unicode is an evolving standard that defines a single code page covering most symbols from most of the languages of the world: letters, ideograms, syllabics (such as the Japanese Kana symbols), punctuation, diacritics, mathematical symbols, technical symbols, and so on. It assigns each symbol a numeric value. Originally, this value was a number between zero and 65,535, the range of an unsigned 16-bit integer.
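To make the idea of "one numeric value per symbol" concrete, the following is a minimal sketch in Python (chosen for illustration only; it is not OpenEdge ABL). The sample characters are arbitrary choices, not taken from the standard's documentation. It prints each symbol's numeric value and checks that all of them fit in the original 16-bit range.

```python
import unicodedata

# Arbitrary sample symbols: a Latin letter, an accented letter,
# a Japanese Kana syllabic, and a mathematical symbol.
samples = ["A", "\u00e9", "\u30ab", "\u221e"]

for ch in samples:
    # ord() returns the numeric value Unicode assigns to the symbol.
    print(f"U+{ord(ch):04X}  {ord(ch):>6}  {unicodedata.name(ch)}")

code_points = [ord(ch) for ch in samples]

# Every sample here fits in an unsigned 16-bit integer (0 through 65,535).
fits_16_bit = all(cp <= 0xFFFF for cp in code_points)
```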

Unicode's original limit of 65,536 symbols proved too small, and the limit was extended to well over 1,000,000 symbols (the code space now spans 1,114,112 values, U+0000 through U+10FFFF). Several ways of encoding each symbol were defined, and the encodings were designed so that you can convert from one to another any number of times without losing any information. For more information on the algorithms for converting between encodings, see the Unicode Web site, http://www.unicode.org. OpenEdge supports Unicode's UTF-8 encoding. It also supports all varieties of UTF-16 and UTF-32 for input and output and for LONGCHARs and CLOBs.
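The lossless conversion between encodings can be illustrated with a short sketch. The example below uses Python rather than OpenEdge ABL, and the helper name round_trip and the sample string are hypothetical, introduced only for this illustration. It passes one string through UTF-8 and the byte-order variants of UTF-16 and UTF-32 in turn, then compares the result with the original. The sample includes U+1D11E, a symbol whose value lies beyond the original 16-bit range.

```python
# Sample text: "A", e with acute, Katakana KA, and U+1D11E
# (MUSICAL SYMBOL G CLEF), whose value exceeds 65,535.
text = "A\u00e9\u30ab\U0001D11E"

def round_trip(s, encodings):
    """Encode s in each encoding in turn, decoding back to text each time."""
    for name in encodings:
        s = s.encode(name).decode(name)
    return s

encodings = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]

# After passing through every encoding, the text should be unchanged.
result = round_trip(text, encodings)

# The same four symbols occupy different numbers of bytes in each encoding.
sizes = {name: len(text.encode(name)) for name in encodings}

print(result == text)
print(sizes)
```

The byte counts differ because UTF-8 uses one to four bytes per symbol, UTF-16 uses two or four, and UTF-32 always uses four, but the sequence of symbols recovered from each encoding is the same.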