Unicode Utility Library 3.0.2

Delphi 3, 4, 5, 6, and Kylix Implementation

Dieter Köhler

Special thanks to (in alphabetical order): Lucjan Łyczak, Micha Nelissen, Ernst van der Pols, and Karl Waclawek.

LICENSE

The contents of the Extended Document Object Model files are subject to the Mozilla Public License Version 1.1 (the "License"); you may not use these files except in compliance with the License. You may obtain a copy of the License at "http://www.mozilla.org/MPL/".

Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

The Original Code is "UnicodeUtils.pas".

The Initial Developer of the Original Code is Dieter Köhler (Heidelberg, Germany, "http://www.philo.de/"). Portions created by the Initial Developer are Copyright (C) 1999-2003 Dieter Köhler. All Rights Reserved.

Alternatively, the contents of these files may be used under the terms of the GNU General Public License Version 2 or later (the "GPL"), in which case the provisions of the GPL are applicable instead of those above. If you wish to allow use of your version of these files only under the terms of the GPL, and not to allow others to use your version of these files under the terms of the MPL, indicate your decision by deleting the provisions above and replacing them with the notice and other provisions required by the GPL. If you do not delete the provisions above, a recipient may use your version of these files under the terms of any one of the MPL or the GPL.

2003


Table of Contents

Introduction
Exception Classes
EConversionStream
General Classes and Constants
TdomEncodingType
TdomEncodingTypes
SINGLE_BYTE_ENCODINGS
MULTI_BYTE_ENCODINGS
TCharToUTF16ConvFunc
TUTF16ToCharConvFunc
Helper Functions for Encoding Type Detection
GetACPEncodingName
GetACPEncodingType
EncodingToStr
StrToEncoding
Specialized Streams
TConversionStream
TUTF16BEToUTF8Stream
TUTF16BEToSingleByteCharsetStream
Helper Functions for UTF-16 Surrogate Processing
Utf16HighSurrogate
Utf16LowSurrogate
Utf16SurrogateToInt
IsUtf16HighSurrogate
IsUtf16LowSurrogate
Helper Functions for Conversion Function Detection
GetCharToUTF16ConvFunc
GetUTF16ToCharConvFunc
Conversion Functions between UTF-8 and UTF-16BE
UTF8ToUTF16BEStr
UTF16BEToUTF8Str
General Character Conversion Functions to UTF-16BE
SingleByteEncodingToUTF16Char
Special Character Conversion Functions to UTF-16BE
US_ASCIIToUTF16Char
Iso8859_1ToUTF16Char
Iso8859_2ToUTF16Char
Iso8859_3ToUTF16Char
Iso8859_4ToUTF16Char
Iso8859_5ToUTF16Char
Iso8859_6ToUTF16Char
Iso8859_7ToUTF16Char
Iso8859_8ToUTF16Char
Iso8859_9ToUTF16Char
Iso8859_10ToUTF16Char
Iso8859_13ToUTF16Char
Iso8859_14ToUTF16Char
Iso8859_15ToUTF16Char
KOI8_RToUTF16Char
JIS_X0201ToUTF16Char
nextStepToUTF16Char
cp10000_MacRomanToUTF16Char
cp10006_MacGreekToUTF16Char
cp10007_MacCyrillicToUTF16Char
cp10029_MacLatin2ToUTF16Char
cp10079_MacIcelandicToUTF16Char
cp10081_MacTurkishToUTF16Char
cp037ToUTF16Char
cp424ToUTF16Char
cp437ToUTF16Char
cp437_DOSLatinUSToUTF16Char
cp500ToUTF16Char
cp737_DOSGreekToUTF16Char
cp775_DOSBaltRimToUTF16Char
cp850ToUTF16Char
cp850_DOSLatin1ToUTF16Char
cp852ToUTF16Char
cp852_DOSLatin2ToUTF16Char
cp855ToUTF16Char
cp855_DOSCyrillicToUTF16Char
cp856_Hebrew_PCToUTF16Char
cp857ToUTF16Char
cp857_DOSTurkishToUTF16Char
cp860ToUTF16Char
cp860_DOSPortugueseToUTF16Char
cp861ToUTF16Char
cp861_DOSIcelandicToUTF16Char
cp862ToUTF16Char
cp862_DOSHebrewToUTF16Char
cp863ToUTF16Char
cp863_DOSCanadaFToUTF16Char
cp864ToUTF16Char
cp864_DOSArabicToUTF16Char
cp865ToUTF16Char
cp865_DOSNordicToUTF16Char
cp866ToUTF16Char
cp866_DOSCyrillicRussianToUTF16Char
cp869ToUTF16Char
cp869_DOSGreek2ToUTF16Char
cp874ToUTF16Char
cp875ToUTF16Char
cp1006ToUTF16Char
cp1026ToUTF16Char
cp1250ToUTF16Char
cp1251ToUTF16Char
cp1252ToUTF16Char
cp1253ToUTF16Char
cp1254ToUTF16Char
cp1255ToUTF16Char
cp1256ToUTF16Char
cp1257ToUTF16Char
cp1258ToUTF16Char
String Conversion Functions to UTF-16BE
US_ASCIIToUTF16Str
Iso8859_1ToUTF16Str
Iso8859_2ToUTF16Str
Iso8859_3ToUTF16Str
Iso8859_4ToUTF16Str
Iso8859_5ToUTF16Str
Iso8859_6ToUTF16Str
Iso8859_7ToUTF16Str
Iso8859_8ToUTF16Str
Iso8859_9ToUTF16Str
Iso8859_10ToUTF16Str
Iso8859_13ToUTF16Str
Iso8859_14ToUTF16Str
Iso8859_15ToUTF16Str
KOI8_RToUTF16Str
JIS_X0201ToUTF16Str
nextStepToUTF16Str
cp10000_MacRomanToUTF16Str
cp10006_MacGreekToUTF16Str
cp10007_MacCyrillicToUTF16Str
cp10029_MacLatin2ToUTF16Str
cp10079_MacIcelandicToUTF16Str
cp10081_MacTurkishToUTF16Str
cp037ToUTF16Str
cp424ToUTF16Str
cp437ToUTF16Str
cp437_DOSLatinUSToUTF16Str
cp500ToUTF16Str
cp737_DOSGreekToUTF16Str
cp775_DOSBaltRimToUTF16Str
cp850ToUTF16Str
cp850_DOSLatin1ToUTF16Str
cp852ToUTF16Str
cp852_DOSLatin2ToUTF16Str
cp855ToUTF16Str
cp855_DOSCyrillicToUTF16Str
cp856_Hebrew_PCToUTF16Str
cp857ToUTF16Str
cp857_DOSTurkishToUTF16Str
cp860ToUTF16Str
cp860_DOSPortugueseToUTF16Str
cp861ToUTF16Str
cp861_DOSIcelandicToUTF16Str
cp862ToUTF16Str
cp862_DOSHebrewToUTF16Str
cp863ToUTF16Str
cp863_DOSCanadaFToUTF16Str
cp864ToUTF16Str
cp864_DOSArabicToUTF16Str
cp865ToUTF16Str
cp865_DOSNordicToUTF16Str
cp866ToUTF16Str
cp866_DOSCyrillicRussianToUTF16Str
cp869ToUTF16Str
cp869_DOSGreek2ToUTF16Str
cp874ToUTF16Str
cp875ToUTF16Str
cp1006ToUTF16Str
cp1026ToUTF16Str
cp1250ToUTF16Str
cp1251ToUTF16Str
cp1252ToUTF16Str
cp1253ToUTF16Str
cp1254ToUTF16Str
cp1255ToUTF16Str
cp1256ToUTF16Str
cp1257ToUTF16Str
cp1258ToUTF16Str
Character Conversion Functions from UTF-16
UTF16ToUS_ASCIIChar
UTF16ToIso8859_1Char
UTF16ToIso8859_2Char
UTF16ToIso8859_3Char
UTF16ToIso8859_4Char
UTF16ToIso8859_5Char
UTF16ToIso8859_6Char
UTF16ToIso8859_7Char
UTF16ToIso8859_8Char
UTF16ToIso8859_9Char
UTF16ToIso8859_10Char
UTF16ToIso8859_13Char
UTF16ToIso8859_14Char
UTF16ToIso8859_15Char
UTF16ToKOI8_RChar
UTF16ToJIS_X0201Char
UTF16ToNextStepChar
UTF16ToCp10000_MacRomanChar
UTF16ToCp10006_MacGreekChar
UTF16ToCp10007_MacCyrillicChar
UTF16ToCp10029_MacLatin2Char
UTF16ToCp10079_MacIcelandicChar
UTF16ToCp10081_MacTurkishChar
UTF16ToCp037Char
UTF16ToCp424Char
UTF16ToCp437Char
UTF16ToCp437_DOSLatinUSChar
UTF16ToCp500Char
UTF16ToCp737_DOSGreekChar
UTF16ToCp775_DOSBaltRimChar
UTF16ToCp850Char
UTF16ToCp850_DOSLatin1Char
UTF16ToCp852Char
UTF16ToCp852_DOSLatin2Char
UTF16ToCp855Char
UTF16ToCp855_DOSCyrillicChar
UTF16ToCp856_Hebrew_PCChar
UTF16ToCp857Char
UTF16ToCp857_DOSTurkishChar
UTF16ToCp860Char
UTF16ToCp860_DOSPortugueseChar
UTF16ToCp861Char
UTF16ToCp861_DOSIcelandicChar
UTF16ToCp862Char
UTF16ToCp862_DOSHebrewChar
UTF16ToCp863Char
UTF16ToCp863_DOSCanadaFChar
UTF16ToCp864Char
UTF16ToCp864_DOSArabicChar
UTF16ToCp865Char
UTF16ToCp865_DOSNordicChar
UTF16ToCp866Char
UTF16ToCp866_DOSCyrillicRussianChar
UTF16ToCp869Char
UTF16ToCp869_DOSGreek2Char
UTF16ToCp874Char
UTF16ToCp875Char
UTF16ToCp1006Char
UTF16ToCp1026Char
UTF16ToCp1250Char
UTF16ToCp1251Char
UTF16ToCp1252Char
UTF16ToCp1253Char
UTF16ToCp1254Char
UTF16ToCp1255Char
UTF16ToCp1256Char
UTF16ToCp1257Char
UTF16ToCp1258Char
String Conversion Functions from UTF-16BE
UTF16ToUS_ASCIIStr
UTF16ToIso8859_1Str
UTF16ToIso8859_2Str
UTF16ToIso8859_3Str
UTF16ToIso8859_4Str
UTF16ToIso8859_5Str
UTF16ToIso8859_6Str
UTF16ToIso8859_7Str
UTF16ToIso8859_8Str
UTF16ToIso8859_9Str
UTF16ToIso8859_10Str
UTF16ToIso8859_13Str
UTF16ToIso8859_14Str
UTF16ToIso8859_15Str
UTF16ToKOI8_RStr
UTF16ToJIS_X0201Str
UTF16ToNextStepStr
UTF16ToCp10000_MacRomanStr
UTF16ToCp10006_MacGreekStr
UTF16ToCp10007_MacCyrillicStr
UTF16ToCp10029_MacLatin2Str
UTF16ToCp10079_MacIcelandicStr
UTF16ToCp10081_MacTurkishStr
UTF16ToCp037Str
UTF16ToCp424Str
UTF16ToCp437Str
UTF16ToCp437_DOSLatinUSStr
UTF16ToCp500Str
UTF16ToCp737_DOSGreekStr
UTF16ToCp775_DOSBaltRimStr
UTF16ToCp850Str
UTF16ToCp850_DOSLatin1Str
UTF16ToCp852Str
UTF16ToCp852_DOSLatin2Str
UTF16ToCp855Str
UTF16ToCp855_DOSCyrillicStr
UTF16ToCp856_Hebrew_PCStr
UTF16ToCp857Str
UTF16ToCp857_DOSTurkishStr
UTF16ToCp860Str
UTF16ToCp860_DOSPortugueseStr
UTF16ToCp861Str
UTF16ToCp861_DOSIcelandicStr
UTF16ToCp862Str
UTF16ToCp862_DOSHebrewStr
UTF16ToCp863Str
UTF16ToCp863_DOSCanadaFStr
UTF16ToCp864Str
UTF16ToCp864_DOSArabicStr
UTF16ToCp865Str
UTF16ToCp865_DOSNordicStr
UTF16ToCp866Str
UTF16ToCp866_DOSCyrillicRussianStr
UTF16ToCp869Str
UTF16ToCp869_DOSGreek2Str
UTF16ToCp874Str
UTF16ToCp875Str
UTF16ToCp1006Str
UTF16ToCp1026Str
UTF16ToCp1250Str
UTF16ToCp1251Str
UTF16ToCp1252Str
UTF16ToCp1253Str
UTF16ToCp1254Str
UTF16ToCp1255Str
UTF16ToCp1256Str
UTF16ToCp1257Str
UTF16ToCp1258Str
CSMIB classes
CSMIB Exception Classes
CSMIB Event Classes
The TCSMIB Class
References

Introduction

The Unicode Utility Library (UUL) contains several classes and helper functions to support processing and conversion of Unicode character data. Unicode is a character encoding standard that covers all major scripts of the world. For more information on Unicode and its related standards see the resources mentioned in the References section. The conversion functions are based on the mapping tables which can be found on the CD-ROM accompanying [Unicode 3.0].

Also included is a TCSMIB component for easy access to the Management Information Base (MIB) for character set encodings as specified in [CSMIB]. (This specification is occasionally used in the documentation below even if not explicitly quoted.)

The UUL was built and tested using Delphi 7. It was not tested with any other version of Delphi, Kylix or C++ Builder. Nevertheless, it should also run with Delphi 3, 4, 5 and 6, Kylix 1, 2 and 3, and compatible C++ Builder versions. To use the UUL in a Delphi unit just include a reference to UnicodeConv in its uses clause and make sure that the location of the file UnicodeConv.pas is included in the library path list of your Delphi IDE. To use the TCSMIB component at design time add it to an existing or newly created package via the "Component --> Install Component ..." menu item of the Delphi IDE. If not already available, a new "XML" palette page will appear containing the TCSMIB component.

The UUL is under continuous development. The latest version of this software can be obtained from the OpenXML website at "http://www.philo.de/xml/". The preferred way to contact the author is via the OpenXML mailing list. Instructions on how to join the mailing list can be found at "http://www.philo.de/xml/" as well.

Exception Classes

EConversionStream

EConversionStream = class(EStreamError);

This is the base class of all TConversionStream exceptions.

General Classes and Constants

TdomEncodingType

  TdomEncodingType = (etUnknown, etUTF_8, etUTF_16BE, etUTF_16LE,
                      etISO_10646_UCS_2, etUS_ASCII,
                      etIso_8859_1, etIso_8859_2, etIso_8859_3, etIso_8859_4,
                      etIso_8859_5, etIso_8859_6, etIso_8859_7, etIso_8859_8,
                      etIso_8859_9, etIso_8859_10, etIso_8859_13, etIso_8859_14,
                      etIso_8859_15, etKOI8_R, etJIS_X0201, etNextStep,
                      etCp10000_MacRoman, etCp10006_MacGreek,
                      etCp10007_MacCyrillic, etCp10029_MacLatin2,
                      etCp10079_MacIcelandic, etCp10081_MacTurkish,
                      etIBM037, etIBM424, etIBM437,
                      etDOS_437, etIBM500, etDOS_737, etDOS_775, etIBM850,
                      etDOS_850, etIBM852, etDOS_852, etIBM855, etDOS_855,
                      etPC_856, etIBM857, etDOS_857, etIBM860, etDOS_860,
                      etIBM861, etDOS_861, etIBM862, etDOS_862, etIBM863,
                      etDOS_863, etIBM864, etDOS_864, etIBM865, etDOS_865,
                      etIBM866, etDOS_866, etIBM869, etDOS_869, etCp874,
                      etCp875, etCp1006,
                      etIBM1026, etWindows_1250, etWindows_1251, etWindows_1252,
                      etWindows_1253, etWindows_1254, etWindows_1255,
                      etWindows_1256, etWindows_1257, etWindows_1258);

Constants for all supported encoding schemata plus an etUnknown constant.

TdomEncodingTypes

TdomEncodingTypes = set of TdomEncodingType;

Defines a set of TdomEncodingType constants.

SINGLE_BYTE_ENCODINGS

SINGLE_BYTE_ENCODINGS: TdomEncodingTypes =
                     [etUS_ASCII, etIso_8859_1, etIso_8859_2, etIso_8859_3,
                      etIso_8859_4, etIso_8859_5, etIso_8859_6, etIso_8859_7,
                      etIso_8859_8, etIso_8859_9, etIso_8859_10, etIso_8859_13,
                      etIso_8859_14, etIso_8859_15, etKOI8_R, etJIS_X0201,
                      etNextStep, etCp10000_MacRoman, etCp10006_MacGreek,
                      etCp10007_MacCyrillic, etCp10029_MacLatin2,
                      etCp10079_MacIcelandic, etCp10081_MacTurkish, etIBM037,
                      etIBM424, etIBM437, etDOS_437, etIBM500, etDOS_737,
                      etDOS_775, etIBM850, etDOS_850, etIBM852, etDOS_852,
                      etIBM855, etDOS_855, etPC_856, etIBM857, etDOS_857,
                      etIBM860, etDOS_860, etIBM861, etDOS_861, etIBM862,
                      etDOS_862, etIBM863, etDOS_863, etIBM864, etDOS_864,
                      etIBM865, etDOS_865, etIBM866, etDOS_866, etIBM869,
                      etDOS_869, etCp874, etCp875, 
                      etCp1006, etIBM1026, etWindows_1250,
                      etWindows_1251, etWindows_1252, etWindows_1253,
                      etWindows_1254, etWindows_1255, etWindows_1256,
                      etWindows_1257, etWindows_1258];

Defines a constant set of TdomEncodingType constants for all supported single byte encodings.

MULTI_BYTE_ENCODINGS

MULTI_BYTE_ENCODINGS: TdomEncodingTypes =
                     [etUTF_8, etUTF_16BE, etUTF_16LE, etISO_10646_UCS_2];

Defines a constant set of TdomEncodingType constants for all supported multi byte encodings.

TCharToUTF16ConvFunc

TCharToUTF16ConvFunc = function(const W: word): WideChar;

Procedural type for conversion functions of single byte characters into UTF-16BE.

TUTF16ToCharConvFunc

TUTF16ToCharConvFunc = function(const I: longint): Char;

Procedural type for conversion functions of UTF-16BE characters into single byte encodings.

Helper Functions for Encoding Type Detection

GetACPEncodingName

function GetACPEncodingName: String;

Returns the name of the current active code page of the Windows operating system. This function is not available in Kylix.

Return Value

The name of the current active code page.

GetACPEncodingType

function GetACPEncodingType: TdomEncodingType;

Returns the encoding type of the current active code page of the Windows operating system. This function is not available in Kylix.

Return Value

The encoding type of the current active code page.

Exceptions

EConvertError

Raised if the encoding scheme is not supported, which is the case for cp932, cp936, cp949, and cp950.

EncodingToStr

function EncodingToStr(const Encoding: TdomEncodingType): String;

Returns the standard name of the specified character encoding.

Parameters

Encoding

The encoding type of a character encoding.

Return Value

The standard name of the specified character encoding.

Exceptions

EConvertError

Raised if the specified encoding type is etUnknown.

StrToEncoding

function StrToEncoding(const S: String): TdomEncodingType;

Converts the name of a character set into a TdomEncodingType.

Parameters

S

The name of a character encoding.

Return Value

The equivalent of 'S' as a TdomEncodingType, or 'etUnknown' if no equivalent was found.

Specialized Streams

TConversionStream

TConversionStream = class (TStream)

TConversionStream is an input/output stream for other streams. Its purpose is to transform data as they are written to or read from a target stream.

TUTF16BEToUTF8Stream

TUTF16BEToUTF8Stream = class (TConversionStream)

TUTF16BEToUTF8Stream is a descendant of TConversionStream that converts a UTF-16BE stream into a UTF-8 encoded stream.

TUTF16BEToSingleByteCharsetStream

TUTF16BEToSingleByteCharsetStream = class (TConversionStream)

TUTF16BEToSingleByteCharsetStream is a descendant of TConversionStream that converts a UTF-16BE stream into a single-byte encoded stream.

Helper Functions for UTF-16 Surrogate Processing

The following functions support UTF-16 surrogate processing:

Utf16HighSurrogate

function Utf16HighSurrogate(const value: integer): WideChar;

Computes the high surrogate for a number in the interval [$10000;$10FFFF].

Parameters

value

The number from which the high surrogate is to be extracted.

Return Value

The high surrogate as a WideChar.

Exceptions

EConvertError

This Delphi exception is raised if the specified value is not contained in the interval [$10000;$10FFFF].

Utf16LowSurrogate

function Utf16LowSurrogate(const value: integer): WideChar;

Computes the low surrogate for a number in the interval [$10000;$10FFFF].

Parameters

value

The number from which the low surrogate is to be extracted.

Return Value

The low surrogate as a WideChar.

Exceptions

EConvertError

This Delphi exception is raised if the specified value is not contained in the interval [$10000;$10FFFF].
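
The split performed by Utf16HighSurrogate and Utf16LowSurrogate can be illustrated with the standard UTF-16 arithmetic. The following is only a minimal sketch under the assumption that the library follows that arithmetic; the names SketchCheckRange, SketchHighSurrogate and SketchLowSurrogate are hypothetical and are not part of the UUL.

```pascal
program SurrogateSplitSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: raises EConvertError outside [$10000;$10FFFF].
procedure SketchCheckRange(const Value: Integer);
begin
  if (Value < $10000) or (Value > $10FFFF) then
    raise EConvertError.CreateFmt('%d is outside [$10000;$10FFFF]', [Value]);
end;

// Hypothetical stand-in for Utf16HighSurrogate.
function SketchHighSurrogate(const Value: Integer): WideChar;
begin
  SketchCheckRange(Value);
  // Upper 10 bits of the 20-bit offset, added to the base $D800.
  Result := WideChar($D800 + ((Value - $10000) shr 10));
end;

// Hypothetical stand-in for Utf16LowSurrogate.
function SketchLowSurrogate(const Value: Integer): WideChar;
begin
  SketchCheckRange(Value);
  // Lower 10 bits of the offset, added to the base $DC00.
  Result := WideChar($DC00 + ((Value - $10000) and $3FF));
end;

begin
  // The number $1F600 splits into the pair D83D / DE00.
  WriteLn(IntToHex(Ord(SketchHighSurrogate($1F600)), 4)); // D83D
  WriteLn(IntToHex(Ord(SketchLowSurrogate($1F600)), 4));  // DE00
end.
```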

Utf16SurrogateToInt

function Utf16SurrogateToInt(const highSurrogate, lowSurrogate: WideChar): integer;

Transforms a high surrogate plus a low surrogate into an integer.

Parameters

highSurrogate

The high surrogate part of the integer.

lowSurrogate

The low surrogate part of the integer.

Return Value

The integer value of the high surrogate plus the low surrogate.

Exceptions

EConvertError

This Delphi exception is raised if the ordinal value of the highSurrogate is not contained in the interval [$D800;$DBFF] or if the ordinal value of the lowSurrogate is not contained in the interval [$DC00;$DFFF].
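
Recombining a surrogate pair is the inverse of the split. The sketch below shows the standard UTF-16 formula together with the range checks described above. SketchSurrogateToInt is a hypothetical name chosen for illustration and should not be read as the library's implementation.

```pascal
program SurrogateJoinSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical stand-in for Utf16SurrogateToInt.
function SketchSurrogateToInt(const HighSurrogate, LowSurrogate: WideChar): Integer;
begin
  if (Ord(HighSurrogate) < $D800) or (Ord(HighSurrogate) > $DBFF) then
    raise EConvertError.Create('High surrogate outside [$D800;$DBFF]');
  if (Ord(LowSurrogate) < $DC00) or (Ord(LowSurrogate) > $DFFF) then
    raise EConvertError.Create('Low surrogate outside [$DC00;$DFFF]');
  // Each surrogate carries 10 bits of the offset from $10000.
  Result := $10000 + ((Ord(HighSurrogate) - $D800) shl 10)
                   + (Ord(LowSurrogate) - $DC00);
end;

begin
  // The pair D83D / DE00 yields $1F600.
  WriteLn(IntToHex(SketchSurrogateToInt(WideChar($D83D), WideChar($DE00)), 5)); // 1F600
end.
```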

IsUtf16HighSurrogate

function IsUtf16HighSurrogate(const S: WideChar): boolean;

Tests whether the specified WideChar is a UTF-16 high surrogate.

Parameters

S

The WideChar to test.

Return Value

'True' if the specified WideChar is a UTF-16 high surrogate, otherwise 'False'.

IsUtf16LowSurrogate

function IsUtf16LowSurrogate(const S: WideChar): boolean;

Tests whether the specified WideChar is a UTF-16 low surrogate.

Parameters

S

The WideChar to test.

Return Value

'True' if the specified WideChar is a UTF-16 low surrogate, otherwise 'False'.
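
A typical use of the two test functions is walking a WideString by code point instead of by WideChar. The following sketch assumes the surrogate ranges [$D800;$DBFF] and [$DC00;$DFFF] given for Utf16SurrogateToInt. All Sketch... names are hypothetical stand-ins and not part of the UUL.

```pascal
program SurrogateScanSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical stand-in for IsUtf16HighSurrogate.
function SketchIsHighSurrogate(const S: WideChar): Boolean;
begin
  Result := (Ord(S) >= $D800) and (Ord(S) <= $DBFF);
end;

// Hypothetical stand-in for IsUtf16LowSurrogate.
function SketchIsLowSurrogate(const S: WideChar): Boolean;
begin
  Result := (Ord(S) >= $DC00) and (Ord(S) <= $DFFF);
end;

// Counts code points; a high surrogate followed by a low one counts once.
function SketchCodePointCount(const WS: WideString): Integer;
var
  I: Integer;
begin
  Result := 0;
  I := 1;
  while I <= Length(WS) do
  begin
    if SketchIsHighSurrogate(WS[I]) and (I < Length(WS))
       and SketchIsLowSurrogate(WS[I + 1]) then
      Inc(I, 2)
    else
      Inc(I);
    Inc(Result);
  end;
end;

var
  WS: WideString;
begin
  SetLength(WS, 3);
  WS[1] := WideChar($0041); // 'A'
  WS[2] := WideChar($D83D); // high surrogate
  WS[3] := WideChar($DE00); // low surrogate
  WriteLn(SketchCodePointCount(WS)); // 2
end.
```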

Helper Functions for Conversion Function Detection

GetCharToUTF16ConvFunc

function GetCharToUTF16ConvFunc(Encoding: TdomEncodingType): TCharToUTF16ConvFunc;

Returns the character conversion function for the specified TdomEncodingType into UTF-16BE.

Parameters

Encoding

The source encoding of the conversion function to find.

Return Value

The character conversion function for the specified TdomEncodingType into UTF-16BE, or nil if none was found.

GetUTF16ToCharConvFunc

function GetUTF16ToCharConvFunc(Encoding: TdomEncodingType): TUTF16ToCharConvFunc;

Returns the character conversion function for UTF-16BE into the specified TdomEncodingType.

Parameters

Encoding

The target encoding of the conversion function to find.

Return Value

The character conversion function for UTF-16BE into the specified TdomEncodingType, or nil if none was found.
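
Because both lookup functions may return nil, callers should test the result with Assigned before calling it. The sketch below illustrates that pattern with a reduced, hypothetical enumeration and two hypothetical converters. TSketchEncoding, TSketchCharToUTF16Func and the Sketch... functions only mimic the shape of TdomEncodingType, TCharToUTF16ConvFunc and GetCharToUTF16ConvFunc. They are assumptions made for illustration.

```pascal
program ConvFuncLookupSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Reduced, hypothetical stand-ins for the library's types.
  TSketchEncoding = (seUnknown, seUS_ASCII, seIso_8859_1);
  TSketchCharToUTF16Func = function(const W: Word): WideChar;

function SketchAsciiToUTF16Char(const W: Word): WideChar;
begin
  if W > $7F then
    raise EConvertError.CreateFmt('Invalid US-ASCII code point: %d', [W]);
  Result := WideChar(W);
end;

function SketchLatin1ToUTF16Char(const W: Word): WideChar;
begin
  if W > $FF then
    raise EConvertError.CreateFmt('Invalid ISO-8859-1 code point: %d', [W]);
  // ISO-8859-1 maps one-to-one onto U+0000..U+00FF.
  Result := WideChar(W);
end;

// Returns the converter for the encoding, or nil if none is known.
function SketchGetConvFunc(Encoding: TSketchEncoding): TSketchCharToUTF16Func;
begin
  case Encoding of
    seUS_ASCII:   Result := SketchAsciiToUTF16Char;
    seIso_8859_1: Result := SketchLatin1ToUTF16Char;
  else
    Result := nil;
  end;
end;

var
  Conv: TSketchCharToUTF16Func;
begin
  Conv := SketchGetConvFunc(seIso_8859_1);
  if Assigned(Conv) then
    WriteLn(IntToHex(Ord(Conv($E9)), 4)); // 00E9
  Conv := SketchGetConvFunc(seUnknown);
  if not Assigned(Conv) then
    WriteLn('no conversion function found');
end.
```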

Conversion Functions between UTF-8 and UTF-16BE

UTF8ToUTF16BEStr

function UTF8ToUTF16BEStr(const S: string): WideString;

Converts a UTF-8 string into a UTF-16BE WideString. No special conversions (e.g. on line breaks) and no XML-Char checking are done. If 'S' starts with a byte order mark (#$EF #$BB #$BF), the byte order mark is skipped.

Parameters

S

The UTF-8 string to be converted.

Return Value

The content of 'S' as a UTF-16BE WideString starting with a byte order mark.

Exceptions

EConvertError

This Delphi exception is raised if 'S' contains an invalid UTF-8 sequence.
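
The decoding steps can be illustrated with a simplified, self-contained decoder. This is a sketch, not the library's code. SketchUTF8ToUTF16 is a hypothetical name; it skips a leading byte order mark and checks continuation bytes, but it does not reject overlong forms and, unlike the documented return value, does not prepend a byte order mark to its result. AnsiString is used explicitly so that the input is treated as a sequence of bytes.

```pascal
program Utf8DecodeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical, simplified UTF-8 to UTF-16 decoder.
function SketchUTF8ToUTF16(const S: AnsiString): WideString;
var
  I, Extra, CP: Integer;
  B: Byte;
begin
  Result := '';
  I := 1;
  // Skip the UTF-8 byte order mark ($EF $BB $BF) if present.
  if (Length(S) >= 3) and (S[1] = AnsiChar($EF)) and
     (S[2] = AnsiChar($BB)) and (S[3] = AnsiChar($BF)) then
    I := 4;
  while I <= Length(S) do
  begin
    B := Ord(S[I]);
    // The lead byte determines how many continuation bytes follow.
    if B < $80 then begin CP := B; Extra := 0; end
    else if (B and $E0) = $C0 then begin CP := B and $1F; Extra := 1; end
    else if (B and $F0) = $E0 then begin CP := B and $0F; Extra := 2; end
    else if (B and $F8) = $F0 then begin CP := B and $07; Extra := 3; end
    else raise EConvertError.Create('Invalid UTF-8 lead byte');
    Inc(I);
    while Extra > 0 do
    begin
      if (I > Length(S)) or ((Ord(S[I]) and $C0) <> $80) then
        raise EConvertError.Create('Invalid UTF-8 continuation byte');
      CP := (CP shl 6) or (Ord(S[I]) and $3F);
      Inc(I);
      Dec(Extra);
    end;
    if CP > $10FFFF then
      raise EConvertError.Create('Code point above $10FFFF');
    if CP >= $10000 then
      // Values above $FFFF are written as a surrogate pair.
      Result := Result + WideChar($D800 + ((CP - $10000) shr 10))
                       + WideChar($DC00 + ((CP - $10000) and $3FF))
    else
      Result := Result + WideChar(CP);
  end;
end;

const
  // Byte order mark, 'A', U+00E9, U+1F600.
  Utf8Bytes: array[1..10] of Byte =
    ($EF, $BB, $BF, $41, $C3, $A9, $F0, $9F, $98, $80);
var
  S: AnsiString;
  WS: WideString;
  K: Integer;
begin
  SetLength(S, High(Utf8Bytes));
  for K := 1 to High(Utf8Bytes) do
    S[K] := AnsiChar(Utf8Bytes[K]);
  WS := SketchUTF8ToUTF16(S);
  for K := 1 to Length(WS) do
    Write(IntToHex(Ord(WS[K]), 4), ' '); // 0041 00E9 D83D DE00
  WriteLn;
end.
```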

UTF16BEToUTF8Str

function UTF16BEToUTF8Str(const WS: WideString): string;

Converts a UTF-16BE WideString into a UTF-8 encoded string. The implementation is optimized for text that consists mainly of ASCII characters (<= #$7F) with few non-ASCII characters. The result buffer is initially set to the length of the WideString and is expanded (by the Insert function) for each non-ASCII character, which leads to performance problems when processing, for example, mainly Japanese documents. If 'WS' starts with a byte order mark (#$FEFF), the byte order mark is skipped.

Parameters

WS

The UTF-16BE WideString to be converted.

Return Value

The content of 'WS' as a UTF-8 string.

Exceptions

EConvertError

This Delphi exception is raised if 'WS' contains an invalid UTF-16BE sequence.
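
For text with many non-ASCII characters, the repeated buffer expansion described above can be avoided by computing the output size first. The following sketch shows such a two-pass approach. It is an alternative technique given for illustration and not the implementation used by UTF16BEToUTF8Str; the names SketchUtf8Size, SketchNextCodePoint and SketchUTF16ToUTF8 are hypothetical.

```pascal
program Utf8EncodeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Number of UTF-8 bytes needed for one code point.
function SketchUtf8Size(const CP: Integer): Integer;
begin
  if CP < $80 then Result := 1
  else if CP < $800 then Result := 2
  else if CP < $10000 then Result := 3
  else Result := 4;
end;

// Reads the code point starting at WS[I] and advances I past it.
function SketchNextCodePoint(const WS: WideString; var I: Integer): Integer;
var
  First, Second: Integer;
begin
  First := Ord(WS[I]);
  Inc(I);
  Result := First;
  if (First >= $DC00) and (First <= $DFFF) then
    raise EConvertError.Create('Unpaired low surrogate');
  if (First >= $D800) and (First <= $DBFF) then
  begin
    if I > Length(WS) then
      raise EConvertError.Create('Unpaired high surrogate');
    Second := Ord(WS[I]);
    if (Second < $DC00) or (Second > $DFFF) then
      raise EConvertError.Create('Unpaired high surrogate');
    Inc(I);
    Result := $10000 + ((First - $D800) shl 10) + (Second - $DC00);
  end;
end;

function SketchUTF16ToUTF8(const WS: WideString): AnsiString;
var
  Start, I, P, CP, Total: Integer;
begin
  // Skip a leading byte order mark ($FEFF).
  Start := 1;
  if (Length(WS) > 0) and (Ord(WS[1]) = $FEFF) then
    Start := 2;
  // Pass 1: add up the exact output size so the buffer is allocated once.
  Total := 0;
  I := Start;
  while I <= Length(WS) do
    Inc(Total, SketchUtf8Size(SketchNextCodePoint(WS, I)));
  SetLength(Result, Total);
  // Pass 2: write the bytes into the pre-sized buffer.
  I := Start;
  P := 1;
  while I <= Length(WS) do
  begin
    CP := SketchNextCodePoint(WS, I);
    case SketchUtf8Size(CP) of
      1: Result[P] := AnsiChar(CP);
      2: begin
           Result[P] := AnsiChar($C0 or (CP shr 6));
           Result[P + 1] := AnsiChar($80 or (CP and $3F));
         end;
      3: begin
           Result[P] := AnsiChar($E0 or (CP shr 12));
           Result[P + 1] := AnsiChar($80 or ((CP shr 6) and $3F));
           Result[P + 2] := AnsiChar($80 or (CP and $3F));
         end;
      4: begin
           Result[P] := AnsiChar($F0 or (CP shr 18));
           Result[P + 1] := AnsiChar($80 or ((CP shr 12) and $3F));
           Result[P + 2] := AnsiChar($80 or ((CP shr 6) and $3F));
           Result[P + 3] := AnsiChar($80 or (CP and $3F));
         end;
    end;
    Inc(P, SketchUtf8Size(CP));
  end;
end;

var
  WS: WideString;
  Utf8: AnsiString;
  K: Integer;
begin
  SetLength(WS, 4);
  WS[1] := WideChar($0041);
  WS[2] := WideChar($00E9);
  WS[3] := WideChar($D83D);
  WS[4] := WideChar($DE00);
  Utf8 := SketchUTF16ToUTF8(WS);
  for K := 1 to Length(Utf8) do
    Write(IntToHex(Ord(Utf8[K]), 2), ' '); // 41 C3 A9 F0 9F 98 80
  WriteLn;
end.
```

The first pass costs one extra scan of the input, but the result string is allocated exactly once regardless of how many non-ASCII characters occur.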

General Character Conversion Functions to UTF-16BE

SingleByteEncodingToUTF16Char

function SingleByteEncodingToUTF16Char(const W: word; const Encoding: TdomEncodingType): WideChar;

Converts a single-byte character of the specified encoding into a UTF-16BE WideChar.

Parameters

W

The code point of the single byte character to be converted.

Encoding

The encoding of the character to be converted.

Return Value

The equivalent of 'W' as a UTF-16BE WideChar.

Exceptions

EConvertError

Raised if 'Encoding' is not a single-byte encoding or 'W' is an invalid character in the specified encoding.

Special Character Conversion Functions to UTF-16BE

The Unicode Utility Library contains more than 70 functions for character conversion from single-byte encoding schemes to UTF-16BE. All these functions share the same structure:

function ...ToUTF16Char(const W: word): WideChar;

Parameters

W

The code point of the single byte character to be converted.

Return Value

The equivalent of 'W' as a UTF-16BE WideChar.

Exceptions

EConvertError

Raised if 'W' is an invalid code point in the source encoding scheme.
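
As an illustration of this structure, the sketch below maps ISO-8859-15 code points to UTF-16. ISO-8859-15 differs from ISO-8859-1 in eight positions, and all other code points map one-to-one onto U+0000..U+00FF. SketchIso8859_15ToUTF16Char is a hypothetical name; how the library's own Iso8859_15ToUTF16Char is implemented is not stated in this document.

```pascal
program CharToUTF16Sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical stand-in with the signature shared by the ...ToUTF16Char functions.
function SketchIso8859_15ToUTF16Char(const W: Word): WideChar;
begin
  case W of
    // The eight positions where ISO-8859-15 differs from ISO-8859-1.
    $A4: Result := WideChar($20AC); // EURO SIGN
    $A6: Result := WideChar($0160); // S WITH CARON
    $A8: Result := WideChar($0161); // s WITH CARON
    $B4: Result := WideChar($017D); // Z WITH CARON
    $B8: Result := WideChar($017E); // z WITH CARON
    $BC: Result := WideChar($0152); // LIGATURE OE
    $BD: Result := WideChar($0153); // LIGATURE oe
    $BE: Result := WideChar($0178); // Y WITH DIAERESIS
  else
    if W > $FF then
      raise EConvertError.CreateFmt('Invalid ISO-8859-15 code point: %d', [W]);
    // Everything else is identical to its Unicode code point.
    Result := WideChar(W);
  end;
end;

begin
  WriteLn(IntToHex(Ord(SketchIso8859_15ToUTF16Char($A4)), 4)); // 20AC
  WriteLn(IntToHex(Ord(SketchIso8859_15ToUTF16Char($E9)), 4)); // 00E9
  try
    SketchIso8859_15ToUTF16Char($100);
  except
    on E: EConvertError do
      WriteLn('EConvertError: ', E.Message);
  end;
end.
```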

US_ASCIIToUTF16Char

function US_ASCIIToUTF16Char(const W: word): WideChar;

Iso8859_1ToUTF16Char

function Iso8859_1ToUTF16Char(const W: word): WideChar;

Iso8859_2ToUTF16Char

function Iso8859_2ToUTF16Char(const W: word): WideChar;

Iso8859_3ToUTF16Char

function Iso8859_3ToUTF16Char(const W: word): WideChar;

Iso8859_4ToUTF16Char

function Iso8859_4ToUTF16Char(const W: word): WideChar;

Iso8859_5ToUTF16Char

function Iso8859_5ToUTF16Char(const W: word): WideChar;

Iso8859_6ToUTF16Char

function Iso8859_6ToUTF16Char(const W: word): WideChar;

Iso8859_7ToUTF16Char

function Iso8859_7ToUTF16Char(const W: word): WideChar;

Iso8859_8ToUTF16Char

function Iso8859_8ToUTF16Char(const W: word): WideChar;

Iso8859_9ToUTF16Char

function Iso8859_9ToUTF16Char(const W: word): WideChar;

Iso8859_10ToUTF16Char

function Iso8859_10ToUTF16Char(const W: word): WideChar;

Iso8859_13ToUTF16Char

function Iso8859_13ToUTF16Char(const W: word): WideChar;

Iso8859_14ToUTF16Char

function Iso8859_14ToUTF16Char(const W: word): WideChar;

Iso8859_15ToUTF16Char

function Iso8859_15ToUTF16Char(const W: word): WideChar;

KOI8_RToUTF16Char

function KOI8_RToUTF16Char(const W: word): WideChar;

JIS_X0201ToUTF16Char

function JIS_X0201ToUTF16Char(const W: word): WideChar;

nextStepToUTF16Char

function nextStepToUTF16Char(const W: word): WideChar;

cp10000_MacRomanToUTF16Char

function cp10000_MacRomanToUTF16Char(const W: word): WideChar;

cp10006_MacGreekToUTF16Char

function cp10006_MacGreekToUTF16Char(const W: word): WideChar;

cp10007_MacCyrillicToUTF16Char

function cp10007_MacCyrillicToUTF16Char(const W: word): WideChar;

cp10029_MacLatin2ToUTF16Char

function cp10029_MacLatin2ToUTF16Char(const W: word): WideChar;

cp10079_MacIcelandicToUTF16Char

function cp10079_MacIcelandicToUTF16Char(const W: word): WideChar;

cp10081_MacTurkishToUTF16Char

function cp10081_MacTurkishToUTF16Char(const W: word): WideChar;

cp037ToUTF16Char

function cp037ToUTF16Char(const W: word): WideChar;

cp424ToUTF16Char

function cp424ToUTF16Char(const W: word): WideChar;

cp437ToUTF16Char

function cp437ToUTF16Char(const W: word): WideChar;

cp437_DOSLatinUSToUTF16Char

function cp437_DOSLatinUSToUTF16Char(const W: word): WideChar;

cp500ToUTF16Char

function cp500ToUTF16Char(const W: word): WideChar;

cp737_DOSGreekToUTF16Char

function cp737_DOSGreekToUTF16Char(const W: word): WideChar;

cp775_DOSBaltRimToUTF16Char

function cp775_DOSBaltRimToUTF16Char(const W: word): WideChar;

cp850ToUTF16Char

function cp850ToUTF16Char(const W: word): WideChar;

cp850_DOSLatin1ToUTF16Char

function cp850_DOSLatin1ToUTF16Char(const W: word): WideChar;

cp852ToUTF16Char

function cp852ToUTF16Char(const W: word): WideChar;

cp852_DOSLatin2ToUTF16Char

function cp852_DOSLatin2ToUTF16Char(const W: word): WideChar;

cp855ToUTF16Char

function cp855ToUTF16Char(const W: word): WideChar;

cp855_DOSCyrillicToUTF16Char

function cp855_DOSCyrillicToUTF16Char(const W: word): WideChar;

cp856_Hebrew_PCToUTF16Char

function cp856_Hebrew_PCToUTF16Char(const W: word): WideChar;

cp857ToUTF16Char

function cp857ToUTF16Char(const W: word): WideChar;

cp857_DOSTurkishToUTF16Char

function cp857_DOSTurkishToUTF16Char(const W: word): WideChar;

cp860ToUTF16Char

function cp860ToUTF16Char(const W: word): WideChar;

cp860_DOSPortugueseToUTF16Char

function cp860_DOSPortugueseToUTF16Char(const W: word): WideChar;

cp861ToUTF16Char

function cp861ToUTF16Char(const W: word): WideChar;

cp861_DOSIcelandicToUTF16Char

function cp861_DOSIcelandicToUTF16Char(const W: word): WideChar;

cp862ToUTF16Char

function cp862ToUTF16Char(const W: word): WideChar;

cp862_DOSHebrewToUTF16Char

function cp862_DOSHebrewToUTF16Char(const W: word): WideChar;

cp863ToUTF16Char

function cp863ToUTF16Char(const W: word): WideChar;

cp863_DOSCanadaFToUTF16Char

function cp863_DOSCanadaFToUTF16Char(const W: word): WideChar;

cp864ToUTF16Char

function cp864ToUTF16Char(const W: word): WideChar;

cp864_DOSArabicToUTF16Char

function cp864_DOSArabicToUTF16Char(const W: word): WideChar;

cp865ToUTF16Char

function cp865ToUTF16Char(const W: word): WideChar;

cp865_DOSNordicToUTF16Char

function cp865_DOSNordicToUTF16Char(const W: word): WideChar;

cp866ToUTF16Char

function cp866ToUTF16Char(const W: word): WideChar;

cp866_DOSCyrillicRussianToUTF16Char

function cp866_DOSCyrillicRussianToUTF16Char(const W: word): WideChar;

cp869ToUTF16Char

function cp869ToUTF16Char(const W: word): WideChar;

cp869_DOSGreek2ToUTF16Char

function cp869_DOSGreek2ToUTF16Char(const W: word): WideChar;

cp874ToUTF16Char

function cp874ToUTF16Char(const W: word): WideChar;

cp875ToUTF16Char

function cp875ToUTF16Char(const W: word): WideChar;

cp1006ToUTF16Char

function cp1006ToUTF16Char(const W: word): WideChar;

cp1026ToUTF16Char

function cp1026ToUTF16Char(const W: word): WideChar;

cp1250ToUTF16Char

function cp1250ToUTF16Char(const W: word): WideChar;

cp1251ToUTF16Char

function cp1251ToUTF16Char(const W: word): WideChar;

cp1252ToUTF16Char

function cp1252ToUTF16Char(const W: word): WideChar;

cp1253ToUTF16Char

function cp1253ToUTF16Char(const W: word): WideChar;

cp1254ToUTF16Char

function cp1254ToUTF16Char(const W: word): WideChar;

cp1255ToUTF16Char

function cp1255ToUTF16Char(const W: word): WideChar;

cp1256ToUTF16Char

function cp1256ToUTF16Char(const W: word): WideChar;

cp1257ToUTF16Char

function cp1257ToUTF16Char(const W: word): WideChar;

cp1258ToUTF16Char

function cp1258ToUTF16Char(const W: word): WideChar;

String Conversion Functions to UTF-16BE

The Unicode Utility Library contains more than 70 functions for string conversion from single-byte encoding schemes to UTF-16BE. All these functions share the same structure:

function ...ToUTF16Str(const S: string): WideString;

Parameters

S

The single byte encoded string to be converted.

Return Value

The equivalent of 'S' as a UTF-16BE WideString.

Exceptions

EConvertError

Raised if 'S' contains an invalid code point in the source encoding scheme.
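
A string conversion of this kind can be thought of as applying the corresponding character conversion to every byte of the input. The sketch below shows this for US-ASCII, including the EConvertError raised for a byte outside the encoding. The Sketch... names are hypothetical, and the loop is an assumption about the general approach rather than the library's code.

```pascal
program StrToUTF16Sketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

function SketchAsciiToUTF16Char(const W: Word): WideChar;
begin
  if W > $7F then
    raise EConvertError.CreateFmt('Invalid US-ASCII code point: %d', [W]);
  Result := WideChar(W);
end;

// Hypothetical stand-in with the signature shared by the ...ToUTF16Str functions.
function SketchAsciiToUTF16Str(const S: AnsiString): WideString;
var
  I: Integer;
begin
  // One input byte yields exactly one WideChar.
  SetLength(Result, Length(S));
  for I := 1 to Length(S) do
    Result[I] := SketchAsciiToUTF16Char(Ord(S[I]));
end;

var
  S: AnsiString;
  WS: WideString;
begin
  S := 'abc';
  WS := SketchAsciiToUTF16Str(S);
  WriteLn(Length(WS)); // 3
  // Byte $E9 is not a valid US-ASCII code point.
  S[2] := AnsiChar($E9);
  try
    WS := SketchAsciiToUTF16Str(S);
  except
    on E: EConvertError do
      WriteLn('EConvertError: ', E.Message);
  end;
end.
```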

US_ASCIIToUTF16Str

function US_ASCIIToUTF16Str(const S: string): WideString;

Iso8859_1ToUTF16Str

function Iso8859_1ToUTF16Str(const S: string): WideString;

Iso8859_2ToUTF16Str

function Iso8859_2ToUTF16Str(const S: string): WideString;

Iso8859_3ToUTF16Str

function Iso8859_3ToUTF16Str(const S: string): WideString;

Iso8859_4ToUTF16Str

function Iso8859_4ToUTF16Str(const S: string): WideString;

Iso8859_5ToUTF16Str

function Iso8859_5ToUTF16Str(const S: string): WideString;

Iso8859_6ToUTF16Str

function Iso8859_6ToUTF16Str(const S: string): WideString;

Iso8859_7ToUTF16Str

function Iso8859_7ToUTF16Str(const S: string): WideString;

Iso8859_8ToUTF16Str

function Iso8859_8ToUTF16Str(const S: string): WideString;

Iso8859_9ToUTF16Str

function Iso8859_9ToUTF16Str(const S: string): WideString;

Iso8859_10ToUTF16Str

function Iso8859_10ToUTF16Str(const S: string): WideString;

Iso8859_13ToUTF16Str

function Iso8859_13ToUTF16Str(const S: string): WideString;

Iso8859_14ToUTF16Str

function Iso8859_14ToUTF16Str(const S: string): WideString;

Iso8859_15ToUTF16Str

function Iso8859_15ToUTF16Str(const S: string): WideString;

KOI8_RToUTF16Str

function KOI8_RToUTF16Str(const S: string): WideString;

JIS_X0201ToUTF16Str

function JIS_X0201ToUTF16Str(const S: string): WideString;

nextStepToUTF16Str

function nextStepToUTF16Str(const S: string): WideString;

cp10000_MacRomanToUTF16Str

function cp10000_MacRomanToUTF16Str(const S: string): WideString;

cp10006_MacGreekToUTF16Str

function cp10006_MacGreekToUTF16Str(const S: string): WideString;

cp10007_MacCyrillicToUTF16Str

function cp10007_MacCyrillicToUTF16Str(const S: string): WideString;

cp10029_MacLatin2ToUTF16Str

function cp10029_MacLatin2ToUTF16Str(const S: string): WideString;

cp10079_MacIcelandicToUTF16Str

function cp10079_MacIcelandicToUTF16Str(const S: string): WideString;

cp10081_MacTurkishToUTF16Str

function cp10081_MacTurkishToUTF16Str(const S: string): WideString;

cp037ToUTF16Str

function cp037ToUTF16Str(const S: string): WideString;

cp424ToUTF16Str

function cp424ToUTF16Str(const S: string): WideString;

cp437ToUTF16Str

function cp437ToUTF16Str(const S: string): WideString;

cp437_DOSLatinUSToUTF16Str

function cp437_DOSLatinUSToUTF16Str(const S: string): WideString;

cp500ToUTF16Str

function cp500ToUTF16Str(const S: string): WideString;

cp737_DOSGreekToUTF16Str

function cp737_DOSGreekToUTF16Str(const S: string): WideString;

cp775_DOSBaltRimToUTF16Str

function cp775_DOSBaltRimToUTF16Str(const S: string): WideString;

cp850ToUTF16Str

function cp850ToUTF16Str(const S: string): WideString;

cp850_DOSLatin1ToUTF16Str

function cp850_DOSLatin1ToUTF16Str(const S: string): WideString;

cp852ToUTF16Str

function cp852ToUTF16Str(const S: string): WideString;

cp852_DOSLatin2ToUTF16Str

function cp852_DOSLatin2ToUTF16Str(const S: string): WideString;

cp855ToUTF16Str

function cp855ToUTF16Str(const S: string): WideString;

cp855_DOSCyrillicToUTF16Str

function cp855_DOSCyrillicToUTF16Str(const S: string): WideString;

cp856_Hebrew_PCToUTF16Str

function cp856_Hebrew_PCToUTF16Str(const S: string): WideString;

cp857ToUTF16Str

function cp857ToUTF16Str(const S: string): WideString;

cp857_DOSTurkishToUTF16Str

function cp857_DOSTurkishToUTF16Str(const S: string): WideString;

cp860ToUTF16Str

function cp860ToUTF16Str(const S: string): WideString;

cp860_DOSPortugueseToUTF16Str

function cp860_DOSPortugueseToUTF16Str(const S: string): WideString;

cp861ToUTF16Str

function cp861ToUTF16Str(const S: string): WideString;

cp861_DOSIcelandicToUTF16Str

function cp861_DOSIcelandicToUTF16Str(const S: string): WideString;

cp862ToUTF16Str

function cp862ToUTF16Str(const S: string): WideString;

cp862_DOSHebrewToUTF16Str

function cp862_DOSHebrewToUTF16Str(const S: string): WideString;

cp863ToUTF16Str

function cp863ToUTF16Str(const S: string): WideString;

cp863_DOSCanadaFToUTF16Str

function cp863_DOSCanadaFToUTF16Str(const S: string): WideString;

cp864ToUTF16Str

function cp864ToUTF16Str(const S: string): WideString;

cp864_DOSArabicToUTF16Str

function cp864_DOSArabicToUTF16Str(const S: string): WideString;

cp865ToUTF16Str

function cp865ToUTF16Str(const S: string): WideString;

cp865_DOSNordicToUTF16Str

function cp865_DOSNordicToUTF16Str(const S: string): WideString;

cp866ToUTF16Str

function cp866ToUTF16Str(const S: string): WideString;

cp866_DOSCyrillicRussianToUTF16Str

function cp866_DOSCyrillicRussianToUTF16Str(const S: string): WideString;

cp869ToUTF16Str

function cp869ToUTF16Str(const S: string): WideString;

cp869_DOSGreek2ToUTF16Str

function cp869_DOSGreek2ToUTF16Str(const S: string): WideString;

cp874ToUTF16Str

function cp874ToUTF16Str(const S: string): WideString;

cp875ToUTF16Str

function cp875ToUTF16Str(const S: string): WideString;

cp1006ToUTF16Str

function cp1006ToUTF16Str(const S: string): WideString;

cp1026ToUTF16Str

function cp1026ToUTF16Str(const S: string): WideString;

cp1250ToUTF16Str

function cp1250ToUTF16Str(const S: string): WideString;

cp1251ToUTF16Str

function cp1251ToUTF16Str(const S: string): WideString;

cp1252ToUTF16Str

function cp1252ToUTF16Str(const S: string): WideString;

cp1253ToUTF16Str

function cp1253ToUTF16Str(const S: string): WideString;

cp1254ToUTF16Str

function cp1254ToUTF16Str(const S: string): WideString;

cp1255ToUTF16Str

function cp1255ToUTF16Str(const S: string): WideString;

cp1256ToUTF16Str

function cp1256ToUTF16Str(const S: string): WideString;

cp1257ToUTF16Str

function cp1257ToUTF16Str(const S: string): WideString;

cp1258ToUTF16Str

function cp1258ToUTF16Str(const S: string): WideString;

Character Conversion Functions from UTF-16

The Unicode Converter Library contains more than 70 functions for character conversion from UTF-16 to single-byte encoding schemes. All of these functions share the following structure:

function UTF16To...Char(const I: longint): Char;

Parameters

I

The code point of the UTF-16 character to be converted.

Return Value

The equivalent of 'I' as a Char in the specified encoding.

Exceptions

EConvertError

Raised if 'I' has no equivalent code point in the target encoding scheme.
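The following is a minimal usage sketch of the pattern described above, not code taken from the library. It assumes that the functions are declared in a unit named UnicodeUtils (inferred from the file name "UnicodeUtils.pas") and that EConvertError is the standard SysUtils exception class. The chosen code points are only illustrative.

```pascal
program Utf16CharDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils,      // EConvertError
  UnicodeUtils;  // assumption: unit name inferred from "UnicodeUtils.pas"

var
  C: Char;
begin
  // U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is part of ISO-8859-1,
  // so this call is expected to succeed.
  C := UTF16ToIso8859_1Char($00E9);
  WriteLn('Byte value in ISO-8859-1: ', Ord(C));

  // U+20AC (EURO SIGN) is not part of ISO-8859-1,
  // so EConvertError is expected to be raised.
  try
    C := UTF16ToIso8859_1Char($20AC);
    WriteLn('Unexpected success: ', Ord(C));
  except
    on E: EConvertError do
      WriteLn('No equivalent in ISO-8859-1: ', E.Message);
  end;
end.
```

Wrapping each call in try/except keeps a single unmappable code point from aborting the caller; whether to substitute a replacement character in the handler is an application decision.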

UTF16ToUS_ASCIIChar

function UTF16ToUS_ASCIIChar(const I: longint): Char;

UTF16ToIso8859_1Char

function UTF16ToIso8859_1Char(const I: longint): Char;

UTF16ToIso8859_2Char

function UTF16ToIso8859_2Char(const I: longint): Char;

UTF16ToIso8859_3Char

function UTF16ToIso8859_3Char(const I: longint): Char;

UTF16ToIso8859_4Char

function UTF16ToIso8859_4Char(const I: longint): Char;

UTF16ToIso8859_5Char

function UTF16ToIso8859_5Char(const I: longint): Char;

UTF16ToIso8859_6Char

function UTF16ToIso8859_6Char(const I: longint): Char;

UTF16ToIso8859_7Char

function UTF16ToIso8859_7Char(const I: longint): Char;

UTF16ToIso8859_8Char

function UTF16ToIso8859_8Char(const I: longint): Char;

UTF16ToIso8859_9Char

function UTF16ToIso8859_9Char(const I: longint): Char;

UTF16ToIso8859_10Char

function UTF16ToIso8859_10Char(const I: longint): Char;

UTF16ToIso8859_13Char

function UTF16ToIso8859_13Char(const I: longint): Char;

UTF16ToIso8859_14Char

function UTF16ToIso8859_14Char(const I: longint): Char;

UTF16ToIso8859_15Char

function UTF16ToIso8859_15Char(const I: longint): Char;

UTF16ToKOI8_RChar

function UTF16ToKOI8_RChar(const I: longint): Char;

UTF16ToJIS_X0201Char

function UTF16ToJIS_X0201Char(const I: longint): Char;

UTF16ToNextStepChar

function UTF16ToNextStepChar(const I: longint): Char;

UTF16ToCp10000_MacRomanChar

function UTF16ToCp10000_MacRomanChar(const I: longint): Char;

UTF16ToCp10006_MacGreekChar

function UTF16ToCp10006_MacGreekChar(const I: longint): Char;

UTF16ToCp10007_MacCyrillicChar

function UTF16ToCp10007_MacCyrillicChar(const I: longint): Char;

UTF16ToCp10029_MacLatin2Char

function UTF16ToCp10029_MacLatin2Char(const I: longint): Char;

UTF16ToCp10079_MacIcelandicChar

function UTF16ToCp10079_MacIcelandicChar(const I: longint): Char;

UTF16ToCp10081_MacTurkishChar

function UTF16ToCp10081_MacTurkishChar(const I: longint): Char;

UTF16ToCp037Char

function UTF16ToCp037Char(const I: longint): Char;

UTF16ToCp424Char

function UTF16ToCp424Char(const I: longint): Char;

UTF16ToCp437Char

function UTF16ToCp437Char(const I: longint): Char;

UTF16ToCp437_DOSLatinUSChar

function UTF16ToCp437_DOSLatinUSChar(const I: longint): Char;

UTF16ToCp500Char

function UTF16ToCp500Char(const I: longint): Char;

UTF16ToCp737_DOSGreekChar

function UTF16ToCp737_DOSGreekChar(const I: longint): Char;

UTF16ToCp775_DOSBaltRimChar

function UTF16ToCp775_DOSBaltRimChar(const I: longint): Char;

UTF16ToCp850Char

function UTF16ToCp850Char(const I: longint): Char;

UTF16ToCp850_DOSLatin1Char

function UTF16ToCp850_DOSLatin1Char(const I: longint): Char;

UTF16ToCp852Char

function UTF16ToCp852Char(const I: longint): Char;

UTF16ToCp852_DOSLatin2Char

function UTF16ToCp852_DOSLatin2Char(const I: longint): Char;

UTF16ToCp855Char

function UTF16ToCp855Char(const I: longint): Char;

UTF16ToCp855_DOSCyrillicChar

function UTF16ToCp855_DOSCyrillicChar(const I: longint): Char;

UTF16ToCp856_Hebrew_PCChar

function UTF16ToCp856_Hebrew_PCChar(const I: longint): Char;

UTF16ToCp857Char

function UTF16ToCp857Char(const I: longint): Char;

UTF16ToCp857_DOSTurkishChar

function UTF16ToCp857_DOSTurkishChar(const I: longint): Char;

UTF16ToCp860Char

function UTF16ToCp860Char(const I: longint): Char;

UTF16ToCp860_DOSPortugueseChar

function UTF16ToCp860_DOSPortugueseChar(const I: longint): Char;

UTF16ToCp861Char

function UTF16ToCp861Char(const I: longint): Char;

UTF16ToCp861_DOSIcelandicChar

function UTF16ToCp861_DOSIcelandicChar(const I: longint): Char;

UTF16ToCp862Char

function UTF16ToCp862Char(const I: longint): Char;

UTF16ToCp862_DOSHebrewChar

function UTF16ToCp862_DOSHebrewChar(const I: longint): Char;

UTF16ToCp863Char

function UTF16ToCp863Char(const I: longint): Char;

UTF16ToCp863_DOSCanadaFChar

function UTF16ToCp863_DOSCanadaFChar(const I: longint): Char;

UTF16ToCp864Char

function UTF16ToCp864Char(const I: longint): Char;

UTF16ToCp864_DOSArabicChar

function UTF16ToCp864_DOSArabicChar(const I: longint): Char;

UTF16ToCp865Char

function UTF16ToCp865Char(const I: longint): Char;

UTF16ToCp865_DOSNordicChar

function UTF16ToCp865_DOSNordicChar(const I: longint): Char;

UTF16ToCp866Char

function UTF16ToCp866Char(const I: longint): Char;

UTF16ToCp866_DOSCyrillicRussianChar

function UTF16ToCp866_DOSCyrillicRussianChar(const I: longint): Char;

UTF16ToCp869Char

function UTF16ToCp869Char(const I: longint): Char;

UTF16ToCp869_DOSGreek2Char

function UTF16ToCp869_DOSGreek2Char(const I: longint): Char;

UTF16ToCp874Char

function UTF16ToCp874Char(const I: longint): Char;

UTF16ToCp875Char

function UTF16ToCp875Char(const I: longint): Char;

UTF16ToCp1006Char

function UTF16ToCp1006Char(const I: longint): Char;

UTF16ToCp1026Char

function UTF16ToCp1026Char(const I: longint): Char;

UTF16ToCp1250Char

function UTF16ToCp1250Char(const I: longint): Char;

UTF16ToCp1251Char

function UTF16ToCp1251Char(const I: longint): Char;

UTF16ToCp1252Char

function UTF16ToCp1252Char(const I: longint): Char;

UTF16ToCp1253Char

function UTF16ToCp1253Char(const I: longint): Char;

UTF16ToCp1254Char

function UTF16ToCp1254Char(const I: longint): Char;

UTF16ToCp1255Char

function UTF16ToCp1255Char(const I: longint): Char;

UTF16ToCp1256Char

function UTF16ToCp1256Char(const I: longint): Char;

UTF16ToCp1257Char

function UTF16ToCp1257Char(const I: longint): Char;

UTF16ToCp1258Char

function UTF16ToCp1258Char(const I: longint): Char;

String Conversion Functions from UTF-16BE

The Unicode Converter Library contains more than 70 functions for string conversion from UTF-16 to single-byte encoding schemes. All of these functions share the following structure:

function UTF16To...Str(const S: WideString): string;

Parameters

S

The UTF-16 WideString to be converted.

Return Value

The equivalent of 'S' as a string in the specified encoding.

Exceptions

EConvertError

Raised if a code point in 'S' has no equivalent code point in the target encoding scheme.
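As a hedged illustration of the string variants, the sketch below converts one WideString to two different target encodings. The unit name UnicodeUtils is an assumption inferred from the file name "UnicodeUtils.pas"; the sample text is invented for the example.

```pascal
program Utf16StrDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils,      // EConvertError
  UnicodeUtils;  // assumption: unit name inferred from "UnicodeUtils.pas"

var
  W: WideString;
  S: string;
begin
  // Build 'Price: ' followed by U+20AC (EURO SIGN).
  W := 'Price: ';
  W := W + WideChar($20AC);

  // Windows code page 1252 contains the euro sign,
  // so this conversion is expected to succeed.
  S := UTF16ToCp1252Str(W);
  WriteLn('Length of the cp1252 string: ', Length(S));

  // US-ASCII has no euro sign, so EConvertError is expected here.
  try
    S := UTF16ToUS_ASCIIStr(W);
    WriteLn('Unexpected success, length: ', Length(S));
  except
    on E: EConvertError do
      WriteLn('Not representable in US-ASCII: ', E.Message);
  end;
end.
```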

UTF16ToUS_ASCIIStr

function UTF16ToUS_ASCIIStr(const S: WideString): string;

UTF16ToIso8859_1Str

function UTF16ToIso8859_1Str(const S: WideString): string;

UTF16ToIso8859_2Str

function UTF16ToIso8859_2Str(const S: WideString): string;

UTF16ToIso8859_3Str

function UTF16ToIso8859_3Str(const S: WideString): string;

UTF16ToIso8859_4Str

function UTF16ToIso8859_4Str(const S: WideString): string;

UTF16ToIso8859_5Str

function UTF16ToIso8859_5Str(const S: WideString): string;

UTF16ToIso8859_6Str

function UTF16ToIso8859_6Str(const S: WideString): string;

UTF16ToIso8859_7Str

function UTF16ToIso8859_7Str(const S: WideString): string;

UTF16ToIso8859_8Str

function UTF16ToIso8859_8Str(const S: WideString): string;

UTF16ToIso8859_9Str

function UTF16ToIso8859_9Str(const S: WideString): string;

UTF16ToIso8859_10Str

function UTF16ToIso8859_10Str(const S: WideString): string;

UTF16ToIso8859_13Str

function UTF16ToIso8859_13Str(const S: WideString): string;

UTF16ToIso8859_14Str

function UTF16ToIso8859_14Str(const S: WideString): string;

UTF16ToIso8859_15Str

function UTF16ToIso8859_15Str(const S: WideString): string;

UTF16ToKOI8_RStr

function UTF16ToKOI8_RStr(const S: WideString): string;

UTF16ToJIS_X0201Str

function UTF16ToJIS_X0201Str(const S: WideString): string;

UTF16ToNextStepStr

function UTF16ToNextStepStr(const S: WideString): string;

UTF16ToCp10000_MacRomanStr

function UTF16ToCp10000_MacRomanStr(const S: WideString): string;

UTF16ToCp10006_MacGreekStr

function UTF16ToCp10006_MacGreekStr(const S: WideString): string;

UTF16ToCp10007_MacCyrillicStr

function UTF16ToCp10007_MacCyrillicStr(const S: WideString): string;

UTF16ToCp10029_MacLatin2Str

function UTF16ToCp10029_MacLatin2Str(const S: WideString): string;

UTF16ToCp10079_MacIcelandicStr

function UTF16ToCp10079_MacIcelandicStr(const S: WideString): string;

UTF16ToCp10081_MacTurkishStr

function UTF16ToCp10081_MacTurkishStr(const S: WideString): string;

UTF16ToCp037Str

function UTF16ToCp037Str(const S: WideString): string;

UTF16ToCp424Str

function UTF16ToCp424Str(const S: WideString): string;

UTF16ToCp437Str

function UTF16ToCp437Str(const S: WideString): string;

UTF16ToCp437_DOSLatinUSStr

function UTF16ToCp437_DOSLatinUSStr(const S: WideString): string;

UTF16ToCp500Str

function UTF16ToCp500Str(const S: WideString): string;

UTF16ToCp737_DOSGreekStr

function UTF16ToCp737_DOSGreekStr(const S: WideString): string;

UTF16ToCp775_DOSBaltRimStr

function UTF16ToCp775_DOSBaltRimStr(const S: WideString): string;

UTF16ToCp850Str

function UTF16ToCp850Str(const S: WideString): string;

UTF16ToCp850_DOSLatin1Str

function UTF16ToCp850_DOSLatin1Str(const S: WideString): string;

UTF16ToCp852Str

function UTF16ToCp852Str(const S: WideString): string;

UTF16ToCp852_DOSLatin2Str

function UTF16ToCp852_DOSLatin2Str(const S: WideString): string;

UTF16ToCp855Str

function UTF16ToCp855Str(const S: WideString): string;

UTF16ToCp855_DOSCyrillicStr

function UTF16ToCp855_DOSCyrillicStr(const S: WideString): string;

UTF16ToCp856_Hebrew_PCStr

function UTF16ToCp856_Hebrew_PCStr(const S: WideString): string;

UTF16ToCp857Str

function UTF16ToCp857Str(const S: WideString): string;

UTF16ToCp857_DOSTurkishStr

function UTF16ToCp857_DOSTurkishStr(const S: WideString): string;

UTF16ToCp860Str

function UTF16ToCp860Str(const S: WideString): string;

UTF16ToCp860_DOSPortugueseStr

function UTF16ToCp860_DOSPortugueseStr(const S: WideString): string;

UTF16ToCp861Str

function UTF16ToCp861Str(const S: WideString): string;

UTF16ToCp861_DOSIcelandicStr

function UTF16ToCp861_DOSIcelandicStr(const S: WideString): string;

UTF16ToCp862Str

function UTF16ToCp862Str(const S: WideString): string;

UTF16ToCp862_DOSHebrewStr

function UTF16ToCp862_DOSHebrewStr(const S: WideString): string;

UTF16ToCp863Str

function UTF16ToCp863Str(const S: WideString): string;

UTF16ToCp863_DOSCanadaFStr

function UTF16ToCp863_DOSCanadaFStr(const S: WideString): string;

UTF16ToCp864Str

function UTF16ToCp864Str(const S: WideString): string;

UTF16ToCp864_DOSArabicStr

function UTF16ToCp864_DOSArabicStr(const S: WideString): string;

UTF16ToCp865Str

function UTF16ToCp865Str(const S: WideString): string;

UTF16ToCp865_DOSNordicStr

function UTF16ToCp865_DOSNordicStr(const S: WideString): string;

UTF16ToCp866Str

function UTF16ToCp866Str(const S: WideString): string;

UTF16ToCp866_DOSCyrillicRussianStr

function UTF16ToCp866_DOSCyrillicRussianStr(const S: WideString): string;

UTF16ToCp869Str

function UTF16ToCp869Str(const S: WideString): string;

UTF16ToCp869_DOSGreek2Str

function UTF16ToCp869_DOSGreek2Str(const S: WideString): string;

UTF16ToCp874Str

function UTF16ToCp874Str(const S: WideString): string;

UTF16ToCp875Str

function UTF16ToCp875Str(const S: WideString): string;

UTF16ToCp1006Str

function UTF16ToCp1006Str(const S: WideString): string;

UTF16ToCp1026Str

function UTF16ToCp1026Str(const S: WideString): string;

UTF16ToCp1250Str

function UTF16ToCp1250Str(const S: WideString): string;

UTF16ToCp1251Str

function UTF16ToCp1251Str(const S: WideString): string;

UTF16ToCp1252Str

function UTF16ToCp1252Str(const S: WideString): string;

UTF16ToCp1253Str

function UTF16ToCp1253Str(const S: WideString): string;

UTF16ToCp1254Str

function UTF16ToCp1254Str(const S: WideString): string;

UTF16ToCp1255Str

function UTF16ToCp1255Str(const S: WideString): string;

UTF16ToCp1256Str

function UTF16ToCp1256Str(const S: WideString): string;

UTF16ToCp1257Str

function UTF16ToCp1257Str(const S: WideString): string;

UTF16ToCp1258Str

function UTF16ToCp1258Str(const S: WideString): string;

CSMIB classes

These classes are designed for easy access to the Management Information Base (MIB) for character set encodings as specified in [CSMIB].

CSMIB Exception Classes

ECSMIBException = Exception;

ECSMIBException is the exception class for errors in the TCSMIB class.

CSMIB Event Classes

TCSMIBChangingEvent = procedure (Sender: TObject;
                                  NewEnum: integer;
                                  var AllowChange: Boolean) of object;

This event class is used for the TCSMIB.OnChanging event.

The TCSMIB Class

The TCSMIB component decodes MIB enumeration (MIBenum) values which identify coded character sets as specified in [CSMIB].

Published Properties

property Enum: integer

The Enum property contains the unique MIB enum value to identify a coded character set.

If an invalid value is assigned, then, depending on the value of the IgnoreInvalidEnum property, either an ECSMIBException is raised or the attempt is silently ignored.

property IgnoreInvalidEnum: boolean

If set to TRUE an attempt to set the Enum property to an invalid value is silently ignored, i.e. Enum will remain the same and no notification about the failure is made.

If set to FALSE an attempt to set the Enum property to an invalid value results in an ECSMIBException being raised.
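The interplay of Enum and IgnoreInvalidEnum might be exercised as in the sketch below. Several points are assumptions and not confirmed by this document: that TCSMIB lives in the UnicodeUtils unit, that it has the usual TComponent constructor Create(AOwner), and that -1 is treated as an invalid MIBenum value. The value 106 is the MIBenum that the IANA registry assigns to UTF-8.

```pascal
program CSMIBEnumDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils,
  Classes,
  UnicodeUtils;  // assumption: the unit that declares TCSMIB

var
  MIB: TCSMIB;
begin
  MIB := TCSMIB.Create(nil);  // assumption: TComponent-style constructor
  try
    MIB.Enum := 106;  // IANA MIBenum value for UTF-8

    // With IgnoreInvalidEnum = True an invalid assignment is silently dropped.
    MIB.IgnoreInvalidEnum := True;
    MIB.Enum := -1;   // assumption: -1 is not a valid MIBenum value
    WriteLn('Enum after ignored assignment: ', MIB.Enum);

    // With IgnoreInvalidEnum = False the same assignment should raise.
    MIB.IgnoreInvalidEnum := False;
    try
      MIB.Enum := -1;
    except
      on E: ECSMIBException do
        WriteLn('Assignment rejected: ', E.Message);
    end;
  finally
    MIB.Free;
  end;
end.
```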

Public Properties

property Alias[i: integer]: string (readonly)

This property gives access to a list of official names for the character set with the MIB enum value specified in the Enum property.

These names are expressed in ANSI_X3.4-1968, also known as US-ASCII or simply ASCII. The names are not case-sensitive.

The aliases that start with "cs" have been added for use with the Printer MIB (see RFC 1759) and contain the standard numbers along with suggestive names in order to facilitate applications that want to display the names in user interfaces. The "cs" stands for character set and is provided for applications that need a lower case first letter but want to use mixed case thereafter that cannot contain any special characters, such as underbar ("_") and dash ("-").

The i parameter corresponds to the position of an alias in the list, where 0 is the first alias, 1 is the second alias, and so on. If there is no alias corresponding to the value of i, an ECSMIBException is raised.

The first alias, i.e. Alias[0], always contains the MIB name of the corresponding character set.

property AliasCount: integer (readonly)

This property represents the number of aliases in the list for the names of the coded character set with the MIB enum value specified in the Enum property.

Use the AliasCount property when iterating over all the aliases in the list, or when trying to locate the position of an alias relative to the last alias in the list.

property PreferredMIMEName: string (readonly)

This property contains the preferred MIME name, if any, of the coded character set with the MIB enum value specified in the Enum property. If no preferred MIME name is specified, the property is an empty string.
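The read-only properties above could be combined as in the following sketch, which lists every alias of one character set and then its preferred MIME name. The unit name and the constructor signature are assumptions; the value 4 is the MIBenum that the IANA registry assigns to ISO_8859-1:1987.

```pascal
program CSMIBAliasDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils,
  Classes,
  UnicodeUtils;  // assumption: the unit that declares TCSMIB

var
  MIB: TCSMIB;
  I: Integer;
begin
  MIB := TCSMIB.Create(nil);  // assumption: TComponent-style constructor
  try
    MIB.Enum := 4;  // IANA MIBenum value for ISO_8859-1:1987

    // Valid indices run from 0 to AliasCount - 1; Alias[0] is the MIB name.
    for I := 0 to MIB.AliasCount - 1 do
      WriteLn(I, ': ', MIB.Alias[I]);

    // PreferredMIMEName is an empty string if none is registered.
    if MIB.PreferredMIMEName <> '' then
      WriteLn('Preferred MIME name: ', MIB.PreferredMIMEName)
    else
      WriteLn('No preferred MIME name registered.');
  finally
    MIB.Free;
  end;
end.
```

Bounding the loop by AliasCount avoids the ECSMIBException that an out-of-range index would raise.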

Public Methods

function IsValidEnum(const Value: integer): boolean; virtual;

Returns TRUE if the specified value is a valid MIBenum value. Otherwise FALSE is returned.

function SetToAlias(const S: string): boolean; virtual;

Tries to set the Enum property to the value corresponding to the specified alias. The comparison is not case-sensitive. If the attempt was successful, TRUE is returned. Otherwise FALSE is returned and the value of the Enum property remains the same.
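A possible use of SetToAlias and IsValidEnum is sketched below. The unit name and constructor signature are assumptions, 'no-such-charset' is an invented name that is expected not to match any alias, and -1 is assumed to be an invalid MIBenum value.

```pascal
program CSMIBSetToAliasDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils,
  Classes,
  UnicodeUtils;  // assumption: the unit that declares TCSMIB

var
  MIB: TCSMIB;
begin
  MIB := TCSMIB.Create(nil);  // assumption: TComponent-style constructor
  try
    // "UTF-8" is a registered IANA name; case should not matter.
    if MIB.SetToAlias('utf-8') then
      WriteLn('Enum is now: ', MIB.Enum)
    else
      WriteLn('Alias "utf-8" was not recognized.');

    // An unknown name should leave Enum untouched and return False.
    if not MIB.SetToAlias('no-such-charset') then
      WriteLn('Unknown alias; Enum is still: ', MIB.Enum);

    // IsValidEnum allows checking a value before assigning it to Enum.
    if not MIB.IsValidEnum(-1) then
      WriteLn('-1 is not a valid MIBenum value.');
  finally
    MIB.Free;
  end;
end.
```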

Events

OnChange: TNotifyEvent

Occurs after the Enum property has changed.

type TNotifyEvent = procedure (Sender: TObject) of object;

OnChanging: TCSMIBChangingEvent

Occurs just before a change is made to the Enum property.

TCSMIBChangingEvent = procedure (Sender: TObject; NewEnum: integer; var AllowChange: Boolean) of object;

Write an OnChanging event handler to conditionally block changes to the Enum property. Set the AllowChange parameter to FALSE to prevent the change from taking place. The NewEnum parameter is the new Enum value about to be set.
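An OnChanging handler of the kind described above might look like the following sketch. TEnumGuard and its methods are hypothetical names introduced only for this example; the unit name and the constructor signature are assumptions as well. The MIBenum values 3, 4 and 106 are the IANA assignments for US-ASCII, ISO_8859-1:1987 and UTF-8.

```pascal
program CSMIBChangingDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils,
  Classes,
  UnicodeUtils;  // assumption: the unit that declares TCSMIB

type
  // Hypothetical helper class; event handlers must be methods of an object.
  TEnumGuard = class
  public
    procedure HandleChanging(Sender: TObject; NewEnum: integer;
                             var AllowChange: Boolean);
    procedure HandleChange(Sender: TObject);
  end;

procedure TEnumGuard.HandleChanging(Sender: TObject; NewEnum: integer;
                                    var AllowChange: Boolean);
begin
  // Permit only US-ASCII (3) and UTF-8 (106); block everything else.
  AllowChange := (NewEnum = 3) or (NewEnum = 106);
end;

procedure TEnumGuard.HandleChange(Sender: TObject);
begin
  WriteLn('Enum changed to: ', (Sender as TCSMIB).Enum);
end;

var
  MIB: TCSMIB;
  Guard: TEnumGuard;
begin
  Guard := TEnumGuard.Create;
  MIB := TCSMIB.Create(nil);  // assumption: TComponent-style constructor
  try
    MIB.OnChanging := Guard.HandleChanging;
    MIB.OnChange := Guard.HandleChange;

    MIB.Enum := 106;  // allowed by the handler
    MIB.Enum := 4;    // ISO_8859-1:1987, expected to be blocked
    WriteLn('Final Enum value: ', MIB.Enum);
  finally
    MIB.Free;
    Guard.Free;
  end;
end.
```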

References

[CSMIB] IANA: Character Sets, 2001-08-23, see: "http://www.iana.org/assignments/character-sets".

[ISO/IEC 10646] ISO (International Organization for Standardization): ISO/IEC 10646-1993 (E). Information technology – Universal Multiple-Octet Coded Character Set (UCS) – Part 1: Architecture and Basic Multilingual Plane, [Geneva]: International Organization for Standardization, 1993 (+ amendments AM 1–7).

[RFC 2279] Yergeau, F.: "UTF-8, a Transformation Format of ISO 10646", RFC 2279, 1998, see "http://www.ietf.org/rfc/rfc2279.txt".

[RFC 2781] Hoffman, P. and F. Yergeau: "UTF-16, an Encoding of ISO 10646", RFC 2781, 2000, see "http://www.ietf.org/rfc/rfc2781.txt".

[Unicode 3.0] The Unicode Consortium: The Unicode Standard Version 3.0, Reading (Mass.): Addison-Wesley, 2000.