Data Transfer

Data Transfer is a set of tools for viewing and modifying Charset and Encoding.

Figure 174. Data Transfer

Data Transfer

Charset

The character set conversion can be enabled at the data source level for each application if you specify the input Charset and database Charset for the cache in AEM.

You will find the Charset tool useful when:

  • The database uses non-UTF-8 character set, and

  • The language display of the application has error code in it

Otherwise, it is unnecessary to use this tool.

Figure 175. Charset settings

Charset settings

Configuring database Charset for a data source

Step 1: Click an application in the Application Name column. The Configure Charset window opens.

Figure 176. Configure charset settings

Configure charset settings

Step 2: Click the Add Charset button. The Add Charset window opens.

Figure 177. Add charset settings

Add charset settings

Step 3: Select the data source from the Data Source dropdown list.

Step 4: Select the database charset type from the Database Charset dropdown list. The Charset should be consistent with the Charset used in the database. This will not change the setting in the database.

Step 5: Select the input charset type from the Client-side Charset dropdown list. This setting should match the input Charset type at the client side.

Step 6: Click the Save button to confirm the configuration or Save and Add to add another one.

Updating a Charset

On the Charset page, click Update under Actions to update a Charset for a data source.

Deleting a Charset

On the Charset page, click Delete under Actions to delete an unwanted Charset for a data source.

Charset options given in the Charset fields

The following table lists all the Charset options provided in the "Database Charset" field and the "Client-side Charset" field, and provides a brief description of each Charset. If the actual database Charset or the input Charset is not provided as an option, you can use the following method to manually add the type as an option:

Step 1: Open the file constant.config in the directory %appeonserver%\AEM\config\.

Step 2: Add the Charset type as an entry into the file, and save the file.

For example, if the Charset type that you want to add is "gbk", you can add a new line <charset name="gbk" value="gbk"></charset> in the file.

Step 3: Restart Appeon Server and the "gbk" Charset will be added to the Charset lists.

The following table lists the character sets and code pages. The asterisk (*) at the last column indicates that Microsoft .NET Framework supports the code page, regardless of the platform.

Table 44. Charset and code pages

Page

Charset

Description

Asterisk

37

IBM037

IBM EBCDIC (US - Canada)

 

437

IBM437

OEM US

 

500

IBM500

IBM EBCDIC (International)

 

708

ASMO-708

Arabic (ASMO 708)

 

720

DOS-720

Arabic (DOS)

 

737

ibm737

Greek (DOS)

 

775

ibm775

Baltic (DOS)

 

850

ibm850

Western European (DOS)

 

852

ibm852

Central European (DOS)

 

855

IBM855

OEM Cyrillic

 

857

ibm857

Turkish (DOS)

 

858

IBM00858

OEM Multi-Language Latin I

 

860

IBM860

Portuguese (DOS)

 

861

ibm861

Iceland (DOS)

 

862

DOS-862

Hebrew (DOS)

 

863

IBM863

Canadian French (DOS)

 

864

IBM864

Arabic (864)

 

865

IBM865

Northern European (DOS)

 

866

cp866

Cyrillic (DOS)

 

869

ibm869

Modern Greek (DOS)

 

870

IBM870

IBM EBCDIC (Multi-Language Latin 2)

 

874

windows-874

Thai (Windows)

 

875

cp875

IBM EBCDIC (Modern Greek)

 

932

shift_jis

Japanese (Shift-JIS)

 

936

gb2312

Simplified Chinese (GB2312)

*

949

ks_c_5601-1987

Korean

 

950

big5

Traditional Chinese (Big5)

 

1026

IBM1026

IBM EBCDIC (TurkishLatin 5)

 

1047

IBM01047

IBM Latin 1

 

1140

IBM01140

IBM EBCDIC (US - Canada - Europe)

 

1141

IBM01141

IBM EBCDIC (German - Europe)

 

1142

IBM01142

IBM EBCDIC (Denmark - Norway - Europe)

 

1143

IBM01143

IBM EBCDIC (Finland - Sweden - Europe)

 

1144

IBM01144

IBM EBCDIC (Italy - Europe)

 

1145

IBM01145

IBM EBCDIC (Spain- Europe)

 

1146

IBM01146

IBM EBCDIC (U.K. - Europe)

 

1147

IBM01147

IBM EBCDIC (France - Europe)

 

1148

IBM01148

IBM EBCDIC (International - Europe)

 

1149

IBM01149

IBM EBCDIC (Iceland - Europe)

 

1200

utf-16

Unicode

*

1201

UnicodeFFFE

Unicode (Big-Endian)

*

1250

windows-1250

Central Europe (Windows)

 

1251

windows-1251

Cyrillic (Windows)

 

1252

Windows-1252

Central Europe (Windows)

*

1253

windows-1253

Greek (Windows)

 

1254

windows-1254

Turkish (Windows)

 

1255

windows-1255

Hebrew (Windows)

 

1256

windows-1256

Arabic (Windows)

 

1257

windows-1257

Baltic (Windows)

 

1258

windows-1258

Vietnamese (Windows)

 

1361

Johab

Korean (Johab)

 

10000

macintosh

Central Europe (Mac)

 

10001

x-mac-japanese

Japanese (Mac)

 

10002

x-mac-chinesetrad

Traditional Chinese (Mac)

 

10003

x-mac-korean

Korean (Mac)

*

10004

x-mac-arabic

Arabic (Mac)

 

10005

x-mac-hebrew

Hebrew (Mac)

 

10006

x-mac-greek

Greek (Mac)

 

10007

x-mac-cyrillic

Cyrillic (Mac)

 

10008

x-mac-chinesesimp

Simplified Chinese (Mac)

*

10010

x-mac-romanian

Romanian (Mac)

 

10017

x-mac-ukrainian

Ukrainian (Mac)

 

10021

x-mac-thai

Thai (Mac)

 

10029

x-mac-ce

Central Europe (Mac)

 

10079

x-mac-icelandic

Iceland (Mac)

 

10081

x-mac-turkish

Turkish (Mac)

 

10082

x-mac-croatian

Croatian (Mac)

 

20000

x-Chinese-CNS

Traditional Chinese (CNS)

 

20001

x-cp20001

TCA Taiwan

 

20002

x-Chinese-Eten

Traditional Chinese (Eten)

 

20003

x-cp20003

IBM5550 Taiwan

 

20004

x-cp20004

TeleText Taiwan

 

20005

x-cp20005

Wang Taiwan

 

20105

x-IA5

Central Europe (IA5)

 

20106

x-IA5-German

Germany (IA5)

 

20107

x-IA5-Swedish

Swedish (IA5)

 

20108

x-IA5-Norwegian

Norwegian (IA5)

 

20127

us-ascii

US-ASCII

*

20261

x-cp20261

T.61

 

20269

x-cp20269

ISO-6937

 

20273

IBM273

IBM EBCDIC (Germany)

 

20277

IBM277

IBM EBCDIC (Denmark - Norwegian)

 

20278

IBM278

IBM EBCDIC (Finland- Swedish)

 

20280

IBM280

IBM EBCDIC (Italy)

 

20284

IBM284

IBM EBCDIC (Spanish)

 

20285

IBM285

IBM EBCDIC (U.K.)

 

20290

IBM290

IBM EBCDIC (Japanese Katakana)

 

20297

IBM297

IBM EBCDIC (France)

 

20420

IBM420

IBM EBCDIC (Arabic)

 

20423

IBM423

IBM EBCDIC (Greek)

 

20424

IBM424

IBM EBCDIC (Hebrew)

 

20833

x-EBCDIC-KoreanExtended

IBM EBCDIC (Korean Extension)

 

20838

IBM-Thai

IBM EBCDIC (Thai)

 

20866

koi8-r

Cyrillic (KOI8-R)

 

20871

IBM871

IBM EBCDIC (Iceland)

 

20880

IBM880

IBM EBCDIC (Cyrillic Russian)

 

20905

IBM905

IBM EBCDIC (Turkish)

 

20924

IBM00924

IBM Latin 1

 

20932

EUC-JP

Japanese (JIS 0208-1990 and 0212-1990)

 

20936

x-cp20936

Simplified Chinese (GB2312-80)

*

20949

x-cp20949

Korean Wansung

*

21025

cp1025

IBM EBCDIC (Cyrillic Serbian - Bulgarian)

 

21866

koi8-u

Cyrillic (KOI8-U)

 

28591

iso-8859-1

Central Europe (ISO)

*

28592

iso-8859-2

Central Europe (ISO)

 

28593

iso-8859-3

Latin 3 (ISO)

 

28594

iso-8859-4

Baltic (ISO)

 

28595

iso-8859-5

Cyrillic (ISO)

 

28596

iso-8859-6

Arabic (ISO)

 

28597

iso-8859-7

Greek (ISO)

 

28598

iso-8859-8

Hebrew (ISO-Visual)

*

28599

iso-8859-9

Turkish (ISO)

 

28603

iso-8859-13

Estonian (ISO)

 

28605

iso-8859-15

Latin 9 (ISO)

 

29001

x-Europa

Europa

 

38598

iso-8859-8-i

Hebrew (ISO-Logical)

*

50220

iso-2022-jp

Japanese (JIS)

*

50221

csISO2022JP

Japanese (JIS- 1 byte Kana)

*

50222

iso-2022-jp

Japanese (JIS- 1 byte Kana - SO/SI)

*

50225

iso-2022-kr

Korean (ISO)

*

50227

x-cp50227

Simplified Chinese (ISO-2022)

*

51932

euc-jp

Japanese (EUC)

*

51936

EUC-CN

Simplified Chinese (EUC)

*

51949

euc-kr

Korean (EUC)

*

52936

hz-gb-2312

Simplified Chinese (HZ)

*

54936

GB18030

Simplified Chinese (GB18030)

*

57002

x-iscii-de

ISCII Sanskrit

*

57003

x-iscii-be

ISCII Bengalese

*

57004

x-iscii-ta

ISCII Tamil

*

57005

x-iscii-te

ISCII Telugu

*

57006

x-iscii-as

ISCII Assamese

*

57007

x-iscii-or

ISCII Oriya

*

57008

x-iscii-ka

ISCII Kannada

*

57009

x-iscii-ma

ISCII Malayalam

*

57010

x-iscii-gu

ISCII Gujarat

*

57011

x-iscii-pa

ISCII Punjab

*

65000

utf-7

Unicode (UTF-7)

*

65001

utf-8

Unicode (UTF-8)

*

65005

utf-32

Unicode (UTF-32)

*

65006

utf-32BE

Unicode (UTF-32 Big-Endian)

*


Encoding

Encoding specifies the encoding format for data transferred between the clients and the server. The transfer speed varies when the encoding format changes.

Figure 178. Encoding

Encoding

If the language of the application is pure English, select UTF-8; otherwise, select UTF-16LE.

Figure 179. Encoding settings

Encoding settings