Internationalizing an Application

About this chapter

This chapter describes some of the issues that arise when you develop and deploy applications for multiple languages.

Developing international applications

When you develop an application for deployment in multiple languages, you can take advantage of the Unicode support built into PowerBuilder. You also need to focus on two phases of the development process:

  • The first is the internationalization phase, when you deal with design issues before you begin coding the application.

  • The second is the localization phase, which you enter once the development phase of an internationalized application is complete, when you deal with the translation and deployment of your application.

Using Unicode

Unicode is a character encoding scheme that enables text display for most of the world's languages. Support for Unicode characters is built into PowerBuilder. This means that you can display characters from multiple languages on the same page of your application, create a flexible user interface suitable for deployment to different countries, and process data in multiple languages.

About Unicode

Before Unicode was developed, there were many different encoding systems, many of which conflicted with each other. For example, the same number could represent different characters in different encoding systems. Unicode provides a unique number for each character in all supported written languages. For languages that can be written in several scripts, Unicode provides a unique number for each character in each supported script.

For more information about the supported languages and scripts, see the Unicode website at http://www.unicode.org/cldr/charts/latest/supplemental/scripts_and_languages.html.
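The conflict between older encoding systems is easy to reproduce outside PowerBuilder. The following sketch is written in Python purely for illustration (it is not PowerScript). It decodes one byte value under three legacy Windows code pages and then shows the distinct Unicode number assigned to each resulting character. The helper name decode_in is hypothetical.

```python
# One byte value, three legacy code pages, three different characters.
raw = bytes([0xE9])

def decode_in(code_page: str) -> str:
    """Interpret the single byte 0xE9 under the given legacy code page."""
    return raw.decode(code_page)

western = decode_in("cp1252")   # Western European code page
greek = decode_in("cp1253")     # Greek code page
cyrillic = decode_in("cp1251")  # Cyrillic code page

# Unicode gives each of those characters its own number (code point).
code_points = [f"U+{ord(ch):04X}" for ch in (western, greek, cyrillic)]
print(western, greek, cyrillic, code_points)
```

The same byte yields a different character under each code page, while the Unicode code points of those characters never collide.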

Encoding forms

There are three Unicode encoding forms: UTF-8, UTF-16, and UTF-32. UTF originally stood for Unicode Transformation Format. The acronym now appears in the names of the encoding forms, which map from a character set definition to the actual code units that represent the data, and in the names of the encoding schemes, which are encoding forms with a specific byte serialization.

  • UTF-8 uses an unsigned byte sequence of one to four bytes to represent each Unicode character.

  • UTF-16 uses one or two unsigned 16-bit code units, depending on the range of the scalar value of the character, to represent each Unicode character.

  • UTF-32 uses a single unsigned 32-bit code unit to represent each Unicode character.
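To make the three encoding forms concrete, the following Python sketch (an illustration outside PowerBuilder, not PowerScript) counts how many code units each form needs for three sample characters. The function name code_units and the choice of sample characters are assumptions made for this example.

```python
def code_units(text: str) -> dict:
    """Count the code units each Unicode encoding form needs for text."""
    return {
        "UTF-8": len(text.encode("utf-8")),            # 8-bit code units
        "UTF-16": len(text.encode("utf-16-le")) // 2,  # 16-bit code units
        "UTF-32": len(text.encode("utf-32-le")) // 4,  # 32-bit code units
    }

latin_a = code_units("A")         # U+0041
euro = code_units("\u20ac")       # U+20AC, the euro sign
emoji = code_units("\U0001F600")  # U+1F600, above U+FFFF

for name, counts in (("A", latin_a), ("euro", euro), ("emoji", emoji)):
    print(name, counts)
```

Only the character above U+FFFF needs two UTF-16 code units (a surrogate pair). UTF-32 always uses exactly one code unit, and UTF-8 ranges from one to four.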

Encoding schemes

An encoding scheme specifies how the bytes in an encoding form are serialized. When you manipulate files, convert blobs and strings, and save DataWindow data in PowerBuilder, you can choose to use ANSI encoding, or one of three Unicode encoding schemes:

  • UTF-8 serializes a UTF-8 code unit sequence in exactly the same order as the code unit sequence itself.

  • UTF-16BE serializes a UTF-16 code unit sequence as a byte sequence in big-endian format.

  • UTF-16LE serializes a UTF-16 code unit sequence as a byte sequence in little-endian format.

UTF-8 is frequently used in Web requests and responses. The big-endian format, where the most significant value in the byte sequence is stored at the lowest storage address, is typically used on UNIX systems. The little-endian format, where the least significant value in the sequence is stored first, is used on Windows.
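The difference between the three encoding schemes shows up in the serialized bytes. The following Python sketch (for illustration only, not PowerScript) serializes the same two characters under each scheme and prints the bytes in hexadecimal. The helper name hex_bytes is hypothetical.

```python
def hex_bytes(data: bytes) -> str:
    """Format a byte sequence as space-separated hexadecimal pairs."""
    return " ".join(f"{b:02x}" for b in data)

text = "A\u20ac"  # U+0041 followed by U+20AC (euro sign)

big_endian = hex_bytes(text.encode("utf-16-be"))     # most significant byte first
little_endian = hex_bytes(text.encode("utf-16-le"))  # least significant byte first
utf8 = hex_bytes(text.encode("utf-8"))               # same order as the code units

print(big_endian)
print(little_endian)
print(utf8)
```

The two UTF-16 schemes hold the same code units with the bytes of each unit swapped. This is why a file written in one byte order must be read with the matching encoding argument.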

Unicode support in PowerBuilder

PowerBuilder uses UTF-16LE encoding internally. The source code in PBLs is encoded in UTF-16LE, any text entered in an application is automatically converted to Unicode, and the string and character PowerScript datatypes hold Unicode data only. Any ANSI or DBCS characters assigned to these datatypes are converted internally to Unicode encoding.

Support for Unicode databases

Most PowerBuilder database interfaces support both ANSI and Unicode databases.

A Unicode database is a database whose character set is set to a Unicode format, such as UTF-8 or UTF-16. All data in the database is in Unicode format, and any data saved to the database must be converted to Unicode data implicitly or explicitly.

A database that uses ANSI (or DBCS) as its character set can use special datatypes to store Unicode data. These datatypes are NChar, NVarChar, and NVarChar2. Columns with one of these datatypes can store Unicode data, but data saved to such a column must be converted to Unicode explicitly.

For more specific information about each interface, see Connecting to Your Database.

String functions

PowerBuilder string functions, such as Fill, Len, Mid, and Pos, take characters instead of bytes as parameters or return values and return the same results in all environments. These functions have a "wide" version (such as FillW) that is obsolete and will be removed in a future version of PowerBuilder because it produces the same results as the standard version of the function. Some of these functions also have an ANSI version (such as FillA). This version is provided for backwards compatibility for users in DBCS environments who used the standard version of the string function in previous versions of PowerBuilder to return bytes instead of characters.
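The distinction between counting characters and counting bytes matters most in DBCS environments. The following Python sketch (an illustration, not PowerScript) compares the character count of a mixed string with its byte count in a double-byte code page and in UTF-16LE. The Shift-JIS code page and the sample text are assumptions chosen for the example. All of its characters fall below U+FFFF, so each occupies one UTF-16 code unit.

```python
text = "PB\u65e5\u672c\u8a9e"  # "PB" followed by three Japanese characters

char_count = len(text)                       # number of characters
dbcs_bytes = len(text.encode("shift_jis"))   # bytes in a DBCS code page
utf16_bytes = len(text.encode("utf-16-le"))  # bytes in UTF-16LE

print(char_count, dbcs_bytes, utf16_bytes)
```

A character-based function reports the first number in every environment. A byte-based ANSI variant in a DBCS environment would report something closer to the second, because each Japanese character occupies two bytes there.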

You can use the GetEnvironment function to determine the character set used in the environment:

environment env
getenvironment(env)

choose case env.charset
case charsetdbcs!
   // DBCS processing
   ...
case charsetunicode!
   // Unicode processing
   ...
case charsetansi!
   // ANSI processing
   ...
case else
   // Other processing
   ...
end choose

Encoding enumeration

Several functions, including Blob, BlobEdit, FileEncoding, FileOpen, SaveAs, and String, have an optional encoding parameter. These functions let you work with blobs and files with ANSI, UTF-8, UTF-16LE, and UTF-16BE encoding. If you do not specify this parameter, the default encoding used for SaveAs and FileOpen is ANSI. For other functions, the default is UTF-16LE.

The following examples illustrate how to open different kinds of files using FileOpen:

// Read an ANSI File
Integer li_FileNum
String s_rec
li_FileNum = FileOpen("Employee.txt")
// or:
// li_FileNum = FileOpen("Employee.txt", &
//    LineMode!, Read!)
FileRead(li_FileNum, s_rec)

// Read a Unicode File
Integer li_FileNum
String s_rec
li_FileNum = FileOpen("EmployeeU.txt", LineMode!, &
   Read!, EncodingUTF16LE!)
FileRead(li_FileNum, s_rec)

// Read a Binary File
Integer li_FileNum
blob bal_rec
li_FileNum = FileOpen("Employee.imp", StreamMode!, &
   Read!)
FileRead(li_FileNum, bal_rec)
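Choosing the right encoding argument requires knowing how a file is encoded. One common technique is to inspect the byte order mark (BOM) at the start of the file. The following Python sketch illustrates that technique only. It is an assumption made for this example, not a description of how FileEncoding or FileOpen work internally, and the function name guess_encoding is hypothetical.

```python
import codecs

# Byte order marks checked in this sketch, with labels that mirror
# the encodings named in this chapter.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

def guess_encoding(head: bytes) -> str:
    """Guess an encoding from a file's first bytes; no BOM is treated as ANSI."""
    for bom, label in _BOMS:
        if head.startswith(bom):
            return label
    return "ANSI"

print(guess_encoding(b"\xff\xfeA\x00"))  # starts with the UTF-16LE BOM
print(guess_encoding(b"Employee"))       # no BOM present
```

A file without a BOM might still contain UTF-8 text, so a result of ANSI from a check like this is only a default, not a guarantee.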

Initialization files

The SetProfileString function can write to initialization files with ANSI or UTF-16LE encoding on Windows systems, and ANSI or UTF-16BE encoding on UNIX systems. The ProfileInt and ProfileString PowerScript functions and DataWindow expression functions can read files with these encoding schemes.

Exporting and importing source

The Export Library Entry dialog box lets you select the type of encoding for an exported file. The choices are ANSI/DBCS, which lets you import the file into PowerBuilder 9 or earlier, HEXASCII, UTF8, or Unicode LE.

The HEXASCII export format is used for source-controlled files. Unicode strings are represented by hexadecimal/ASCII strings in the exported file, which has the letters HA at the beginning of the header to identify it as a file that might contain such strings. You cannot import HEXASCII files into PowerBuilder 9 or earlier.

If you import an exported file from PowerBuilder 9 or earlier, the source code in the file is converted to Unicode before the object is added to the PBL.

External functions

When you call an external function that returns an ANSI string or has an ANSI string argument, you must use an ALIAS clause in the external function declaration and add ;ansi to the function name. For example:

FUNCTION int MessageBox(int handle, string content, string title, int showtype)
LIBRARY "user32.dll" ALIAS FOR "MessageBoxA;ansi"

The following declaration is for the "wide" version of the function, which uses Unicode strings:

FUNCTION int MessageBox(int handle, string content, string title, int showtype)
LIBRARY "user32.dll" ALIAS FOR "MessageBoxW"

If you are upgrading an application from PowerBuilder 9 or earlier, PowerBuilder replaces function declarations that use ANSI strings with the correct syntax automatically.

Setting fonts for multiple language support

The default font in the System Options and Design Options dialog boxes is Tahoma.

Setting the font in the System Options dialog box to Tahoma ensures that multiple languages display correctly in the Layout and Properties views in the Window, User Object, and Menu painters and in the wizards.

If the font on the Editor Font page in the Design Options dialog box is not set to Tahoma, multiple languages cannot be displayed in Script views, the File and Source editors, the ISQL view in the Database painter, and the Debug window.

You can select a different font for printing on the Printer Font tab page of the Design Options dialog box for Script views, the File and Source editors, and the ISQL view in the Database painter. If the printer font is set to Tahoma and the Tahoma font is not installed on the printer, PowerBuilder downloads the entire font set to the printer when it encounters a multilanguage character. If you need to print multilanguage characters, specify a printer font that is installed on your printer.

To support multiple languages in DataWindow objects, set the font in every column and text control to Tahoma.

The default font for print functions is the system font. Use the PrintDefineFont and PrintSetFont functions to specify a font that is available on users' printers and supports multiple languages.

PBNI

The PowerBuilder Native Interface is Unicode based. PBNI extensions must be compiled using the _UNICODE preprocessor directive in your C++ development environment.

Your extension's code must use TCHAR, LPTSTR, or LPCTSTR instead of char, char*, and const char* to ensure that it works correctly in a Unicode environment. Alternatively, you can use the MultiByteToWideChar function to map character strings to Unicode strings. For more information about enabling Unicode in your application, see the documentation for your C++ development environment.

Unicode enabling for Web services

In a PowerScript target, the PBNI extension classes instantiated by Web service client applications use Unicode for all internal processing. However, calls to component methods are converted to ANSI for processing by EasySoap (obsolete), and data returned from these calls is converted to Unicode.

XML string encoding

The XML parser cannot parse a string that uses an eight-bit character code such as windows-1253. For example, a string with the following declaration cannot be parsed:

string ls_xml
ls_xml += '<?xml version="1.0" encoding="windows-1253"?>'

You must use a Unicode encoding value such as UTF-16LE.

Internationalizing the user interface

When you build an application for international deployment, there are two user interface design issues you should consider:

  • The physical design of the user interface

  • The cultural standards of your application's audience

Physical design

The physical design of the user interface should include:

  • Windows and objects with the flexibility to accommodate expanded string lengths required when the text in menu items, lists, and labels is translated

    For example, you could inherit a window from an English language ancestor window, and change the language for a localized deployment. Generally, you can accommodate the text for most languages if you allow for a menu item, list, or label size that is 1.3 times the length of an English text string.

  • Windows that can be easily used in RightToLeft versions of Windows
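The 1.3 sizing guideline mentioned above can be turned into a quick check during design. The following Python sketch (an illustration outside PowerBuilder) computes how many characters to reserve for a label from the length of its English text, rounding up. The function name reserved_length and the sample labels are hypothetical.

```python
def reserved_length(english_text: str) -> int:
    """Return 1.3 times the English length, rounded up to a whole character."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (13 * len(english_text) + 9) // 10

for label in ("Save", "Cancel", "Print Preview"):
    print(label, len(label), reserved_length(label))
```

Character counts are only a rough proxy for on-screen width, which also depends on the font and the script, so treat the result as a starting point for layout testing.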

Cultural awareness

The cultural design of your user interface requires you to be cognizant of what is and is not acceptable or meaningful to your audience.

For example, an icon of a hand displaying an open palm might mean stop in one culture but indicate an unacceptable gesture in another. Similarly, although the color yellow signifies caution in some cultures, in other cultures it signifies happiness and prosperity.

Localizing the product

PowerBuilder provides resources for international developers that include localized runtime files and the Translation Toolkit. The localized files become available after the general release of a new version of PowerBuilder.

Localized runtime files

Localized runtime files are provided for French, German, Italian, Spanish, Dutch, Danish, Norwegian, and Swedish. You can install localized runtime files in the development environment or on the user's machine. If you install them on the development machine, you can use them for testing purposes.

The localized PowerBuilder runtime files handle language-specific data at runtime. They are required to display standard dialog boxes and user interface elements, such as day and month names in spin controls, in the local language. They also provide the following features:

  • DayName function manipulation

    The DayName function returns a name in the language of the runtime files available on the machine where the application is run.

  • DateTime manipulation

    When you use the String function to format a date and the month is displayed as text (for example, the display format includes "mmm"), the month is in the language of the runtime files available when the application is run.

  • Error messages

    PowerBuilder error messages are translated into the language of the runtime files.

Localized PFC libraries

The PFC is now available on the PowerBuilder Code Samples website at https://www.appeon.com/developers/library/code-samples-for-pb.

In order to convert an English language PFC-based application to another language such as Spanish, you need multiple components. You need to test the application on a computer running the localized version of the operating system with appropriate regional settings. You must also obtain or build localized PFC libraries and install the localized PowerBuilder runtime files. When you deploy the application, you must deploy it to a computer running a localized version of the operating system, and you must deploy the localized runtime files.

You can translate the PFC libraries with the Translation Toolkit. Localized PFC libraries are the same as the original PFC libraries except that strings that occur in windows, menus, DataWindow objects, dialog boxes, and other user interface elements, and in runtime error messages, are translated into the local language. These include, for example, day and month names in the Calendar service. All services remain otherwise the same. In a Spanish PFC application, error messages displayed by the PFC are in Spanish, month names in the Calendar service are in Spanish, column headers in DataWindow objects and Menu items are in Spanish, and so on.

The Translation Toolkit adds a string in the format %LANGUAGE% to the comment associated with every object that contains a translated string. For example, if you look at a PFC library that has been translated into Spanish in the List view in the Library painter, you will notice the string %SPANISH% at the beginning of the comment for many objects.

The dictionaries used to translate the PFC libraries into each language are provided with the Translation Toolkit. You can use the dictionaries to translate the rest of your application into a local language using the Translation Toolkit, and you can view the dictionary in a text editor to see which strings have been translated.

The localized PFC libraries work in coordination with the localized runtime files, regional settings, and the localized operating system.

Regional settings

PowerBuilder always uses the system's regional settings, set in the Windows Control Panel, to determine formats for the Date and Year functions, as well as date formats to be used by the SaveAs function. The use of these regional settings is independent of the use of PowerBuilder localized runtime files or PFC libraries.

The regional settings are also used to determine behavior when using Format and Edit masks. For more information, see the section called “Defining display formats” in the Users Guide.

Localized operating system

The localized operating system is required for references to System objects, such as icons and buttons, that are referenced using enumerated types in PowerBuilder, such as OKCancel!, YesNo!, Information!, and Error!. These enumerated types rely on API calls to the local operating system, which passes back the appropriate button, icon or symbol for the local language. For example, if you use the OKCancel! argument in a MessageBox function, the buttons that display on the message box are labeled OK and Cancel if the application is not running on a localized operating system.

About the Translation Toolkit

The Translation Toolkit is a set of tools designed to help you translate PowerBuilder applications into other languages. It includes a standalone translator tool that is used by the person or group translating the text of the application.

When you use the Toolkit to create a project, a copy of each of your application's source libraries is created for each project. The application's original source libraries are not changed.

How the Toolkit works

You work with the phrases (one or more words of text) in an application. These phrases are in the application's object properties, controls, and scripts.

You use the tools to:

  • Extract phrases from the project libraries

  • Present the phrases for translation

  • Substitute translated phrases for the original phrases in the project libraries

Using the translated project libraries, you use PowerBuilder to build the translated application.

For more information, see the online Help for the Translation Toolkit.