public final class Internationalization
extends Object
Internationalization is often abbreviated "i18n", and localization "l10n". The objective is that messages be shown to the user in his or her native language, not in English (or whatever language the developer was working in). Note that internationalization does not actually mean that you, as an application developer, need to translate your software to all the locales where it is planned to be used. This is done later, by translators. Application developers just need to be aware of this, and prepare their software to be localized later. This preparation is what we call internationalization.
First of all, we need to allow for message translation. This simply means not outputting hard-coded English Strings directly and instead calling a function that will redirect to the message translation. Similarly, other aspects of internationalization, such as the way dates, numbers and currency are formatted, are locale-dependent. The functions in this class will enable you to prepare your application for translation.
Let's take the following Java code as a typical example:
greeting = new Label("Good morning");
Fortunately, you do not need to know how this message is written in all languages of the world! You let the translators take care of that.
One approach would have been to give your sources to each translation team and have them hunt through the code for each and every String. This would be quite a nightmare, of course, and worse, would require every translator to be a programmer on par with yourself. Worst of all, it would require a copy of the application specific to each locale to be distributed, which of course would be ridiculous.
The people doing translations tend not to be hard-core developers but they are making a very valuable contribution to your software by offering to translate it. So we undertake internationalization in a way that enables them to do the localization without needing to be Java programmers.
Instead of the hard coded String shown above, we wrap it in the translation function as follows:
greeting = new Label(_("Good morning"));
The Internationalization._() function takes care of looking up your String in the translation database and will return the same message localized to the user's native language. Obviously using the static import:
import static org.freedesktop.bindings.Internationalization._;
will make things clean and elegant.
In java-gnome, internationalizing your apps is as easy as the code above. You should use the _() function with any message you want to show to the user. Messages intended for developers only (such as debug messages going to the log) do not need to be localized.
Matters become somewhat more complicated when you need to concatenate various parameters into a composite String. Consider the common use of the + operator:
System.out.println("The file " + filename + " was modified on " + date);
This code is not internationalized at all. Not only do we need to allow the message to be translated, but we must also allow for the date to be formatted appropriately. In java-gnome, you do this as follows:
System.out.println(_("The file {0} was modified on {1,date,long}", filename, date));
The added complexity comes from the need to cater for the "positional parameters" (here filename and date), which may have a different order when rendered in someone else's language. Again, you don't need to know the specifics for every possible target locale; you just need to supply the information in a form that can be localized.
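The reordering of positional parameters can be sketched with java.text.MessageFormat alone. This is only an illustration: the class name, the sender parameter and the reordered pattern are hypothetical, and no translation lookup takes place.

```java
import java.text.MessageFormat;
import java.util.Locale;

class PositionalParameters {
    // Fill the {n} placeholders of a pattern; Locale.ROOT keeps the output stable.
    static String render(String pattern, Object... parameters) {
        return new MessageFormat(pattern, Locale.ROOT).format(parameters);
    }

    public static void main(String[] args) {
        String original = "The file {0} was sent by {1}";
        // Hypothetical pattern for a language that mentions the sender first.
        String reordered = "{1} sent the file {0}";

        // The arguments are passed in the same order in both calls.
        System.out.println(render(original, "data.log", "Ana"));
        System.out.println(render(reordered, "data.log", "Ana"));
    }
}
```

The calling code never changes; only the pattern decides where each parameter ends up, which is what lets a translator reorder them.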
To format these parameters, you use the MessageFormat syntax. Put briefly, you will:
put a placeholder {n} in your message, where n is the 0-based index of the parameter as you submitted it to the _() function;
optionally add a format type ("number", "date", "time") and, after it, a format style qualifier (in the above example, "long", indicating a longer form). See the MessageFormat documentation for further details.
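As a small illustration of a format type and style, the sketch below formats the same pattern for two locales using java.text.MessageFormat directly. The class name, the pattern and the sample values are assumptions made for the example; the locale is passed explicitly here rather than taken from the environment.

```java
import java.text.MessageFormat;
import java.util.Locale;

class FormatTypes {
    // Format a pattern for an explicitly chosen locale.
    static String render(Locale locale, String pattern, Object... parameters) {
        return new MessageFormat(pattern, locale).format(parameters);
    }

    public static void main(String[] args) {
        // {0} is a plain placeholder; {1,number,integer} adds a type and a style.
        String pattern = "{0} contains {1,number,integer} lines";

        System.out.println(render(Locale.US, pattern, "data.log", 1234567));
        System.out.println(render(Locale.GERMANY, pattern, "data.log", 1234567));
    }
}
```

The number is grouped according to each locale's conventions without any change to the calling code.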
Of course, computers still can't work magic, so you need a translation process to actually localize your messages. However, with this approach this is done outside the code, in files named "message catalogues" where the translated messages are stored. In fact, the _() function will look up the given message in those catalogues, to show the translated version to the user. As a developer, you don't need to create them; that is the translator's task. However, knowledge of that process is useful, so we outline it below.
The locale in use is selected through the LANG and LC_MESSAGES environment variables.
Examples of locales include:
en_CA.UTF-8
en_GB.UTF-8
en
es_ES.UTF-8
es
fr_CA.UTF-8
fr_FR.UTF-8
fr
Note the suffixes such as .UTF-8, which are how a locale indicates support for specific character sets.
There is also one other locale you will see:
C
Note that nothing requires you to use English for the untranslated messages in your source code. English is, however, the lingua franca of our age, and more to the point is the language which most translation teams understand and translate from. If you are doing your own translations, then go right ahead and program in whatever language you want. On the other hand, if you wish to leverage the GNOME Translation Project's expertise, we recommend that your untranslated Strings be in basic English.
Indeed, using uncomplicated English Strings will mean that you will be less likely to "break" Strings (thereby causing the translation teams' localizations to no longer work). Even if you are a native English speaker, we recommend that you localize your own work into (say) en_AU or en_CA, as this will cause you to be aware of translation issues and will minimize String breaks.
As you can imagine, the gettext and glibc libraries fall back in a predictable order. They try to find a translation that is appropriate for your locale, starting with the fully qualified LANG variable and then steadily degrading. For example, if your LANG=fr_CA.UTF-8, the search starts with that full name and moves on to progressively less specific locales.
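The degradation can be sketched as follows. This is an illustration under an assumed order (drop the character set first, then the country); the real search is carried out inside glibc and gettext and may try further variants, and the class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LocaleFallback {
    // Candidate locale names, most specific first. The order is an assumption
    // made for illustration; modifiers such as "@euro" are not handled.
    static List<String> candidates(String lang) {
        String codeset = "";
        String base = lang;
        int dot = lang.indexOf('.');
        if (dot >= 0) {
            codeset = lang.substring(dot);  // e.g. ".UTF-8"
            base = lang.substring(0, dot);  // e.g. "fr_CA"
        }
        int underscore = base.indexOf('_');
        String language = underscore >= 0 ? base.substring(0, underscore) : base;

        List<String> result = new ArrayList<>();
        result.add(lang);
        if (!codeset.isEmpty()) {
            result.add(base);                // without the character set
        }
        if (underscore >= 0) {
            if (!codeset.isEmpty()) {
                result.add(language + codeset);  // without the country
            }
            result.add(language);            // language only
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(candidates("fr_CA.UTF-8"));
    }
}
```

If none of the candidates has a catalogue containing the message, the untranslated String from the source code is what the user sees.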
java-gnome uses the GNU gettext suite, which is the same translation infrastructure that GNOME and most other Linux applications use. The process used by gettext to generate the message catalogues is as follows:
First of all, the messages used in your code need to be extracted. This is done by the xgettext command. It is able to distinguish between translatable messages and other Strings because the former are marked with the calls to _(). So, the following call:
$ xgettext -o myapp.pot --omit-header --keyword=_ --keyword=N_ path/to/TYPE.java
will extract the messages used in the TYPE.java class to a file called myapp.pot. A POT file is a template with the list of translatable messages, which translators use to know which messages they must translate. With the command:
$ msginit -i myapp.pot -o ${LANG}.po
they can generate a PO file for their particular language, which is where the messages will be translated. PO files contain the translated messages. There is one PO file per locale, named with the standard scheme language_COUNTRY. Examples are en_CA.po, es_ES.po, fr.po, pt.po, pt_BR.po, and so on. Typically, the PO files of a given project are stored in a directory named po/.
To be used by gettext, those files need to be "compiled" to a binary form, known as MO files. That is done with the msgfmt command:
$ msgfmt -o myapp.mo es.po
MO files are installed together with other application artifacts, usually under /usr/share/locale/${locale}/LC_MESSAGES/${packageName}.mo, where ${locale} is the locale and ${packageName} is a unique identifier for the program, usually your application name. This name is needed because when installed to the system, MO files are stored under a directory common to all installed apps. For example, localized messages for the pt_BR locale of the myapp application will be packaged as /usr/share/locale/pt_BR/LC_MESSAGES/myapp.mo.
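The installation path described above can be assembled mechanically. The helper below is a hypothetical illustration of the naming scheme only; it is not part of java-gnome and does not check that the file exists.

```java
class CataloguePath {
    // Build ${localeDir}/${locale}/LC_MESSAGES/${packageName}.mo as a String.
    static String moFile(String localeDir, String locale, String packageName) {
        // Tolerate a trailing slash on the directory.
        String dir = localeDir.endsWith("/")
                ? localeDir.substring(0, localeDir.length() - 1)
                : localeDir;
        return dir + "/" + locale + "/LC_MESSAGES/" + packageName + ".mo";
    }

    public static void main(String[] args) {
        System.out.println(moFile("/usr/share/locale", "pt_BR", "myapp"));
    }
}
```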
As you can imagine, you will need to tell gettext where those files are physically located. This is done with the init() method. This must be called before any usage of the java-gnome internationalization infrastructure.
public static void main(String[] args) {
    Gtk.init(args);
    Internationalization.init("myapp", "share/locale/");
    ...
}
In some cases, this might be a problem. If you have messages stored in a static array initializer, use the N_() function to mark these messages, then use _() later on the variable carrying the constant. See N_() for more details.
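The mark-now, translate-later pattern can be sketched with stand-ins. Everything below is hypothetical: the map plays the role of a message catalogue, N_() is re-declared locally, and the translating call is named translate() because a lone underscore is not accepted as an identifier by Java 9 and later compilers.

```java
import java.util.HashMap;
import java.util.Map;

class DeferredTranslation {
    // Stand-in catalogue; a real application would rely on gettext instead.
    static final Map<String, String> CATALOGUE = new HashMap<>();

    // Stand-in for N_(): returns its argument unchanged, it only marks it.
    static String N_(String msg) {
        return msg;
    }

    // Stand-in for the translating call; falls back to the original message.
    static String translate(String msg) {
        return CATALOGUE.getOrDefault(msg, msg);
    }

    // Only mark here; nothing is translated at class-initialization time.
    static final String[] WEEKDAYS = { N_("Monday"), N_("Tuesday") };

    public static void main(String[] args) {
        // Pretend that translations became available after start-up.
        CATALOGUE.put("Monday", "lunes");
        CATALOGUE.put("Tuesday", "martes");

        for (String day : WEEKDAYS) {
            System.out.println(translate(day));  // translate at the point of use
        }
    }
}
```

The array keeps the untranslated Strings, so the lookup happens only once the translation machinery is ready.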
Modifier and Type | Method and Description |
---|---|
static String | _(String msg, Object... parameters): Translate and format a given message according to user locale. |
static void | init(String packageName, String localeDir): Initialize internationalization support. |
static String | N_(String msg): Mark the given message as translatable, without actually translating it. |
static String | translateCountryName(String name): Translate a country name. |
static String | translateLanguageName(String name): Translate a language name. |
public static final String _(String msg, Object... parameters)
This attempts to translate a text string into the user's native language. You just need to call it with the message in the C locale (that is, untranslated), as follows:
String translated = _("Hello");
The java-gnome implementation of _() also supports message formatting and concatenation in a language-neutral way. For example, let's suppose we want to print the following message: "The file 'data.log' has been modified on March 21, 2008 at 5:27:22 PM". This message actually has two parameters, the filename and the date of modification. This data is locale-dependent, as dates are represented differently depending on language and country. We could get the internationalized message with:
String filename;
Date date;
_("The file ''{0}'' has been modified on {1,date,long} at {1,time}", filename, date);
As you can see, it is easy to construct a given message from several parameters, even when some parameters need locale-dependent formatting. Note that MessageFormat treats a single quote as a quoting character, so each literal apostrophe in the pattern is written as two.
The actual formatting is done by MessageFormat, so take a look at its documentation for all available format options. Translation is handled by gettext before the message is passed to MessageFormat for further handling of the positional parameters.
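The two-step order (look up the whole pattern first, substitute parameters second) can be sketched with a map standing in for gettext. This is not the java-gnome implementation: the class, the method name localize() and the sample translation are hypothetical, and the method is not named _ because Java 9 and later reject a lone underscore as an identifier.

```java
import java.text.MessageFormat;
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

class TranslateThenFormat {
    static String localize(Map<String, String> catalogue, String msg, Object... parameters) {
        // Step 1: look up the whole pattern; fall back to the untranslated one.
        String pattern = catalogue.getOrDefault(msg, msg);
        // Step 2: only now substitute the positional parameters.
        return new MessageFormat(pattern, Locale.ROOT).format(parameters);
    }

    public static void main(String[] args) {
        // A doubled apostrophe yields one literal apostrophe in the output.
        String msg = "The file ''{0}'' takes {1,number,integer} bytes";
        Map<String, String> spanish = Collections.singletonMap(
                msg, "El archivo ''{0}'' ocupa {1,number,integer} bytes");

        System.out.println(localize(spanish, msg, "data.log", 42));
        System.out.println(localize(Collections.<String, String>emptyMap(), msg, "data.log", 42));
    }
}
```

Because the lookup key is the pattern with its placeholders still in place, translators see and translate the complete sentence rather than fragments.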
msg - The message to print. This is the untranslated message, usually in English.
parameters - Parameters of the message
public static final void init(String packageName, String localeDir)
public static void main(String[] args) {
    Gtk.init(args);
    Internationalization.init("myapp", "/usr/share/locale/");
    ...
}
packageName - Application name
localeDir - Directory where to find the message catalogues (usually /usr/share/locale). The actual message catalogue is found at ${localeDir}/${locale}/LC_MESSAGES/${packageName}.mo. For example: /usr/share/locale/pt_BR/LC_MESSAGES/myapp.mo. It is not compulsory to use an absolute path for the localeDir parameter.
public static final String N_(String msg)
private static final String BUTTON_MESSAGE = N_("Press me!");
You still need to call _() later, to actually translate the message.
button.setLabel(_(BUTTON_MESSAGE));
Obviously the problem now is to come up with constant names that are unobtrusive. There are various naming schemes that can be employed; all are somewhat ugly. In general this leads to people not using Strings in static initializers as much as they might have been used to. Indeed, the whole point of abstracting out such Strings (so that they are in one place at the "top" of the file) is less relevant given that the gettext tools will be extracting all your messages anyway.
msg - The message to mark as translatable
Returns the msg argument, not translated. Remember, N_() is only used to mark a String as translatable so that xgettext can extract it.
public static final String translateCountryName(String name)
For example, with LANG="fr_CA",
lang = translateCountryName("United Kingdom");
will get you "Royaume-Uni".
To use this function you must already have done the lookup of the ISO 3166 country code to its name in English. Unfortunately there's no automated way to do this. The iso-codes package, however, contains an XML file at /usr/share/xml/iso-codes/iso_3166.xml with the necessary data.
This function uses dgettext() and the "iso_3166" translation domain. Translations for countries in this standard should already be available on your system.
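The manual lookup from country code to English name might be done along these lines. This is a sketch under assumptions: the element name iso_3166_entry and the attributes alpha_2_code and name are how the iso-codes XML file is believed to be laid out and should be checked against the file on your system; the class and method are hypothetical, and an inline sample stands in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class CountryNameLookup {
    // Return the English name for an alpha-2 code, or null if it is absent.
    static String englishName(String xml, String alpha2) {
        try {
            NodeList entries = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("iso_3166_entry");  // assumed element name
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                if (alpha2.equals(entry.getAttribute("alpha_2_code"))) {
                    return entry.getAttribute("name");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read country data", e);
        }
    }

    public static void main(String[] args) {
        // Inline sample standing in for /usr/share/xml/iso-codes/iso_3166.xml.
        String sample = "<iso_3166_entries>"
                + "<iso_3166_entry alpha_2_code=\"GB\" name=\"United Kingdom\"/>"
                + "<iso_3166_entry alpha_2_code=\"FR\" name=\"France\"/>"
                + "</iso_3166_entries>";
        System.out.println(englishName(sample, "GB"));
    }
}
```

The name returned by such a lookup is what would then be handed to translateCountryName().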
public static final String translateLanguageName(String name)
For example, if you have LANG="es_ES.UTF-8" and do:
lang = translateLanguageName("Japanese");
you will get "Japonés".
In order to use this function, you must already have done the lookup of the ISO 639 language code (ja in this case) to the name the standard gives it in English. Unfortunately there's no automated way to do this; however, the iso-codes package contains an XML file at /usr/share/xml/iso-codes/iso_639.xml with this data.
Beware that you really need to use the proper name; es (which is Spanish to mere mortals) is "Spanish; Castilian" in the XML data [and hence in the message catalogues].
This function uses dgettext() and the "iso_639" translation domain. Translations for languages in this standard should already be available on your system.