- See also Manual:MediaWiki architecture#Languages (should be merged here?)
In MediaWiki, there are several kinds of languages:
- The site content language (the ContentLanguage service, obtained via MediaWiki\MediaWikiServices::getContentLanguage and based on $wgLanguageCode), which should generally stay the same as long as the wiki exists.
- The user interface language ($contextSource->getLanguage(), formerly $wgLang), which can be changed in your preferences or with &uselang=xyz in the URL, but also generally remains the same while using the wiki.
- The page content language. This can be different for each page, even if the site and user languages are the same. It is defined by getPageLanguage() in Title and represents the language in which the page source (e.g., wikitext) is written.
- The page view language, also known as the user language variant, which is a variant of the page content language, as preferred by the user. It can also be set through &variant (or $wgVariantArticlePath) in the URL (e.g. by selecting one of the tabs), but only if it is a variant of the content language of the page being viewed. It is defined by getPageViewLanguage() in Title and represents the language in which the rendered HTML content is written.
All of them are language objects.
Language code
- Not to be confused with Wikimedia project code; see also Language codes on Meta.
A language code is a valid standard abbreviation for a language supported by MediaWiki.[1] MediaWiki uses such codes as standard identifiers for languages (mostly in accordance with ISO 639-3, except that two-letter codes from ISO 639-1 are used for "established" locales) and exposes or requires them at many points in the interface and code.[2]
For example, in MediaWiki:Message/ar, ar is the language code for Arabic.
Consistency with the Unicode standard is needed to provide good language support, in particular in cooperation with CLDR; having an ISO 639-3 code is one requirement for a language to be added to MediaWiki locales.
Names.php
Names.php is the master registry of languages supported by MediaWiki.
Note that this is neither the set of languages for which MediaWiki ships localisation (the JSON files) nor the set of languages whose names MediaWiki knows (via CLDR).
Fallback languages
Some languages in MediaWiki have what is known as a "fallback sequence": MediaWiki falls back on a different language if it cannot find what it needs in the requested one.
An example of this is the language code frc (Cajun French), which falls back on the language code fr (French).
The reason for this is that some languages do not have all messages defined.
The fallback for a language can be found in its associated languages/messages/MessagesXX.php file, for instance MessagesFrc.php. You can search the code for all uses. There is also a plain list from September 2020 in this Phabricator comment.
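The fallback lookup described above can be sketched in a few lines of JavaScript. This is only an illustration, not MediaWiki's implementation: the FALLBACKS and MESSAGES tables and the function names are hypothetical, and only the frc to fr relationship and the final fallback to English come from this page.

```javascript
// Hypothetical fallback table; only frc -> fr is taken from the text above.
const FALLBACKS = { frc: ['fr'] };

// Hypothetical message store, standing in for the per-language message files.
const MESSAGES = {
  en: { greeting: 'Hello', farewell: 'Goodbye' },
  fr: { greeting: 'Bonjour' },
  frc: {},
};

// Build the lookup order: the language itself, its fallbacks, then English.
function fallbackChain(code) {
  const chain = [code, ...(FALLBACKS[code] || [])];
  if (!chain.includes('en')) {
    chain.push('en');
  }
  return chain;
}

// Return the first translation found, together with the language it came from.
function resolveMessage(key, code) {
  for (const lang of fallbackChain(code)) {
    const text = MESSAGES[lang] && MESSAGES[lang][key];
    if (text !== undefined) {
      return { text, lang };
    }
  }
  return null;
}
```

In this sketch, resolveMessage( 'greeting', 'frc' ) finds the French text, while resolveMessage( 'farewell', 'frc' ) ends up with the English one. Returning the language alongside the text is a choice made for the sketch only.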
Site content language
Viewing/getting the site content language
JavaScript:
mw.config.get( 'wgContentLanguage' );
User interface language
MediaWiki version: ≥ 1.18
- Default value
-
- Set via
- Special:Preferences
- &uselang=zxx in the URL (see uselang)
- &variant=xy (or $wgVariantArticlePath) in the URL, if it is a variant of your user language
- Problems
- Since interface messages can come from fallback languages but the language is not returned, the actual language of each message is not known.
Page content language
MediaWiki version: ≥ 1.18
- Default value
- $wgLang (the user interface language) on special pages.
- English for CSS and JS pages.
- For MediaWiki namespace pages, the language depends on the subpage. For example, MediaWiki:Message/ar will be set to Arabic (ar), and MediaWiki:Message will be in the ContentLanguage.
- All other pages are in the ContentLanguage by default.
- Configuration
- Extensions can change the language of all other pages through the PageContentLanguage hook. The value for special pages, CSS, JS, and MediaWiki namespace pages cannot be overridden.
- Examples
- The Translate extension uses it for the page translation feature. See translatewiki:Project:About/ar as a translation of translatewiki:Project:About. The directionality of the page is thus correctly set to right-to-left for Arabic.
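The default rules listed above can be read as a small decision function. The following JavaScript sketch only restates those rules and is not the code MediaWiki runs. The page object shape ({ namespace, title }), the function name, and the detection of CSS/JS pages by file extension are assumptions made for the illustration; it also assumes that the subpage part of a MediaWiki namespace title is a valid language code.

```javascript
// Sketch of the default page content language rules.
// `page` is a hypothetical object: { namespace: string, title: string }.
function defaultPageLanguage(page, siteLanguage, userLanguage) {
  // Special pages follow the user interface language.
  if (page.namespace === 'Special') {
    return userLanguage;
  }
  // CSS and JS pages are treated as English (detected by extension here, as an assumption).
  if (/\.(css|js)$/.test(page.title)) {
    return 'en';
  }
  // MediaWiki namespace pages: the subpage part decides, e.g. Message/ar -> ar.
  if (page.namespace === 'MediaWiki') {
    const slash = page.title.lastIndexOf('/');
    return slash === -1 ? siteLanguage : page.title.slice(slash + 1);
  }
  // Everything else defaults to the site content language.
  return siteLanguage;
}
```

For a wiki whose content language is de and a user whose interface language is fr, the sketch yields ar for MediaWiki:Message/ar, de for MediaWiki:Message, and fr for any special page.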
MediaWiki version: ≥ 1.24
- Manually changing page language
- Since MediaWiki 1.24, the page language can be selected with Special:PageLanguage.
- Users can change the content language of a page, which by default is the wiki's default language (ContentLanguage). The language of pages in the MediaWiki namespace cannot be changed.
- The feature needs to be enabled with $wgPageLanguageUseDB = true, and the pagelang permission must be granted to a user rights group (whose members can then change page languages).
- Changing the page language causes the source translation page and its units to be moved to the correct target language. If the target language translation page already exists, the language change is not allowed.
- The matching API is documented at API:SetPageLanguage.
- What does it define?
- In SkinTemplate, it adds a <div lang="xyz" dir="ltr/rtl" class="mw-content-ltr/rtl"></div> around the page text. The dir attribute sets the correct writing direction. The lang attribute will always be the root code, e.g. "de" even when "de-formal" is given.
- For file pages, it is set in ImagePage.php, because there is a lot of HTML that is in the user language.
- In Parser.php, it sets the table of contents (TOC) numbering and things like grammar, although these are mostly not very relevant. To change only that, use parserOptions->setTargetLanguage().
- The direction of the diff text (DifferenceEngine) is set to the page content language. In some cases the language of the diff text differs from the page content language, in which case $diffEngineObject->setTextLanguage( $code ) can be used.
- Since 1.19, it also sets the time and number-formatting magic words, including DIRECTIONMARK, but not NAMESPACE(E), as that really depends on the site language. Note that a template marked as language A, when included on a page with language B, will be parsed with language B on that page.
- Multiple languages on a single page
- Multiple languages on a single page are in theory not supported, but simple <div lang="xyz" dir="ltr/rtl" class="mw-content-ltr/rtl"> tags can be used to mark text as being written in a different language. If the CSS class is used, ul/ol lists and edit section links will display nicely when the dir attribute is the opposite of the direction of the page content language. Things defined in the parser, such as the TOC and magic words, will not change, however.
- Viewing/getting the page language
- JavaScript: mw.config.get( 'wgPageContentLanguage' ). Note that when viewing e.g. the page history, it returns the language of the page whose history is shown, even though the history page itself does not have an mw-content-ltr/rtl class. That is, both "/wiki/Page" and "/w/index.php?title=Page&action=history" will return the language of "Page". (1.19+)
- The page content language is mentioned on the page info view (action=info, linked in the toolbox). (1.21+)
- The page content language can be retrieved in the API via api.php?action=query&prop=info. (1.22+)
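The attributes of the wrapper described under "What does it define?" can be illustrated with a short JavaScript sketch. It is not the SkinTemplate code: the function names and the small RTL_CODES set are hypothetical, and taking the part before the first hyphen as the "root code" is an assumption based on the "de-formal" example above.

```javascript
// Hypothetical, incomplete set of right-to-left root codes, for illustration only.
const RTL_CODES = new Set(['ar', 'he', 'fa', 'ur']);

// Assumption: the root code is the part before the first hyphen ("de-formal" -> "de").
function rootCode(code) {
  return code.split('-')[0];
}

// Compute the lang, dir and class attributes of the content wrapper.
function contentWrapperAttributes(pageLanguage) {
  const lang = rootCode(pageLanguage);
  const dir = RTL_CODES.has(lang) ? 'rtl' : 'ltr';
  return { lang, dir, class: `mw-content-${dir}` };
}
```

With these assumptions, a page in de-formal gets lang="de", dir="ltr" and class="mw-content-ltr", and a page in ar gets dir="rtl" and class="mw-content-rtl".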
Code structure
First, you have a Language object in Language.php.
This object contains all the localisable message strings, as well as other important language-specific settings and custom behavior (uppercasing, lowercasing, printing dates, formatting numbers, direction, custom grammar rules, etc.).
The object is constructed from two sources: subclassed versions of itself (classes) and message files (messages).
There's also the MessageCache class, which handles input of text via the MediaWiki namespace.
Most internationalisation is nowadays done via Message objects (Manual:Messages API) and by using the wfMessage() shortcut function, which is defined in includes/GlobalFunctions.php.
Legacy code might still be using the old wfMsg*() functions, which are now considered deprecated in favor of the above-mentioned Message objects.
See also Manual:Messages API.
Language objects
There are two ways to get a language object.
You can use the global $wgLang and the ContentLanguage service (MediaWiki\MediaWikiServices::getContentLanguage) for the user interface and content language respectively.
For an arbitrary language, you can construct an object by using $languageFactory->getLanguage( 'en' ), replacing en with the code of the language.
You can get $languageFactory, an object of the MediaWiki\Languages\LanguageFactory class, using Dependency Injection.
You can also use wfGetLangObj( $code ); if $code could already be a language object.
The list of codes is in includes/languages/data/Names.php.
Language objects are needed for language-specific functions, most often number, time and date formatting, but also for constructing lists and other things.
There are multiple layers of caching and merging with #Fallback languages, but the details are irrelevant in normal use.
Old local translation system
With MediaWiki 1.3.0, a new system was set up for localising MediaWiki.
Instead of editing the language file and asking developers to apply the change, users could edit the interface strings directly from their wikis.
This was the system in use as of August 2005.
People could find the message they wanted to translate in Special:AllMessages and then edit the relevant string in the MediaWiki: namespace.
Once edited, these changes were live.
There was no more need to request an update and wait for developers to check and update the file.
The system is great for Wikipedia projects; however, a side effect was that the MediaWiki language files shipped with the software were no longer quite up-to-date, and it was harder for developers to keep the files on Meta in sync with the real language files.
As the default language files did not provide enough translated material, two problems arose:
- New Wikimedia projects created in a language which had not been updated for a long time needed a total re-translation of the interface.
- Other users of MediaWiki (including Wikimedia projects in the same language) were left with untranslated interfaces. This was especially unfortunate for the smaller languages which don't have many translators.
This is not such a big issue anymore, because translatewiki.net is advertised prominently and used for almost all translations.
Local translations still happen sometimes, but they are strongly discouraged.
Local messages mostly have to be deleted, moving the relevant translations to translatewiki.net and leaving on the wiki only the site-specific customisation; there is a huge backlog, especially in older projects, and this tool helps with cleanup.
Keeping messages centralised and in sync
English messages are very rarely out of sync with the code.
Experience has shown that it's convenient to have all the English messages in the same place.
Revising the English text can be done without reference to the code, just like translation can.
Programmers sometimes make very poor choices for the default text.
What can be localised
So many things are localisable on MediaWiki that not all of them are directly available on translatewiki.net: see translatewiki:Translating:MediaWiki.
If something requires a developer's intervention in the code, you can request it on Phabricator, or ask at translatewiki:Support if you don't know exactly what to do.
- Fallback languages (that is, other more closely related language(s) to use when a translation is not available, instead of the default fallback, which is English)
- Directionality (left to right or right to left, RTL)
- Direction mark character depending on RTL
- Arrow depending on RTL
- Languages where italics cannot be used
- Number formatting (comma-ify, i.e. adding or not adding digit separators; transform digits; transform separators)[3]
- Truncate (multibyte)
- Grammar conversions for inflected languages
- Plural transformations
- Formatting expiry times [clarification needed]
- Segmenting for diffs (Chinese)
- Convert to variants of language (between different orthographies, or scripts)
- Language-specific user preference options
- Link trails and link prefix - $linkTrail. These are letters that can be glued after/before the closing/opening brackets of a wiki link, but appear rendered on the screen as if part of the link (that is, clickable and in the same colour), e.g.: [[foo]]bar. By default the link trail is "a-z"; you may want to add the accented or non-Latin letters used by your language to the list.
- Language code (preferably used according to the latest RFC in standard BCP 47, currently RFC 5646, with its associated IANA database. Avoid deprecated, grandfathered and private-use codes: look at what they mean in standard ISO 639, and avoid codes assigned to collections/families of languages in ISO 639-5, and ISO 639 codes which were not imported in the IANA database for BCP 47)
- Type of emphasising
- The Cite extension has a special page file per language, cite_text-zyx for language code zyx.
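The link trail behaviour mentioned in the list above ([[foo]]bar rendered as a single link) can be sketched as follows. This is an illustrative JavaScript model, not the parser's code: the function names, the shape of the pattern and the simplified HTML output (no escaping, no title normalisation) are assumptions. Only the default "a-z" trail comes from this page.

```javascript
// Default trail per the text above: lowercase a-z directly after the link.
const DEFAULT_LINK_TRAIL = /^([a-z]+)(.*)$/s;

// Split the text that follows "]]" into the glued trail and the remainder.
function splitLinkTrail(textAfterLink, trailPattern = DEFAULT_LINK_TRAIL) {
  const match = trailPattern.exec(textAfterLink);
  return match
    ? { trail: match[1], rest: match[2] }
    : { trail: '', rest: textAfterLink };
}

// Render a simplified link whose label includes the trail.
function renderLink(target, textAfterLink, trailPattern) {
  const { trail, rest } = splitLinkTrail(textAfterLink, trailPattern);
  return `<a href="/wiki/${target}">${target}${trail}</a>${rest}`;
}
```

A language that needs accented letters in the trail would pass its own pattern, for example /^([a-zé]+)(.*)$/s, as the trailPattern argument. With the default pattern, a trail starting with such a letter is not glued to the link at all.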
Neat functionality:
- I18N sprintfDate
- Roman numeral formatting
Namespaces
Translating namespace names is currently disabled on translatewiki.net, so you need to do this yourself in Gerrit, or file a Phabricator task asking for someone else to do it.[4]
To allow custom namespaces introduced by your extension to be translated, create a MyExtension.i18n.namespaces.php file that looks like this:
<?php
/**
 * Translations of the namespaces introduced by MyExtension.
 *
 * @file
 */

$namespaceNames = [];

// For wikis where the MyExtension extension is not installed.
if ( !defined( 'NS_MYEXTENSION' ) ) {
	define( 'NS_MYEXTENSION', 2510 );
}

if ( !defined( 'NS_MYEXTENSION_TALK' ) ) {
	define( 'NS_MYEXTENSION_TALK', 2511 );
}

/** English */
$namespaceNames['en'] = [
	NS_MYEXTENSION => 'MyNamespace',
	NS_MYEXTENSION_TALK => 'MyNamespace_talk',
];

/** Finnish (Suomi) */
$namespaceNames['fi'] = [
	NS_MYEXTENSION => 'Nimiavaruuteni',
	NS_MYEXTENSION_TALK => 'Keskustelu_nimiavaruudestani',
];
Then load it from the extension.json file using ExtensionMessagesFiles like this:
{
	"name": "MyExtension",
	"version": "0.0.1",
	"descriptionmsg": "myextension-desc",
	"ExtensionMessagesFiles": {
		"MyExtensionNamespaces": "MyExtension.i18n.namespaces.php"
	}
}
Now, when a user installs MyExtension on their Finnish (fi) wiki, the custom namespace will be translated into Finnish automatically, without any further action from the user.
Also remember to register your extension's namespace(s) on the Extension default namespaces page.
Special page aliases
See the manual page for Special pages for up-to-date information.
The following does not appear to be valid.
Create a new file for the special page aliases in this format:
<?php
/**
 * Aliases for the MyExtension extension.
 *
 * @file
 * @ingroup Extensions
 */

$aliases = [];

/** English */
$aliases['en'] = [
	'MyExtension' => [ 'MyExtension' ]
];

/** Finnish (Suomi) */
$aliases['fi'] = [
	'MyExtension' => [ 'Lisaosani' ]
];
Then load it from the extension.json file using ExtensionMessagesFiles like this:
{
	"name": "MyExtension",
	"version": "0.0.1",
	"descriptionmsg": "myextension-desc",
	"ExtensionMessagesFiles": {
		"MyExtensionAlias": "MyExtension.i18n.alias.php"
	}
}
When your special page code uses either SpecialPage::getTitleFor( 'MyExtension' ) or $this->getTitle() (in the class that provides Special:MyExtension), the localised alias will be used, if it's available.
Namespace name aliases
Namespace name aliases are additional names which can be used to address existing namespaces.
They are rarely needed, but when they are needed and missing, this usually creates havoc in existing wikis.
You need namespace name aliases:
- When a language has variants, and these variants spell some namespaces differently, and you want editors to be able to use the variant spellings. Variants are selectable in the user preferences. Users always see their selected variant, except in wikitext, but when editing or searching, an arbitrary variant can be used.
- When an existing wiki's language, fallback language(s), or localisation is changed, and some namespace names change with it. So as not to break links already present in the wiki that use the old namespace names, you need to add each of the previous namespace names to the aliases of its namespace when, or before, the change is made.
The generic English namespace names are always present as namespace name aliases in all localisations, so you need not, and should not, add those.
Aliases can't be translated on translatewiki.net, but can be requested there or on Phabricator: see translatewiki:Translating:MediaWiki#Namespace name aliases.
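How aliases keep old links working can be sketched as a single name-to-ID lookup. The JavaScript below is an illustration only and not MediaWiki's namespace handling. The namespace IDs and the Finnish and English names are reused from the MyExtension example on this page, while the alias Vanha_nimiavaruus, the function names, and the case-insensitive normalisation with underscores for spaces are assumptions.

```javascript
// Localised names and generic English names, reused from the MyExtension example.
const localNames = new Map([[2510, 'Nimiavaruuteni'], [2511, 'Keskustelu_nimiavaruudestani']]);
const englishNames = new Map([[2510, 'MyNamespace'], [2511, 'MyNamespace_talk']]);

// Hypothetical alias: a name the namespace had before a localisation change.
const aliases = new Map([['Vanha_nimiavaruus', 2510]]);

// Assumption for this sketch: names match case-insensitively, spaces equal underscores.
function normalize(name) {
  return name.trim().replace(/ /g, '_').toLowerCase();
}

// Merge English names, aliases and localised names into one lookup table.
function buildNamespaceIndex() {
  const index = new Map();
  for (const [id, name] of englishNames) index.set(normalize(name), id);
  for (const [alias, id] of aliases) index.set(normalize(alias), id);
  for (const [id, name] of localNames) index.set(normalize(name), id);
  return index;
}

// Resolve any accepted spelling to a namespace ID, or null if unknown.
function resolveNamespace(name, index = buildNamespaceIndex()) {
  const id = index.get(normalize(name));
  return id === undefined ? null : id;
}
```

Here the current Finnish name, the generic English name and the old alias all resolve to namespace 2510, which is why links written before a rename keep pointing at the right namespace.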
Regional settings
Some linguistic settings vary across geographies, but MediaWiki doesn't have a concept of region: it only has languages and language variants.
These settings need to be set once as a language's default; individual wikis can then change them as they wish in their configuration.
Time and date formats
Times and dates are shown on special pages and the like.
The default time and date format is used for signatures, so it should be the most used and most widely understood format for users of that language.
Anonymous users also see the default format.
Registered users can choose other formats in their preferences.
If you are familiar with the format strings of PHP's date() function, you can try to construct formats yourself.
MediaWiki uses a similar format string, with some extra features.
If you are not familiar with such format strings, that's OK: you can provide a list of examples for developers instead.
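To give a feel for what such a format string looks like, here is a minimal JavaScript sketch that interprets a small subset of PHP date() letters (Y, m, n, d, j, H, i) in UTC. It does not implement MediaWiki's own format strings or any of their extra features; the function and table names are hypothetical, and every other character is copied through unchanged.

```javascript
// Zero-pad a number to two digits.
const pad2 = (n) => String(n).padStart(2, '0');

// Subset of PHP date() format letters, evaluated in UTC.
const FORMATTERS = {
  Y: (d) => String(d.getUTCFullYear()),  // four-digit year
  m: (d) => pad2(d.getUTCMonth() + 1),   // month, with leading zero
  n: (d) => String(d.getUTCMonth() + 1), // month, without leading zero
  d: (d) => pad2(d.getUTCDate()),        // day, with leading zero
  j: (d) => String(d.getUTCDate()),      // day, without leading zero
  H: (d) => pad2(d.getUTCHours()),       // 24-hour clock, with leading zero
  i: (d) => pad2(d.getUTCMinutes()),     // minutes, with leading zero
};

// Replace each known letter with its value; copy everything else literally.
function formatDate(format, date) {
  let out = '';
  for (const ch of format) {
    out += FORMATTERS[ch] ? FORMATTERS[ch](date) : ch;
  }
  return out;
}
```

For 9 August 2005, 14:05 UTC, the format 'H:i, j.n.Y' gives "14:05, 9.8.2005" and 'Y-m-d' gives "2005-08-09". A list of example renderings like these is also a convenient way to describe a desired format to developers.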
See Help:System message#Message sources.
Notes
See also