5 things to take into consideration when going global

How to speak your users' language

Published in Food for thought, Front-end Development on Oct 20, 2023

So you have an application or website, and have decided to support new markets, or even go fully global. Exciting!

In this article, I will go into 5 things that I have seen developers (and others in tech!) do wrong. If you are one of them: don’t worry. Even the biggest companies and organisations do this wrong all the time!

Because this isn’t necessarily an article for developers, I am not relying on code examples. The brief pieces of code you come across might not work at all. They are only meant as illustrations, and I don’t hesitate to make up a language if that helps non-coders understand the example.

Without further ado, let’s dive into them.

1: Different word orders

When a website has static text with dynamic variables, we tend to use structures like this:

"Welcome to " . $city_name . ". Glad to have you here!";

This works perfectly fine if your site or application only supports one language, which is the case for most sites. When you start supporting different languages, the first thing you might do is wrap all of your text strings in translation functions:

__("Welcome to ") . $city_name . __(". Glad to have you here!");

There are two major problems with this approach:

1.1: Your strings get hard to translate

Now, you would have these two strings to translate:

// String 1:
"Welcome to ";

// String 2:
". Glad to have you here!";

Notice the stray whitespace and punctuation a translator has to deal with: the first string ends with a space and the second starts with a full stop. You also miss important context when translating, because you only get half of the text.

1.2: The word order might change in different languages

This is the actual problem: not all languages use the same word order. For example, these two sentences, one in English and one in Hindi, mean the same thing. Note how the variable is in a different place in the text:

  1. Welcome to Rotterdam!
  2. Rotterdam mein aapka swagat hai!

There’s no way to take this into consideration when developing with the technique above.

Luckily, there is a very easy way to solve both issues: don’t separate the variables from your literal text, but include them in the translation string as placeholders. Most programming languages have a way to replace placeholders with values. In PHP, this would look something like this:

printf(__("Welcome to %s!"), $city_name);

Now, a translator can translate the entire sentence with its full context. With their Hindi translation in place, the call effectively becomes:

printf(__("%s mein aapka swagat hai!"), $city_name);
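The same idea can be sketched in JavaScript. The `translations` table, the `translate` function and the `{city}` placeholder syntax below are made up for this illustration; a real project would use whatever its translation library offers instead.

```javascript
// Hypothetical translation table: each entry is a full sentence with a named placeholder.
const translations = {
  en: { welcome: "Welcome to {city}!" },
  hi: { welcome: "{city} mein aapka swagat hai!" },
};

// Look up the sentence for a locale and fill in its placeholders.
// Unknown placeholders are left untouched so mistakes stay visible.
function translate(locale, key, values) {
  const template = translations[locale][key];
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    name in values ? String(values[name]) : match
  );
}

console.log(translate("en", "welcome", { city: "Rotterdam" }));
// Welcome to Rotterdam!
console.log(translate("hi", "welcome", { city: "Rotterdam" }));
// Rotterdam mein aapka swagat hai!
```

Because the placeholder is part of the sentence, each translator decides where the city name goes, and the calling code stays the same for every language.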

2: Grammatical numbers

If you don’t know what grammatical number entails, that’s not weird at all. It describes the feature of a language where a word changes form based on how many things it refers to. In English, there are only two forms:

  1. Singular: 1 sentence
  2. Plural: 2 sentences

Contrary to what you might instinctively think, 0 belongs to the plural group. Hence, it is “0 sentences”, the plural form.

So in English, you could define a very easy format:

if ($count === 1) {
  return __("1 sentence");
} else {
  return sprintf(__("%d sentences"), $count);
}

What is interesting is that other languages have more, or different, forms for this concept.

Arabic, for example, has two types of plurals: paucal plurals and true plurals. The true plurals work the way you know them from English, but the paucal system might be completely new to you: nouns with the numbers 3-10 have a different word ending than true plurals. If this concept existed in English, it would look somewhat like this:

  • 1 apple
  • 2 apples
  • 3 applo
  • [..]
  • 10 applo
  • 11 apples
  • 12 apples
  • [..]

So defining the format like we did above will not work here. And it’s not just Arabic we have to take into consideration. There are a lot of languages with different plural forms. To name just a few: Irish, Scottish Gaelic, Lithuanian, Slovene, Russian, Polish, Serbo-Croatian, and many more. These are not old, obscure or unused languages. They are very much alive and you should definitely consider them and their users.

So. How do you do that?

Firstly: don’t build your own solution. There are too many plural systems out there, and you don’t want to get involved in that unless you are seriously passionate about this topic.

Instead, use existing solutions for this. Some frameworks already have something built-in, like Laravel:

// Translation string
"{0} No apples|{1} 1 apple|[2,*] :count apples";

// Usage
trans_choice('messages.apples', $count);

If your framework does not have something like this built-in, ICU MessageFormat is an easy way to get started:

{count, plural,
    =0 {No apples}
    =1 {1 apple}
    other {{count} apples}
}

This way, translators can take into consideration their own language features. An Arabic translator would translate the above string to the following:

{count, plural,
    =0 {لا توجد تفاحات}
    =1 {تفاحة واحدة}
    =2 {تفاحتان}
    few {{count} تفاحات}
    other {{count} تفاحة}
}

As you can see (even without being able to read Arabic), two forms have been added: the dual form (=2) and the paucal form (few). ICU decides for you which numbers count as few in each language.
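To see which plural category a language uses for a given number, you can ask `Intl.PluralRules`, which is built into modern browsers and Node.js. The snippet below is only a small sketch for exploring the categories; the `categories` helper is made up for this example.

```javascript
// Intl.PluralRules exposes the plural categories that each locale distinguishes.
const english = new Intl.PluralRules("en");
const arabic = new Intl.PluralRules("ar");

// Map each count to the plural category the locale uses for it.
function categories(rules, counts) {
  return counts.map((count) => `${count}: ${rules.select(count)}`);
}

const counts = [0, 1, 2, 3, 10, 11, 100];

console.log(categories(english, counts).join(", "));
// 0: other, 1: one, 2: other, 3: other, 10: other, 11: other, 100: other

console.log(categories(arabic, counts).join(", "));
// 0: zero, 1: one, 2: two, 3: few, 10: few, 11: many, 100: other
```

Note that Arabic also has a `many` category, for 11 and up. The Arabic message above does not define it, so those numbers fall back to `other`.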

3: Different word lengths

This one is extremely easy to both understand and avoid in the future.

Be mindful that the length of a text can differ enormously between languages, even for just a single word.

When you are building an interface with buttons underneath each other, it makes sense that you want the buttons to be of the same width, because that simply looks good.

But that’s no reason to fix the width so the word “Logout” fits properly in there, if there’s any chance that it will ever have to be translated to another language. Imagine you are translating the application to Portuguese (250 million speakers). Now, all of a sudden, your button will have text that no longer fits in the button.

Why? Because the Portuguese word for “Logout” is “Encerrar sessão”, which is at least twice as long. And nearly every word that you can think of will have translations in some languages that are way longer (or shorter). The three-letter word “Now” can be translated to French in a few ways, but none of them will be any shorter than “Maintenant”, which is more than 3 times as long.

4: Different emoji & symbol meanings

A helpful way to deal with the problem of different word lengths would be to use icons where space is a limiting factor. In a lot of situations, this works great.

However, you should take into consideration that the meaning of emojis and other symbols can also differ per language and culture. Take this one for example:

👍

Easy, right? That is “an indication of satisfaction or approval”. We all agree on that.

No. We don’t. In a few regions and cultures in the world, this actually means something closer to “up yours, pal”.

For this, luckily, there is an easy solution too: simply let translators translate the emojis as well. Make sure to give them context, though, so they understand what the intended purpose of the emoji is.
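As a rough sketch, an emoji can live in the same catalog as your other translatable texts, together with a note for the translator. Everything below is hypothetical: the catalog layout, the `feedback.approve` key, the `symbolFor` function and the placeholder locale `xx` are made up for this example.

```javascript
// Hypothetical message catalog: the emoji is a translatable entry with a note for translators.
const catalog = {
  "feedback.approve": {
    context: "Icon on the button that marks an answer as helpful. Must read as approval.",
    values: {
      en: "👍",
      // "xx" is a placeholder locale where translators picked a different symbol.
      xx: "✅",
    },
  },
};

// Return the symbol for a locale, falling back to a default locale when it is missing.
function symbolFor(key, locale, fallback = "en") {
  const entry = catalog[key];
  return entry.values[locale] ?? entry.values[fallback];
}

console.log(symbolFor("feedback.approve", "xx")); // ✅
console.log(symbolFor("feedback.approve", "nl")); // 👍 (falls back to "en")
```

The `context` note is what lets a translator judge whether the symbol carries the intended meaning in their culture.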

5: Sorting order differences

Lastly, there are differences in the way different languages sort the letters in their alphabet, and therefore how they would sort words or names.

In German, for example, letters with an umlaut are usually sorted together with the letter they are based on. Therefore, the ä is treated like an a. A German visitor who looks for a person named “Mähler” would scroll back when they come across “Meier”.

In Swedish, however, å, ä and ö are separate letters that are put at the end of the alphabet. The ü is not part of the Swedish alphabet and is usually sorted like a y. Compare where the two languages place these letters:

Language   Alphabet
German     aäbcdefghijklmnoöpqrstuüvwxyz
Swedish    abcdefghijklmnopqrstuvwxyzåäö

The result of this difference would be that two users would expect a list of names to be sorted differently:

German     Swedish
Mähler     Malm
Malm       Meier
Meier      Müller
Möller     Mähler
Müller     Möller

Therefore, you should take the active language into consideration before actually sorting. JavaScript has localeCompare for this. PHP has strcoll to help you.
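Here is a small JavaScript sketch of locale-aware sorting. It uses the built-in `Intl.Collator`, which offers the same kind of locale-aware comparison as localeCompare and is convenient for sorting whole lists. The `sortForLocale` helper is made up for this example, and the exact order you get depends on the language data that ships with your browser or Node.js version.

```javascript
const names = ["Meier", "Müller", "Malm", "Möller", "Mähler"];

// Return a sorted copy of the list, using the collation rules of the given locale.
function sortForLocale(list, locale) {
  const collator = new Intl.Collator(locale);
  return [...list].sort(collator.compare);
}

const german = sortForLocale(names, "de");
const swedish = sortForLocale(names, "sv");

console.log(german.join(", "));
console.log(swedish.join(", "));

// In German, "Mähler" is listed before "Meier"; in Swedish it comes after.
console.log(german.indexOf("Mähler") < german.indexOf("Meier"));   // true
console.log(swedish.indexOf("Mähler") > swedish.indexOf("Meier")); // true
```

The important part is that the locale is an input to the sort: the same list of names ends up in a different order depending on who is looking at it.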

That’s all!

Or, well, it definitely isn’t all. Languages and cultures are incredibly interesting and, more importantly, are always evolving. Things that are accurate now might change in the future.

Remember that you will never get it perfect, but you should aim to get close. And if you give translators enough tools, they will happily help you to get it right.