Language technology development has accelerated rapidly. This is important not only for those who make a living from translation – be it interpreters or translators – but for all businesses that have to relate to different languages. Are the Nordic countries ahead of the curve or are the IT giants like Google, Apple and Microsoft about to take control over important parts of our languages?
Being able to translate with the help of computers seemed an impossible dream for a long time. The first attempts were made as early as the 1950s with rule-based machine translation. But words are different from numbers. As the Danish Language Council put it in the introduction to its report “Danish world-class language technology”:
“A major part of our knowledge is formulated in a language. Most of what we know about Denmark, about Danish conditions and about each other is formulated in Danish. Artificial intelligence is usually based on the analysis of large amounts of data.
"You get good results when this data consists of numbers, but it is a much tougher challenge when data is made up of language in the shape of text and audio. Numbers are unambiguous and fit in with the way in which computers are organised. Language is ambiguous and far more complex since it is part of our existence and closely interwoven with the way our societies are made and the culture we grow up with.”
In the 1990s people believed it was possible to bypass the problem by going for statistical machine translation. With the enormous amounts of text available on the internet you just had to find someone who had translated something similar in the past, the argument went.
But many words have different meanings, and if you do not know anything about the context things often end up being wrong. Google Translate had a bad reputation for a long time. But the translation service has improved considerably, especially between the larger languages like English and Spanish or English and French.
“Google and other players introduced new technology a few years back in their translation programs. The machine learning which is now being applied means machines can learn to recognise patterns through examples, rather than being programmed to translate individual words,” says the Swedish government report “Understanding and being understood” (SOU 2018:83).
“We have gone from statistical translation to a model based on deep learning. The system can then understand context better, which makes translations better. A large amount of data is analysed and the computers look for patterns and learn to recognise them.
"This has made it possible to do translations that really stand out in terms of quality from the translation services that were presented around a decade ago. One crucial question is just how good the machine-based systems can become. There are researchers who claim there are nearly no limits to this.”
Language technology is about more than just translating texts. We now have a rich flora of different technologies, which in turn can be combined to work together.
These systems are also beginning to work in realtime, which means live TV news can get subtitles as they are being transmitted, or your search engine will guess what you are looking for after you have typed just a few letters.
In Denmark, one of the first success stories of language technology involved a voice recognition program used by doctors, Peter Juel Henrichsen told the Nordic language days, which this year had language technology as its theme. The doctors could read out their reports and have them written down as text by the program. This saved time and meant doctors no longer needed their secretaries.
This is why speech recognition accounts for 50% of the turnover of Danish language businesses.
“Later, the same program was tried by Danish municipalities, but it did not work as well for them. It needed to work for many different types of municipal workers, and there is a lot of difference between what a solicitor and a social worker do,” says Peter Juel Henrichsen.
Of the 60 Danish municipalities that used the speech recognition program, none had a positive business case.
A new, large customer group is streaming providers like Netflix, HBO and Disney. They offer thousands of programmes and need to dub these or give them subtitles in hundreds of different languages. And this is not just about translating English programmes into other languages. We can for instance watch Korean or Spanish TV series with subtitles in our own language.
The Korean series Squid Game took only nine days to be the biggest success on Netflix in a language other than English. Photo: Youngkyu Park/Netflix
Using speech recognition programs for subtitles or translation of foreign films is not good enough, says Michel Stormbom at Finnish Lingsoft.
“Creating subtitles for a film is also about putting the text in at the right time and using time codes to indicate how long it should remain in vision. Because it is quicker to listen than to read, the subtitles must also be shortened and checked by humans.”
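What Stormbom describes can be sketched in a few lines of Python. This is only an illustration, not how Lingsoft or any other company actually works: the function names are made up for the example, and the reading speed of 15 characters per second and the one-to-seven-second limits are assumptions. The time code layout (hours, minutes, seconds and milliseconds) follows the widely used SRT subtitle format.

```python
def format_timecode(ms: int) -> str:
    """Format milliseconds as an SRT-style time code: HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def display_duration_ms(text: str, chars_per_second: float = 15.0,
                        min_ms: int = 1_000, max_ms: int = 7_000) -> int:
    """Estimate how long a subtitle should stay in vision.

    The reading speed and the min/max limits are assumed values,
    chosen only for illustration.
    """
    needed = round(len(text) / chars_per_second * 1000)
    return max(min_ms, min(max_ms, needed))


def subtitle_cue(index: int, start_ms: int, text: str) -> str:
    """Build one subtitle cue: number, start and end time codes, then the text."""
    end_ms = start_ms + display_duration_ms(text)
    return (f"{index}\n"
            f"{format_timecode(start_ms)} --> {format_timecode(end_ms)}\n"
            f"{text}\n")


if __name__ == "__main__":
    print(subtitle_cue(1, 83_500, "Would you like to play a game with me?"))
```

The longer the line, the longer it has to stay on screen. That is why a human editor often shortens the text rather than letting it run on past the next line of dialogue.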
The Swedish company Plint, founded in 2002 to specialise in subtitles for company videos and the Swedish film industry, experienced a huge increase in jobs when Netflix started streaming films. The company’s turnover went from 11 million kronor (€1.1m) in 2015 to 241 million kronor (€23.8m) in 2019. As soon as next year it could reach 500 million kronor (€49.4m), CEO Örjan Serner told breakit.se.
Seven of the world’s 100 largest language technology companies are in the Nordics. Source: Nimdzi
The number of employees does not give a fair impression of how many people the companies engage, since so much of the work is done by freelancers. Both Swedish Semantix and Danish LanguageWire claim to have a network of 7,000 language specialists who translate between nearly 250 languages, while Plint has a network of 1,000 translators.
The amount of translation being done has already gone beyond what human beings can manage on their own. But they are still needed to check and correct the translations that have been done. There will always be a need for translators of fiction who have knowledge of the spoken language, which develops faster than dictionaries.
So far we do not know very much about how conditions for interpreters and translators have changed as a result of technology, and how these platform-based jobs are being organised. We also do not know what impact this has on the languages. Will the technological development give smaller languages the opportunity to blossom or will English become ever more dominant?
The EU is one of the largest purchasers of translation services, with two million pages translated every year with the help of 2,000 in-house translators and supporting staff – in addition to thousands of freelancers.
Legal work makes up nearly half of all translations made in the EU.
When what would later become the EU was founded in 1958, there were four official languages: French, German, Dutch and Italian. Each new member state has had its language recognised as an official language, which means today everything is being translated into 24 languages.
Before Brexit, 13% of EU citizens spoke English as their first language. Today less than 1% do – Ireland and Malta are the only countries that have English as one of their main official languages. 38% of EU citizens have English as a second language, yet only one in five consider their English skills to be “very good”. No more than a quarter of EU citizens say they can understand what is being said in English in a radio programme or on the TV news.
Despite this lack of knowledge, nothing points to Brexit being followed by a weakening of the English language’s position in the EU’s institutions. On the contrary, believes Alice Leal, who herself has worked as an interpreter in the EU and who this year published a book called “English and translation in the European Union” (Routledge).
She points out that preparatory work for new legislation is now nearly exclusively carried out using English. In 1997, 45% of the drafts for legislation and regulations were done in English. Ten years later this had risen to 62% before reaching 85% in 2020.
If the working language is English, why then spend 350 million euro on translation into the other languages?
The answer is that there is no main language to write EU legislation in. All the languages enjoy equal status and no language version is superior to another. The EU Court of Justice, like all other EU courts, must consider all language versions to be equally correct.
“When all the language versions are original, the borders for what is original and what is a translation of the original are erased and the linguistic hierarchy is hidden,” writes Alice Leal.
Maltese – an Arabic language with a Roman alphabet – has seen a massive lift as a result of the country’s EU membership. A unique Brussels-Maltese has emerged, containing words that are not used in the everyday language. Other, far bigger languages like Catalan, Basque and Romany have not seen the same translation support as Maltese, since they are not official EU languages.
A lot of the current discussion among Nordic linguists is about how important it is for countries themselves to keep control over the development and support of national terminology databases and how to safeguard confidentiality and integrity when using new language technology.
It may not be so smart to translate secret documents or private letters with account information or other sensitive information with an online translation service.
You only need to spend half an hour walking around the centre of Swedish Uppsala to see how many languages can be found on signs – often a mix of several languages like in "My Gyros - svensk och grekisk fastfood", or in the play on words Su shi fu - The amasian catch. Plays on words are some of the most difficult things to translate.