Conversations with chatbots: helping people in the DRC access multilingual COVID-19 information

“How is coronavirus different from Ebola?”

“What are the symptoms of Corona?”

“How many times a day should I wash my hands?”

“How else can I protect myself from Corona?”

These are questions that people are asking in the Democratic Republic of Congo in Lingala, French, and Congolese Swahili. And their questions are being answered by a bot, in their own language.

The bot’s name is “Uji,” which is short for ukingo and jibu, which mean “prevention” and “response” respectively. Uji is TWB’s first multilingual chatbot and a key part of making sure people have the health information they want, in their own language.

Uji supports collaborative and two-way communication

Everyone has the right to access the information they need and want, when they want it, and in a language they understand. Yet frequently information is only available in global commercially-viable languages, or in the national languages of a country. Furthermore, this information is often only available in a top-down manner, with humanitarians and health agencies deciding what information people can and should receive.

TWB has long advocated for humanitarians and development professionals to integrate multilingual technology in their programs. This allows people living through crises to proactively and independently get answers to their questions. And with COVID-19 restrictions cutting crisis-affected people off from humanitarians, new communication tools are needed.

Uji unites language and technology to bring us closer to this vision of truly equitable information access.

The development of Uji

Access to credible, multilingual COVID-19 information is a challenge in the DRC. “Many Lingala and Congolese Swahili speakers in the DRC are accessing COVID-19 information from different radio shows, websites, and posters,” explains Rodrigue Bashizi, TWB’s DRC Community Engagement Officer. “But the main challenge for accessing COVID-19 information is the cost of internet bundles in the country. Sometimes people receive videos talking about COVID-19, but they can’t open them due to a lack of good internet and the cost of bundles.”

People needed a better solution for their COVID-19 questions. Enter Uji. Rodrigue says, “Uji is a very important tool for people in DRC because they lack trusted information. Since Uji is on Telegram and WhatsApp, it will not consume a lot of internet bundles. It is easy to use. Once it is on SMS it will even be available for people in remote areas with no internet access.”

Rodrigue is from Bukavu in the DRC and speaks Swahili, French, English, Lingala, Kinyarwanda and Luganda. Before joining TWB, he worked as a trainer with refugees in Uganda. At TWB, he is a core member of the team developing our multilingual chatbots for two-way communications. Rodrigue is passionate about technology and says he loves working on chatbots, as he is learning something new every day.

Rodrigue and other TWB team members developed the tool in partnership with Kinshasa Digital, a DRC communication agency that was already working with the DRC Ministry of Health to develop a COVID-19 chatbot. By collaborating with Kinshasa Digital and bringing multilingual technology to the existing bot, we will be able to reach more people, in more languages.

TWB developed Uji in French, Congolese Swahili, and Lingala. The bot responds to a wide range of questions about COVID-19, from debunking popular rumors, to tips on how to help children cope with stress due to COVID-19. We are working on expanding its scope to also respond to questions about Ebola. The chatbot is available on WhatsApp and Telegram. By using existing messaging platforms people can access COVID-19 information wherever they are, whenever they want. Whether they are at home, on the bus, or at work, they can find the information they need, right from their phone.

To engage with Uji, users message their COVID-19 questions to the chatbot on WhatsApp or Telegram. They can ask their questions in French, Congolese Swahili, or Lingala. The bot automatically responds in the language in which the question was asked.

The questions were ready and the bot was developed. But before launching the bot fully across these platforms, we needed to test and perfect it.

Linguist-tested and approved

Uji is a work in progress, and it requires human testing in multiple languages to make sure it’s effective and useful. Rodrigue led the testing efforts with volunteers from TWB’s community of translators, IFRC, and other partners. At the beginning of the process, Uji had to learn to understand questions and match responses accurately. But with time and testing, Uji has improved dramatically. And feedback from our community of testers is positive:

“The bot is making great progress in Swahili.”

“It’s getting harder to get an answer that doesn’t match the question. Seems the bot is improving continuously.”

Beyond this individual feedback, nearly 70% of users who participated in our satisfaction survey about the bot report that they find the information useful. The chatbot also allows TWB to gather insights about what questions are asked most frequently and what languages are used most often. Humanitarian and health organizations can use this data to tailor their communication strategies, to better provide the information that people want.

We will continue to improve Uji in the coming weeks and months, and welcome additional feedback from users.

The future of TWB chatbots

We hope that Uji is the start of a global restructuring of how multilingual conversations happen. Our aim is to demonstrate Uji’s value as a successful multilingual two-way communication channel in the DRC, and then expand the model into additional countries and for additional uses.

We encourage humanitarian and development professionals to consider incorporating chatbots and other language technology into their programming.

To learn more about incorporating chatbot and language technology into your programming, email corona@translatorswithoutborders.org.

Written by Krissy Welle, TWB’s Senior Communications Officer

The latest from TWB’s language technology initiative

Leaping over the language barrier with machine translation in Levantine Arabic

When a language you don’t understand appears in your Facebook news feed, you can click a button and translate it. This kind of language technology offers a way of communicating not just with the millions of people who speak your language, but with millions of others who speak something else.

Or at least it almost does.

Like so many other online machine translation systems, it comes with a caveat: it is only available in major languages.

TWB is working to eliminate that rather significant caveat through our language technology initiative, Gamayun. We named it after a mythical birdwoman figure in Slavic folklore, a magical creature who imparts words of wisdom to the few who can understand her. We think she’s a perfect advocate for language technology to increase digital equality and improve two-way communication in marginalized languages.

We have reached an important Gamayun milestone by leaping over the language barrier with a machine translation engine in Levantine Arabic. Here is how we got here, what we learned, and what is next.

What is behind developing a machine translation engine in Levantine Arabic?

In November 2019, we joined forces with a group of innovators and language engineers from PNGK and Prompsit to address WFP’s Humanitarian Action Challenge. Our goal was to use machine translation to enhance the way aid organizations understand the needs and concerns of Syrian refugees, to improve food security programming.

So we developed a text-to-text machine translation (MT) engine for Levantine Arabic tailored to the specifics of refugees’ experiences. To achieve this, we collaborated with Mercy Corps’ Khabrona.Info team. The team runs a Facebook page for Syrian Arabic refugees to provide them with reliable information and answers, such as about accessing food and other support. We took content shared on the Khabrona.Info Facebook page and manually translated it into English to adapt the engine. The training data and a demonstration version of our MT are available on our Gamayun portal.

How well does this machine translation engine perform?

To answer this question, we conducted an evaluation based on tests widely used by MT researchers. We found that our MT engine produced better translations for Levantine Arabic than one of the most used online machine translation systems.

We first asked experienced translators to rate the translations for both accuracy and fluency. We provided them with ten randomly selected source texts and translations generated by humans, Google’s MT, and our MT. Translators scored each translation on a scale from zero (no errors) to three (critical errors), and all translations were rated fairly good. Our MT engine performed slightly better than Google’s MT because it was adapted to the specifics of Levantine Arabic and its online colloquialisms about food security and other topics relevant to refugees’ experiences. The human translations performed slightly better than our MT, but were not perfect.

We also asked the experienced translators to rank the best, second best, and worst translations based on each source text. While the human translations were consistently ranked higher than both machine translation engines, our MT was preferred 70% of the time over Google’s MT.

We then used the standard metric for automated MT quality testing, called BLEU (bilingual evaluation understudy). BLEU scores a machine translation according to how well it matches a reference human translation. Scores range from zero for no match to 1.0 for a perfect match, but few translations score 1.0 because all translators will produce slightly different texts. Our generic MT engine, trained on publicly available parallel English-Arabic text, obtained a score of 0.195 on a testing set of 200 social media posts. After further training on a small but specific dataset of Levantine Arabic and its online colloquialisms, it reached 0.248. By comparison, the Google MT translations scored 0.212 on the same testing set.
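
For readers who want to see how a BLEU-style score is put together, here is a minimal sketch in Python. It is not the evaluation code behind the scores above: the function names (ngrams, bleu), the whitespace tokenization, the single reference, and the absence of smoothing are all simplifying assumptions. Published BLEU scores are typically computed over a whole test set rather than one sentence at a time.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, between 0.0 and 1.0.

    Assumptions of this sketch: whitespace tokenization, a single
    reference, uniform n-gram weights, and no smoothing.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            # Without smoothing, one empty or zero precision gives BLEU = 0.
            return 0.0
        log_precision_sum += math.log(matched / total)

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity * math.exp(log_precision_sum / max_n)


# Very short texts only support low n-gram orders, so max_n=1 is used here.
reference = "food is expensive"
print(round(bleu("food prices are high", reference, max_n=1), 3))
print(bleu("the prices of the materials are high", reference, max_n=1))
```

Because the score depends on word overlap with one reference, a good translation that uses different words can still score low. That is one reason we paired BLEU with ratings and rankings from human translators.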

Take the short sentence أسعار المواد الغائية مرتفعة as an example: humans translated it as “food is expensive” and our MT returned “food prices are high,” while Google’s MT translated it as “the prices of the materials are high.” All are grammatically correct results, but our MT tended to pick up the nuances of informal speech better than Google’s MT. This may seem trivial, but it is critical if MT is used to quickly understand requests for help as they come up, or to keep an eye on people’s concerns and complaints in order to adjust programming.

What makes these results possible?

We specifically designed our MT engine to provide reliable and accurate translations of unstructured data, such as the language used in social media posts. We involved linguists and domain experts in collating and editing the dataset to train the engine. This ensured a focus on both humanitarian domain language and colloquialisms in Levantine Arabic.

The agility of this approach means the engine can be used for various purposes, from conducting needs assessments to analyzing feedback information. The approach also meets the responsible data management requirements of the humanitarian sector.

What have we learned?

We have demonstrated that it is possible to build a translation engine of reasonable quality for a marginalized language like Levantine Arabic and to do so with a relatively small dataset. Our approach entailed engaging with the native language community and focusing on text scraped from social media. This holds great potential for building language technology tools that can spring into action in times of crisis and be adapted to any particular domain.

We also learned that even human translations for Levantine Arabic are not perfect. This shows the importance of building networks of translators for marginalized languages who can help build up and maintain language technology. Where there are not enough—if any—professional translators, a key first step is training bilingual people with the right skills and providing them with guidance on humanitarian response terminology. This type of capacity building can not only make technology work for marginalized language speakers in the longer term, but also ensure they have access to critical information in their languages in the shorter term.

What’s next?

We are refining our approach, augmented by external support, to achieve the full potential of language technology. We are currently working with the Harvard Humanitarian Initiative and IMPACT Initiatives using natural language processing and machine learning to transcribe, translate, and analyze large sets of qualitative responses in multilingual data collection efforts to inform humanitarian decision making. We have also joined the Translation Initiative for COVID-19 (TICO-19), alongside researchers at Carnegie Mellon and major tech companies including Amazon, Facebook, Google, and Microsoft to develop and train state-of-the-art machine translation models in 37 different languages on COVID-19.

Stay tuned to learn how we move forward with these projects. We’ll continue to develop language technology solutions to enhance two-way communication in humanitarian crises and amplify the voices of millions of marginalized language speakers.

Written by Mia Marzotto, Senior Advocacy Officer for Translators without Borders.

Language data fills a critical gap for humanitarians

Until now, humanitarians have not had access to data about the languages people speak. But a series of open-source language datasets is about to improve how we communicate with communities in crisis. Eric DeLuca and William Low explain how a seemingly simple question drove an innovative solution.

“Do you know what languages these new migrants speak?”

Lucia, an aid worker based in Italy, put this seemingly simple question to researchers from Translators without Borders in 2017. Her organization was providing rapid assistance to migrants as they arrived at the port in Sicily. Lucia and her colleagues were struggling to provide appropriate language support. They often lacked interpreters who spoke the right languages and they asked migrants to fill out forms in languages that the migrants didn’t understand.

Unfortunately, there wasn’t a simple answer to Lucia’s question. In the six months prior to our conversation with Lucia, Italy registered migrants from 21 different countries. Even when we knew that people came from a particular region in one of these countries, there was no simple way to know what language they were likely to speak.

The problem wasn’t exclusive to the European refugee response. Translators without Borders partners with organizations around the world which struggle with a similar lack of basic language data.

Where is the data?

As we searched various linguistic and humanitarian resources, we were convinced that we were missing something. Surely there was a global language map? Or at least language data for individual countries?

The more we looked, the more we discovered how much we didn’t know. The language data that does exist is often protected by restrictive copyrights or locked behind paywalls. Languages are often visualized as discrete polygons or specific points on a map, which seems at odds with the messy spatial dynamics that we experience in the real world.

In short, language data isn’t accessible, or easily verifiable, or in a format that humanitarians can readily use.

We are releasing language datasets for nine countries

Today we launch the first openly available language datasets for humanitarian use. This includes a series of static and dynamic maps and 23 datasets covering nine countries: DRC, Guatemala, Malawi, Mozambique, Nigeria, Pakistan, Philippines, Ukraine, and Zambia.

This work is based on a partnership between TWB and University College London. The pilot project received support from Research England’s Higher Education Innovation Fund, managed by UCL Innovation & Enterprise. With support from the Centre for Translation Studies at UCL, this project was the first of its kind in the world to systematically gather and share language data for humanitarian use.

The majority of these datasets are based on existing sources — census and other government data. We curated, cleaned, and reformatted the data to be more accessible for humanitarian purposes. We are exploring ways of deriving new language data in countries without existing sources, and extracting language information from digital sources.

This project is built on four main principles:

1. Language data should be easily accessible

We started analyzing existing government data because we realized there was a lot of quality information that was simply hard to access and analyze. The language indicators from the 2010 Philippines census, for example, were spread over 87 different spreadsheets. Many census bureaus also publish in languages other than English, making it difficult for humanitarians who work primarily in English to access the data. We have gone through the process of curating, translating, and cleaning these datasets to make them more accessible.

2. Language data should work across different platforms

We believe that data interoperability is important. That is, it should be easy to share and use data across different humanitarian systems. This requires data to be formatted in a consistent way and spatial parameters to be well documented. As much as possible, we applied a consistent geographic standard to these datasets. We avoided polygons and GPS points, opting instead to use OCHA administrative units and P-codes. At times this will reduce data precision, but it should make it easier to integrate the datasets into existing humanitarian workflows.

We worked with the Centre for Humanitarian Data to develop and apply consistent standards for coding. We built an HXL hashtag scheme to help simplify integration and processing. Language standardization was one of the most difficult aspects of the project, as governments do not always refer to languages consistently. The Malawi dataset, for example, distinguishes between “Chewa” and “Nyanja,” which are two different names for the same language. In some cases, we merged duplicate language names. In others, we left the discrepancies as they exist in the original dataset and made a note in the metadata.

Even when language names are consistent, the spelling isn’t always. In the DRC dataset, “Kiswahili” is displayed with its Bantu prefix. We have opted instead to use the more common English reference of “Swahili.”

Every dataset uses ISO 639-3 language codes and provides alternative names and spellings to alleviate some of the typical frustrations associated with inconsistent language references.
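
To illustrate the kind of standardization described above, here is a minimal Python sketch that maps alternative language names to one preferred name and an ISO 639-3 code, then merges the counts for each administrative unit. It is not the code used to build the datasets. The alias table, the function names, the choice of "Nyanja" as the preferred name, and the placeholder unit codes "XX01" and "XX02" are all hypothetical. The code "swa" is the ISO 639-3 macrolanguage code for Swahili; the published datasets may use a more specific code.

```python
from collections import defaultdict

# Hypothetical alias table: name variant -> (preferred name, ISO 639-3 code).
LANGUAGE_ALIASES = {
    "chewa": ("Nyanja", "nya"),
    "nyanja": ("Nyanja", "nya"),
    "kiswahili": ("Swahili", "swa"),
    "swahili": ("Swahili", "swa"),
}


def normalize_language(raw_name):
    """Map a language name as written in the source to a standard form.

    Unknown names are kept unchanged with no ISO code, so they can be
    reviewed and noted in the metadata instead of being guessed.
    """
    cleaned = raw_name.strip()
    name, code = LANGUAGE_ALIASES.get(cleaned.lower(), (cleaned, None))
    return {"language": name, "iso639_3": code, "source_name": cleaned}


def merge_rows(rows):
    """Sum speaker counts per (unit code, language, ISO code).

    rows: iterable of (unit_code, raw_language_name, speakers).
    """
    totals = defaultdict(int)
    for unit_code, raw_name, speakers in rows:
        info = normalize_language(raw_name)
        totals[(unit_code, info["language"], info["iso639_3"])] += speakers
    return dict(totals)


# Placeholder unit codes and made-up counts, for illustration only.
rows = [
    ("XX01", "Chewa", 100),
    ("XX01", "Nyanja", 50),
    ("XX02", "Kiswahili", 30),
]
print(merge_rows(rows))
```

Keeping the original spelling in source_name means the published table can still list alternative names alongside the standardized one.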

3. Language data should be open and free to use

We have made all of these datasets available under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA 4.0). This means that you are free to use and adapt them as long as you cite the source and do not use them for commercial purposes. You can also share derivatives of the data as long as you comply with the same license when doing so.

The datasets are all available in .xlsx and .csv formats on HDX, and detailed metadata clearly states the source of each dataset along with known limitations.

Importantly, everything is free to access and use.

4. Language data should not increase people’s vulnerability

Humanitarians often cite the potential sensitivities of language as the primary reason for not sharing language data. In many cases, language can be used as a proxy indicator for ethnicity. In some, the two factors are interchangeable.

As a result, we developed a thorough risk-review process for each dataset. This identifies specific risks associated with the data, which we can then mitigate. It also helps us to understand the potential benefits. Ultimately, we have to balance the benefits and risks of sharing the data. Sharing data helps humanitarian organizations and others to develop communication strategies that address the needs of minority language speakers.

In most cases, we aggregated the data to protect individuals or vulnerable groups. For each dataset, we describe the method we used to collect and clean the data, and specify potential limitations. In a few instances, we chose to not publish datasets at all.
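
As a rough illustration of what aggregation can look like in practice, the Python sketch below counts individual records by administrative unit and language, and hides very small groups. It is one possible approach, not a description of our risk-review process. The function name, the min_count threshold of 10, and the placeholder unit code "XX01" are assumptions made for this example.

```python
from collections import Counter


def aggregate_by_unit(records, min_count=10):
    """Turn per-person records into counts per (unit code, language code).

    records: iterable of (unit_code, iso639_3) pairs, one per respondent.
    Groups smaller than min_count are reported as None, so that a
    handful of speakers in one place cannot be singled out. The
    threshold is an assumption of this sketch.
    """
    counts = Counter(records)
    return {
        key: (n if n >= min_count else None)
        for key, n in counts.items()
    }


# Made-up records with a placeholder unit code, for illustration only.
records = [("XX01", "hau")] * 12 + [("XX01", "kau")] * 3
print(aggregate_by_unit(records))
```

In this example the twelve Hausa records are reported as a count, while the three Kanuri records are suppressed.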

How can you help?

This is just the beginning of our effort to provide more accessible language data for humanitarian purposes. Our goal is to make language data openly available for every humanitarian crisis, and we can’t do it alone. We need your help to:

Integrate and share this data. We are not looking to create another data portal. Our strategy is to make these datasets as accessible and interoperable as possible using existing platforms. But we need your feedback so we can improve and expand them.
Add language-related questions into your ongoing surveys. Existing language data is often outdated and does not necessarily represent large-scale population movements. Over the past year, we have worked with partners such as IOM DTM, REACH, WFP, and UNICEF to integrate standard language questions into ongoing surveys. This is essential if we are to develop language data for the countries that don’t have regular censuses. The recent multi-sectoral needs assessment in Nigeria is a good example of how a few strategic language questions can lead to data-driven humanitarian decisions.
Use this language data to improve humanitarian communication strategies. As we develop more data, we hope to provide the tools for Lucia and other humanitarians to design more appropriate communication strategies. Decisions to hire interpreters and field workers, develop radio messaging, or create new posters and flyers should all be data-driven. That’s only possible if we know which languages people speak. An inclusive and participatory humanitarian system requires two-way communication strategies that use languages and formats that people understand.

Clearly, the answer to Lucia’s question turned out to be more complicated than any of us expected. This partnership between TWB and the Centre for Translation Studies at UCL has finally made it possible to incorporate language data into humanitarian workflows. We have established a consistent format, an HXL coding scheme, and processes for standardizing language references. But the work does not stop with these nine countries. Over the next few months we will continue to curate and share existing language datasets for new countries. In the longer term we will be working with various partners to collect and share language data where it does not currently exist. We believe in a world where knowledge knows no language barriers. Putting language on the map is the first step to achieving that.

Eric DeLuca is the Monitoring, Evaluation, and Learning Manager at Translators without Borders.

William Low is a Senior Data and GIS Researcher at University College London.

Funding for this project was provided by Research England’s Higher Education Innovation Fund, managed by UCL Innovation & Enterprise.

Language Technology Could Help 157 Million People Get Access To Information

I was exhausted. It had been a great week in Bangladesh, but the overload of language, smells, refugee camp, seeing old friends, meeting new friends, government, donors, and all the while pretending like I wasn’t jetlagged, was taking its toll. I just wanted to go to sleep.

My last meeting was in Dhaka with someone in the Prime Minister’s office. I had little hope of staying awake through the meeting.

And yet, I was captivated.

The literacy rate in Bangladesh is considered low (72.8% according to UNESCO in 2016) but is just below the global average. Literacy among women is lower (69.9%); but, in general, the majority of the people have at least basic literacy skills. There is 90 percent mobile phone penetration, and 96 percent of internet access is via mobile phones. The International Mother Language Institute, the body in Bangladesh that supports the promotion, spread, and preservation of Bangla languages, says that 41 languages are spoken in the country, only five of which have written scripts. In the humanitarian response for Rohingya refugees in Cox’s Bazar, Translators without Borders (TWB) finds the situation particularly difficult. Rohingya has no agreed written script. Very few of the refugees can read and write, and few people speak both Rohingya and another language well. Add to this mix low radio coverage: the Rohingya do not have radios, and even if they did, parts of the camps have no radio coverage. Add, too, about one million people living in poor and difficult conditions and speaking many different dialects, and you begin to understand why communicating effectively is difficult.

It’s vitally important that there is two-way communication between the people – refugees and local Bangladeshis – and the government and aid workers. Take the issue of the coming monsoon. The formal and makeshift refugee camps have sprouted up all over the Cox’s Bazar district, an area that includes a national park and lush forest. But now the trees have been torn down to make room for shelters and for firewood. This makes the soil very unstable and dangerous, with monsoon rains promising huge mud pits and the possibility of landslides. It is also a hilly area; tents are built on the sides of hills that will become slippery and unstable with heavy rains and wind. Refugees, as well as local residents, need to know where to go, what to do if there’s an emergency, how to get help for those needing medical attention, and what to do if food gets swept away.

The challenges abound. The digital world seems a world away.

And yet, enter Dr. Jami. In a buzzy, busy office with a high level of excitement and a relatively good gender balance, I was suddenly in the middle of a high tech environment. Dr. Jami launched directly into what he wanted us to know and do.

Dr. Jami runs the Access to Information (A2I, inevitably) project in the Prime Minister’s office. The aim is to help the people of Bangladesh quickly and easily get information on public services. One of A2I’s projects is the digitization of government institutions; they have developed over 1,000 key government websites. Dr. Jami is not a language guy (he’s a solutions architect), but he proceeds to tell me quickly that Bangla was only standardized in Unicode five years ago, so there is very little data available from which to build good translation engines. While there’s 90 percent mobile phone penetration, in 2018 GSMA estimated that only 28-30 percent of those were smartphones. Yet, 96 percent of internet access is via phones. Whaaa? How does that work? It’s also startling how little desktops and laptops are used to access the internet.

I asked a taxi driver, who was using a smartphone, if he used his phone for the internet. He replied, “No, but I use it for Facebook.”

There are no data charges for Facebook in Bangladesh – unless you want to see videos or pictures. Internet use is Facebook and Facebook is only text. Those who are illiterate, or only barely literate, won’t have smartphones.

To Dr. Jami, who needs more people to have smartphones to help ensure they can get access to information, the cost is not the barrier: There are very inexpensive smartphones in Bangladesh. He believes the barrier is fear of technology, which he associates with illiteracy. To reach his goal of migrating 70 percent of the current mobile phone users to smartphones, he must address fear.

Language is an issue. With a population of over 157 million people, and one of the most widely spoken languages in the world, you’d think that the language technology for Bangla would be outstanding. It’s not. That’s surprising. And without that technology, equipping 1,000 websites with dynamic information in Bangla is nearly impossible, not to mention making them interactive and/or adding audio.

The work that A2I is doing is globally relevant, of course. Other countries are already seeking A2I’s support to bring better access to information to their people. Dr. Jami mentions that they are already working in South Sudan, which has the second-lowest literacy rate in the world. Again, the language barrier is huge. And, again, there is little digital language data.

Dr. Jami has heard of TWB’s Gamayun project – can we help? Can we be a neutral broker to bring together the limited language data out there and leverage our knowledge of language and the language industry to help Bangladeshis get access to information about basic services?

Dr. Jami and the TWB team will continue this conversation – there are still many questions to be asked and answered. But I was impressed by the enthusiasm and the accomplishments of his team. And I am really excited to see where Dr. Jami and other countries take this exciting initiative.

Written by Translators without Borders' Executive Director Aimee Ansari. This article was also published on HuffPost UK.

Read a related post on The #LanguageMatters blog, ‘Language: Our Collective Blind Spot in the Participation Revolution’. In TWB’s last blog post, Executive Director Aimee Ansari explains why we need to create and disseminate a global dataset on language and communication for crisis-affected countries.

Language: Our Collective Blind Spot in the Participation Revolution

Two years ago, I embarked on an amazing journey. I started working for Translators without Borders (TWB). While being a first-time Executive Director poses challenges, immersing myself in the world of language and language technology has by far been the more interesting and perplexing challenge.

Students practising writing Rohingya Zuban (Hanifi script) in Kutupalong Refugee Camp near Cox’s Bazar, Bangladesh.

Language issues in humanitarian response seem like a “no-brainer” to me. A lot of others in the humanitarian world feel the same way – “why didn’t I think of that before” is a common refrain. Still, we sometimes struggle to convince humanitarians that if people don’t understand the message, they aren’t likely to follow it. When I worked in South Sudan for another organisation, in one village, I spoke English, one of our team interpreted to Dinka or Nuer, and then a local teacher translated to the local language (I don’t even know what it was). I asked a question about how women save money; the response had something to do with the local school not having textbooks. It was clear that there was no communication happening. At the time, I didn’t know what to do to fix it. Now I do – and it’s not difficult or particularly expensive.

That’s the interesting part. TWB works in 300 languages, most of which I’d never heard of, and this is a very small percentage of the over 1,300 languages spoken in the 15 countries currently experiencing the most severe crises. There’s also no reliable data on where exactly each language is spoken. I’ve learned so much about language technology that my dog can almost talk about the importance of maintaining translation memories and clean parallel datasets.

Communicating with conflict-affected people

The International Committee of the Red Cross and the Harvard Humanitarian Initiative have just published a report about communicating with conflict-affected people that mentions language issues and flags challenges with digital communications. (Yay!) Here are some highlights:

Language is a consistent challenge in situations of conflict or other violence, but often overlooked amid other more tangible factors.
Humanitarians need to ‘consider how to build “virtual proximity” and “digital trust” to complement their physical proximity.’
Sensitive issues relating to sexual and gender-based violence are largely “lost in translation.” At the same time, key documents on this topic are rarely translated and usually exclusively available in English.
Translation is often poor, particularly in local languages. Some technology-based solutions have been attempted, for example, to provide multilingual information support to migrants in Europe. However, there is still a striking inability to communicate directly with most people affected by crises.

TWB’s work, focusing on comprehension and technology, has found that humanitarians are simply unaware of the language issues they face.

In north-east Nigeria, TWB research at five sites last year found that 79% of people wanted to receive information in their own language; fewer than 9% of the sample were mother-tongue Hausa speakers. Only 23% were able to understand simple written messages in Hausa or Kanuri, and that fell to just 9% among less educated women who were second-language speakers of Hausa or Kanuri. Yet 94% of internally displaced persons receive information chiefly in one of these languages.

In Greece, TWB found that migrants relied on informal channels, such as smugglers, as their trusted sources of information in the absence of any other information they could understand.
TWB research in Turkey in 2017 found that organizations working with refugees often assumed they could communicate with them in Arabic. That assumption ignores the more than 300,000 people who are Kurds or from other countries.
In Cox’s Bazar, Bangladesh, aid organizations supporting the Rohingya refugees were working on the assumption that the local Chittagonian language was mutually intelligible with Rohingya, to which it is related. Refugees interviewed by TWB estimate there is a 70-80% convergence; words such as ‘safe’, ‘pregnant’ and ‘storm’ fall into the other 20-30%.

What can we do?

Humanitarian response is becoming increasingly digital. How do we build trust, even when remote from people affected by crises?

‘They only hire Iranians to speak to us. They often can’t understand what I’m saying and I don’t trust them to say what I say.’ – Dari-speaking Afghan man in Chios, Greece.

Speak to people in their language and use a format they understand: communicating digitally, or in any other way, means being even more sensitive to what makes people feel comfortable and builds trust. Communicating in the right language and format is key to that, and to encouraging participation and ensuring impact, especially if the relevant information is culturally or politically sensitive. The right language is the language spoken or understood and trusted by crisis-affected communities; the right format means information is accessible and comprehensible. Providing only written information can hamper communication and engagement efforts with all sectors of the community from the start, especially women, who are more likely to be illiterate.

Lack of data is the first problem: humanitarians do not routinely collect information about the languages people speak and understand, or whether they can read them. It is thus easy to make unsafe assumptions about how far humanitarian communication ‘with communities’ is reaching, and to imagine that national or international lingua francas are sufficient. Collecting this language data can be done safely, without harming individuals or putting the community at risk.

Budgets: Language remains below the humanitarian radar and is often absent from humanitarian budgets. Budgeting for and mobilizing trained and impartial translators, interpreters and cultural mediators helps ensure that aid providers can listen and provide information to affected people in a language they understand.

Language tools: Language information fact-sheets and multilingual glossaries can help organizations better understand key characteristics of the languages affected people speak and ensure use of the most appropriate and accurate terminology to communicate with them. TWB’s latest glossary for Nigeria provides terminology in English/Hausa/Kanuri on general protection issues and housing, land and property rights.

A global dataset on language

TWB is exploring ways of fast-tracking the development and dissemination of a global dataset on language and communication for crisis-affected countries, as a basis for planning effective communication and engagement in the early stages of a response. We plan to complement this with data mining and mapping of new humanitarian language data.

TWB has seen some organizations take this on: the World Health Organization and the International Federation of Red Cross and Red Crescent Societies have both won awards for their approaches to communicating in the right language. Oxfam and Save the Children regularly prioritize language, and the International Organization for Migration and the United Nations Office for the Coordination of Humanitarian Affairs are starting to routinely include language and translation in their programs. A few donors are beginning to champion the issue, too.

TWB has only really been able to demonstrate the possibilities for two or three years, and the idea is really taking off. It’s such a no-brainer, and so cost-effective, that it’s not surprising so many organizations are taking it on. Our next step is to ensure that language and two-way communication are routinely considered, that information is collected on the languages crisis-affected people speak, that accountability mechanisms support this, and that the overall response is accessible to those who need protection and assistance.

Written by Aimee Ansari, Executive Director, Translators without Borders.