In one word, what comes to mind when you think about language and data collection?
Challenging, expensive, necessary.
These are some of the answers we heard from attendees at a roundtable discussion TWB facilitated during the 2020 GeOnG Forum.
Earlier this month, we were joined by panelists from IMPACT Initiatives, Mercy Corps, and the Internal Displacement Monitoring Centre (IDMC). We spoke about the role of language in data-driven humanitarian action and – crucially – how addressing language barriers can enable affected people to make their voices heard. The panelists shared their experiences and gave examples of why language matters at different stages of the data collection process. Here are five main conclusions from the discussion, relevant to staff of any humanitarian organization that collects data:
1. Consult data on the languages people speak in the targeted area.
In some countries, government census data is sufficient to make an informed decision about which language(s) targeted people speak and understand. However, this data isn’t always freely accessible or easily verifiable. TWB is working with IMPACT Initiatives and other partners to make language data readily available to organizations that listen to and communicate with crisis-affected people. You can also collect this data during your survey to help fill the data gap.
2. Address language bias throughout the data collection process.
Language is usually only taken into account in the preparation phase, when survey tools are translated into local language(s). This is often done hastily, without checking the translation quality. Mercy Corps highlighted the need to think carefully about language at each stage, from planning to data analysis and dissemination. As a preparedness measure, this includes translating common questions and answers into as many languages as possible, with appropriate quality assurance procedures.
3. Support enumerators as needed and don’t make assumptions about their language skills.
Enumerators often take on many roles: they administer a survey, but also act as interpreters, cultural mediators, program specialists and organizational representatives. Language support can take some of the burden off enumerators. Testing their literacy levels and comprehension of key terms can help screen enumerators and identify those who need additional training. Tools like glossaries can help them provide consistent and accurate translations of key terms in local languages and be confident that the person they are interviewing understands them.
4. Identify ways to deal with unstructured data.
Asking open-ended questions or including “other” as an answer option can allow us to understand a situation in the words of affected people themselves. But this data can be particularly difficult to translate and understand. Regular debriefs with enumerators during data collection can help check the quality of any free text data. Another lesson highlighted during the session was to translate open-format answers into a language the data analysis team understands as soon as possible after the data is collected.
5. Use technology solutions appropriate to the context.
This could involve using a simple voice recorder as a quality assurance mechanism for multilingual surveys, as IDMC has piloted in northeast Nigeria. In other contexts, this might mean using Google Translate or other machine translation engines to translate information at speed. But this technology works best for major languages, and machine translation needs to be approached with caution when it comes to anonymity and privacy. TWB and IMPACT Initiatives are developing machine translation and speech recognition tools adapted to humanitarian contexts and marginalized languages. Watch this space!
Interested in finding out more? Check out this infographic with more than 20 language tips for effective humanitarian data collection. Watch the video recording of the session here. And find information about the other sessions of the GeOnG Forum here.
Written by Mia Marzotto, Senior Communication Officer for TWB