“Initially we are using the database to create simplified English versions of Wikipedia medical pages; the simplified Wikipedia articles are then posted for the English-speaking population,” explains Val Swisher, CEO of Content Rules, TWB board member and leader of the Simple Words for Health Project. “At that point, the simplified articles are translated into many different languages and posted to the appropriate language-specific Wikipedia page.”
The project grew out of a need to simplify the English in medical content that was due to be translated. “I jumped at the opportunity to help,” said Swisher. She gathered a large community of volunteer medical editors to work on the simplified English terminology. Many of the editors had already been working on TWB’s WikiProject: Medicine, which aims to translate 100 top medical articles into 100 languages. “We had two issues,” explains Swisher. “We needed to standardize the terms we were using across all the editors in the team, and we needed medical personnel to validate the simplified English terms in order to ensure accuracy.”
The creation of the system was supported by Andrew Bredenkamp, Founder of Acrolinx and TWB board member, who donated an instance of the Acrolinx linguistic platform to the project. With the help of a volunteer from Acrolinx, the software harvested 12,000 terms from the 100 most popular Wikipedia medical articles and divided them into nine categories: Brain and Senses, Cancer, Heart Health, Immune System, Infection, Mental Health, Maternal Health, Public Health, and Trauma. These categories were assigned to a team of 31 volunteer English medical editors, who simplified all 12,000 terms in only eight weeks.
Before the terms could be released to the public, Swisher gathered a community of 15 volunteer physicians covering the nine specialties to ensure the terms' accuracy. This careful validation process took a full six months. Further validation and cleanup are in progress, such as removing duplicate terms and terms that do not need to be simplified.
“The biggest challenge was finding volunteer physicians to validate our work and then getting the validations back from the physicians,” explains Swisher. “After attempting to find physicians through a number of major medical communities in the U.S., we ended up having the most success through social media, tapping into international sources and doctors with large social media networks. I posted the need to the HIFA2015 community (an international community of medical professionals who are committed to ‘health information for all by 2015’) and received 13 responses from physicians all over the world.
“By the end of the lengthy validation process, we added two physicians from the UCSF Medical School where we have been working with medical students on the Wikipedia medical articles.”
The Simple Words for Health database is entering its post-validation phase. Once that phase is complete, the team of volunteers will work on simplifying the 100 most popular medical articles on Wikipedia with the help of the Acrolinx platform and the newly completed Simple Words database, a task for which more volunteers are needed, ideally with a background in medical editing.
The Simple Words database will also be used for additional TWB projects, and plans are underway to make it freely available to all in the near future.
“Translators in Kenya have reported that it is much easier to translate articles that we simplify first. However, that was before we created the database,” said Swisher. “I think it will be even simpler now.”