One of Mike Lynch’s bot creations is the Glossatory, which generates definitions of words using a recurrent neural network. Mike illustrates some of the more outlandish ones at @email@example.com
The Glossatory is based on a recurrent neural network, which is the type of artificial neural network that’s typically used for machine translation. Mike used a training set of 82,115 definitions that were taken from WordNet. The results are often amusing, frequently incomprehensible, but mostly they’re just odd.
I love the Glossatory, but it also highlights some meaningful issues about the use of machine translation. Recurrent neural networks have almost entirely replaced earlier machine translation approaches, such as statistical machine translation. This is because neural networks require markedly less memory and, because the neural model is trained end-to-end, the translations tend to be more accurate. Google Translate switched from statistical machine translation to neural machine translation a few years ago, as did Microsoft and Yandex.
While neural network machine translations might be more accurate than other approaches, they’re far from perfect, as the Glossatory shows. Because of this, a lot of research activity has been focused on making translations more accurate and reducing processor demands.
Some colleagues and I have been taking a different tack. We’ve been looking at how translation apps and websites that are based on recurrent neural networks are already being used in health care settings. This is for a few reasons:
- We’re diverse: many users of Australian health services come from culturally and linguistically distinct communities, and may have distinct cultural and health beliefs.
- It’s not just about literacy: levels of health literacy (the ability to gain and understand health information to make decisions and follow instructions) vary markedly within the Australian community.
- There are meaningful risks: mistranslation in health care settings can have significant consequences. It’s not simply a matter of being given bad directions. People in health care settings may be expected to make decisions without understanding what they’re agreeing to, or expected to understand information that is simply incomprehensible.
For example, this is what Google Translate does when you translate information on how to prepare for a colonoscopy into Nepali:
There may be some compounding errors due to this being translated back and forth from Nepali, but it illustrates the underlying issue pretty well. Parenthetically, this translation is much better than the one that Google Translate provided nine months ago, which was “You need to drink some fluids before your colopyopause that will open your eyes and remove from stool so that we can see your face face.” Machine translations are getting better, particularly for non-Romance languages that have been markedly less accurate in the past.
In response to these concerns, my colleagues and I have undertaken research looking at the use of machine translation in health services. We presented on this at last week’s Australasian Association for Academic Primary Care conference in Adelaide.
The slides below outline the findings of the first phase of our research, based on a survey of more than 1,500 health employees. The key findings are that machine translation is already being quite widely used, and it’s often clinicians who are initiating use.
We’re currently undertaking qualitative research to further understand the dynamics of how machine translation is used. Further research is also needed on how the use of translation apps and websites is perceived by consumers, carers, and family members. We also need to examine whether the use of machine translation plays out differently in primary health care, social care, and residential aged care.
Machine translation may present significant opportunities but we should resist the urge to uncritically adopt its use in all settings and for all purposes. Tomáš Svoboda expresses this well:
“Super-linguists” will be needed to trim the linguistic engines and, eventually, override the recommendations of machines. For a long time still, complex semantics and the interpretation of text passages will remain the art of human involvement in processing the input and output. Technology is not set to become autonomous in the medium term. It can serve people very well if applied with boldness and creativity, as well as with responsibility.
In the meantime we can continue to learn from the Glossatory: