Some words are ‘untranslatable’. Or rather, they resist translation, and certainly the kind of translation achieved through machine translation.
The notion of the untranslatable has long preoccupied linguists and translators. Anna Wierzbicka calls such words ‘key words’: culturally saturated terms that defy direct substitution.
Examples of such terms include the German ‘weltschmerz’, which first emerged from German Romanticism to describe the melancholic realisation that the world cannot satisfy the soul’s deepest longings. It is not quite depression or despair, but the sorrow of a disenchanted idealist. Google Translate offers ‘world-pain’, which doesn’t even come close to capturing its meaning.
Mono no aware (物の哀れ) describes an awareness of the transience of things, a sort of aestheticised melancholy. Japanese classical literature is steeped in this sensibility; think, for example, of a passage describing the feeling of watching cherry blossoms fall as the seasons change. Machine translation often renders the term as ‘the pathos of things’, which isn’t the worst direct translation but is nonetheless difficult to appreciate without additional context. Google Translate gives ‘the sadness of things’.
Often translated as ‘homesickness’, cumha is the Irish word for a deep, expansive sorrow, the sort that comes from exile, lost time, or absence. It persists throughout the Irish poetic tradition, from laments for the dead to longing for a homeland that might never be seen again. Translation software often flattens it to ‘sadness’ or ‘grief’, all but erasing its cultural register.
In each case, nuance lives in context. These words carry emotion, but also a philosophy, and when translated mechanically, they shed their cultural specificity.
The issue, of course, as Mona Baker explains in In Other Words, is that some terms have no direct equivalent in the target language because the referent simply does not exist there.
In Against World Literature, Emily Apter argues that the very friction of the untranslatable is what makes literature interesting. Apter and Barbara Cassin’s Dictionary of Untranslatables, meanwhile, assembles philosophical concepts that remain radically singular in their original languages: markers of cultural thought-worlds that don’t reduce neatly into one another.
Machines try (and fail) to translate nuance
Machine translation systems have grown dramatically more sophisticated in the past decade. Statistical models gave way to neural machine translation, and more recently, to transformer-based large language models. But despite these gains, cultural nuance remains a difficult problem for machine translation.
The problem is that AI systems continue to erase culturally embedded concepts by mapping them onto more ‘universal’ but less precise counterparts. This flattening is not accidental, of course; it’s a statistical artefact of the training process. Models learn the most frequent rendering of a word in their training data, and for a rare, culture-bound term that rendering tends to be a generic near-synonym.
Researchers like Marine Carpuat are pioneering methods of ‘explicitation’, prompting systems to explain unfamiliar words rather than substitute them. Rather than translating cumha as ‘grief’, an LLM might say, ‘a deep sorrow felt when far from home, often expressed in traditional Irish song’, which is closer to how a human translator might operate.
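To make the idea concrete, here is a minimal sketch of what explicitation-style prompting might look like. It is an illustration of the general idea only, not a reconstruction of Carpuat’s method: the mini-glossary, the function name, and the prompt wording are all hypothetical, and the resulting prompt would be sent to whichever LLM you already use.

```python
# A minimal sketch of explicitation-style prompting (hypothetical illustration,
# not any researcher's actual system). The idea: instead of letting the model
# silently substitute a culture-bound word, ask it to keep the word and explain it.

# Hypothetical mini-glossary of culture-bound terms we want handled carefully.
CULTURE_BOUND_TERMS = ["cumha", "weltschmerz", "mono no aware"]

def build_explicitation_prompt(source_text: str, source_lang: str, target_lang: str) -> str:
    """Build a prompt asking the model to explain, not silently replace, culture-bound terms."""
    flagged = [t for t in CULTURE_BOUND_TERMS if t in source_text.lower()]
    prompt = (
        f"Translate the following {source_lang} text into {target_lang}.\n"
        "If a word has no direct equivalent in the target language, keep the original "
        "word and add a brief explanation in parentheses, rather than substituting a "
        "looser synonym.\n"
    )
    if flagged:
        prompt += "Pay particular attention to: " + ", ".join(flagged) + ".\n"
    return prompt + "\nText: " + source_text

# Example usage: the returned string is what you would send to the model.
print(build_explicitation_prompt(
    "... a line of Irish verse containing the word cumha ...", "Irish", "English"))
```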
Other researchers, such as Diyi Yang and Binwei Yao, have introduced cultural-awareness benchmarks (e.g. CAMT) to evaluate how well machine translation systems handle culturally specific items. For what it’s worth, GPT-4 stumbles on culture-bound terms, often defaulting to incorrect paraphrasing.
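The kind of question such benchmarks formalise can be shown with a toy check. To be clear, this is not the CAMT benchmark itself; the items, keywords, and scoring rule below are invented purely for illustration: did the system keep the culture-bound term, or at least gloss it, rather than collapsing it into a generic synonym?

```python
# A toy check in the spirit of cultural-awareness benchmarks (NOT the CAMT
# benchmark itself; data and scoring here are invented for illustration).

from dataclasses import dataclass

@dataclass
class CultureBoundItem:
    term: str                  # the source-language term
    gloss_keywords: list[str]  # words we would expect in an explanatory gloss

ITEMS = [
    CultureBoundItem("cumha", ["home", "longing", "exile"]),
    CultureBoundItem("weltschmerz", ["world", "melancholy", "ideal"]),
]

def handled_with_nuance(item: CultureBoundItem, output: str) -> bool:
    """True if the output keeps the term or explains it, rather than flattening it."""
    text = output.lower()
    if item.term in text:
        return True  # term preserved verbatim
    return any(k in text for k in item.gloss_keywords)  # an explanatory gloss is present

# Example: two hypothetical system outputs for a sentence containing 'cumha'.
print(handled_with_nuance(ITEMS[0], "She was filled with grief."))                # False
print(handled_with_nuance(ITEMS[0], "She felt cumha, a deep longing for home."))  # True
```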
The cost of clarity
There’s no doubt that machine translation has opened up the linguistic world in extraordinary ways. Vast amounts of text can now be rendered from one language into another with remarkable speed and fluency. For languages with a strong digital presence and a robust corpus of training data, machine translation offers a kind of frictionless access: it enables a monolingual reader to make sense of an unfamiliar text, even if that understanding is rough or partial.
In this sense, we gain reach, the ability to communicate across borders and language families at scale. We also gain efficiency, because machine-generated drafts can be edited by human translators rather than written from scratch. And in low-resource contexts, machine translation can be a vital stopgap, enabling documentation, messaging, even emergency communication in languages that lack robust linguistic infrastructure.
But there are losses too, some obvious, others subtle.
First and foremost, there is a loss of texture. Translation is not just about getting the gist across; it’s about preserving tone, form, affect, rhythm, and cultural embeddedness. Cumha is not just grief. Mono no aware is not just sadness. Weltschmerz is not just disillusionment. These words exist within long literary and philosophical traditions and are saturated with meaning. Machine translation systems trained on general-purpose corpora struggle to recognise, let alone replicate, this depth.
There’s also a loss of interpretive flexibility. Human translators often pause over ambiguous or loaded terms, making deliberate choices about how to convey nuance. Machines don’t pause; they calculate and move on. When nuance becomes statistically inconvenient, it is simply discarded.
More worrying still is the risk of epistemic flattening. Machine translation systems often standardise culturally specific concepts to fit dominant linguistic patterns, especially those of English. In doing so, they promote a kind of semantic monoculture. The ethical stakes here are not trivial. When a language’s unique expressions are persistently misrepresented or ignored, we edge towards epistemic injustice: the systematic undervaluing of certain ways of knowing and expressing. Minority languages are most at risk. If meitheal, the Irish word for a group of neighbours who come together to share work, becomes just ‘teamwork’, we are not simply losing vocabulary; we are losing encoded values, social arrangements, and affective postures.
This isn’t to say machine translation is a threat in itself. It’s a tool, but like any powerful tool, it needs ethical calibration; that might mean prompting systems to preserve source-language terms where appropriate, or training them on more diverse corpora, or introducing new evaluation metrics that account for nuance, not just fluency.
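As a rough sketch of one such calibration, consider a ‘keep list’ guard: a post-editing step that checks whether a protected source-language term has been flattened away and, if so, re-attaches it with a short human-approved gloss. The keep list, the glosses, and the bracketed-note format below are hypothetical, not a feature of any existing translation system.

```python
# A rough sketch of a 'keep list' guard (hypothetical, for illustration only):
# if a protected source-language term was dropped from the machine output,
# re-attach the original term with a short gloss approved by a human editor.

KEEP_LIST = {
    # term -> a short gloss a human editor has approved
    "meitheal": "a group of neighbours who come together to share work",
    "cumha": "a deep sorrow of exile or absence",
}

def guard_keep_list(source: str, translation: str) -> str:
    """Re-attach protected source-language terms that the translation has dropped."""
    notes = []
    for term, gloss in KEEP_LIST.items():
        if term in source.lower() and term not in translation.lower():
            notes.append(f"[{term}: {gloss}]")
    return translation + (" " + " ".join(notes) if notes else "")

# Example: a translation that flattened 'meitheal' to generic teamwork gets the term restored.
print(guard_keep_list(
    "... an Irish sentence that mentions the meitheal ...",
    "Everyone pitched in as a team."))
```

A guard like this does not make the translation nuanced by itself, but it makes the loss visible and reversible, which is part of what losing things ‘gracefully’ might mean in practice.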
Ultimately, the challenge isn’t to eliminate loss. Some degree of loss is inevitable in any act of machine translation. The question is whether we can lose things more gracefully, with transparency, care, and a recognition of what’s at stake.