The Linguistics of Project Hail Mary

I love science fiction that gets the science wrong.


I mean that sincerely. I’ve reread the Dragonriders of Pern more times than is dignified, and I am wholly untroubled by telepathic dragons who shouldn’t be able to physically fly, much less survive their internal combustion engines. X-wings? Starfleet vessels? Banking and roaring through the vacuum of space? I love it. Hand me a warp drive, a psychic alien who disarms people with a neck pinch, a transporter beam, a tractor beam, even a Universal Translator, and I’ll cheerfully settle in with a cry of “There’s coffee in that nebula!” Because I recognize the contract on offer: this is a story, these are the rules, let’s go.


So when I say that Project Hail Mary made me angry (angry enough to fill my Kindle margins with notes that occasionally lapse into all-caps), I want to be precise about the kind of anger. It is not the anger of a pedant. It’s the anger of being handed a contract, taking it at face value, having the book wave it in my face… and then watching the book blatantly tear it up.


Here’s the tell. No one has ever asked me whether the linguistics in Star Wars holds up. (It does NOT. I don’t care.)[1] But several people, knowing I’m a linguist, have asked me, sincerely, expectantly, whether the handling of human-alien linguistic contact in Project Hail Mary is accurate. That question is the entire problem in miniature. The book prompted them to ask it, pretended it could answer it, and then… failed.

What’s the problem, question?

In the book, Rocky’s species (Eridians) communicates via the production of pure tones, like birds. Each “syllable” consists of one to five tones produced at once: i.e., a musical chord. Lovely! Dr Grace sets up a simple program to help translate between Eridian and English. I say “program,” but really it’s just a script that hooks up MIDI sound software (of the kind found in early electronic keyboards) to a spreadsheet that maps Eridian words to English words. Rocky chirps a word, the computer looks up the sound in the spreadsheet, and Grace can hear the word spoken by text-to-speech software. Easy, right?
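
The lookup described here is simple enough to sketch in a few lines. What follows is a minimal illustration in Python, not the program from the book: the chord spellings and vocabulary entries are invented placeholders, and `translate` is a hypothetical name. A chord is modelled as a set of pitch classes (0 to 11) sounded together.

```python
# Hypothetical sketch of a chord-to-word lookup; every entry below is invented.

# Each Eridian "syllable" is modelled as a frozenset of pitch classes (0-11)
# sounded at once; a word is a tuple of such syllables.
LEXICON = {
    (frozenset({0, 4, 7}),): "yes",
    (frozenset({0, 3, 7}),): "no",
    (frozenset({2, 5}), frozenset({9})): "question",
}

def translate(word):
    """Return the English gloss for one Eridian word, or a marker if unmapped."""
    key = tuple(frozenset(syllable) for syllable in word)
    return LEXICON.get(key, "<unknown>")

print(translate([{0, 4, 7}]))        # yes
print(translate([{7, 4, 0}]))        # yes (tone order within a chord is irrelevant)
print(translate([{2, 5}, {9}]))      # question
print(translate([{1, 2, 3, 4, 5}]))  # <unknown>
```

A table like this is deterministic: the same chords always come out as the same English word, and nothing comes out that was never entered.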

But that’s not how Rocky talks in practice. He talks like he’s actually learning English. Sometimes he uses “I” and “my”; sometimes he just says “me” or leaves out pronouns entirely. He uses English morphology inconsistently, sometimes saying “thank” instead of “thanks”, sometimes using auxiliaries and sometimes not. If he were speaking his own language, he’d be consistent. (Unless Eridian itself were inconsistent? But why would that be?)


(I’ll just note in passing that Weir lost a great opportunity when he decided that Rocky would be referred to as “he”, even though Eridians do not have biological gender. It feels like that decision was made in the 1970s. There’s no reason, today, not to use “they.”)


And Rocky sometimes says “Mmm” when he’s annoyed. Is there some Eridian word or sound that Grace has carefully mapped to the human sound “Mmm”? That seems unlikely.


As for Rocky’s most common verbal tic, the “question” he tacks on to every question — this is not crazy. After all, plenty of human languages (like Japanese) have an interrogative particle they tack on to the end of questions. What’s odd is that Rocky persists in using this even when he seems to have mastered English do-support, as in “What do you mean, question?”. English indicates a question here with a combination of a wh-word (“what”) and the auxiliary “do”. It’s hard to imagine that Eridian has a word equivalent to “do” here; English is one of the very few languages in the world that do this. Much more reasonably, Rocky would say “You mean what, question?” or “What you mean, question?”.


And then there’s this exchange:
Grace: “We need to talk about mass.”
Rocky: “Yes. Kilogram.”
Grace: “Right. How do I tell you about a kilogram?”


Rocky then suggests using a ball. Rocky provides a ball he knows the weight of, Grace weighs it and tells him how many kilograms it is, and then Rocky knows how to convert kilograms to Eridian units. Simple!

Except this exchange is impossible. Remember that when Rocky says, “Yes. Kilogram.”, he’s not saying “kilogram”, he’s using some Eridian word, which Grace’s program is converting to “kilogram”. But how could this be? Rocky doesn’t have a word for kilogram! He’s got a word that indicates some measure of mass, but it’s not “kilogram”. And he can’t be literally saying “kilogram”, either, because he can only sing pure notes, not form English words. So what’s going on?

The only reason Weir would have written this dialogue is if he had it in his head that Rocky was learning English. Rocky, perhaps, had heard the word “kilogram” and knew it referred to mass, but didn’t know how much. Then he might reasonably say something like, “Yes, kilogram. But how much is a kilogram?”

But Rocky isn’t learning to speak English. He can only speak Eridian.

So, somewhere in drafting, “alien speech filtered through a translation box” quietly became “cute foreigner practicing his English,” and Weir never noticed the swap. It’s not an unreasonable mistake to make, certainly not in a first draft. I made the same mistake once or twice while writing Crown of Crows: in most of the book, the main characters are speaking Artírin, not English, but I’ve “translated” it to English throughout (as Tolkien did with Westron in The Lord of the Rings). But as I was writing along, I sometimes forgot that layer, and had characters talk about their own speech as if they were speaking English. Fortunately I caught those errors the second time through. And it was important that I did, because part of my contract with the readers of Crown of Crows was that it was a worldbuilding-lover’s book, with rich cultures, evocative names, and realistic societies and landscapes (once you accept talking animals, of course). If I’d made that implicit promise and then failed to hold up my side of the bargain, I’d have been failing my readers.

And that’s what Weir is doing here. This is supposed to be hard science fiction. That’s why we spend pages talking about centrifugal forces and neutrinos and the physics of interstellar travel. The joy of the book, the promise of the premise, is a fun adventure (what would meeting aliens actually be like?) with accurate science.


And linguistics is a science.

Darmok and Jalad at Tanagra

So Rocky is left speaking stylized non-native English, the type used in Hollywood to show that someone is endearingly trying to learn the language. The effect of this, intentional or not, is to reduce him in status relative to Grace, making him more of a friendly sidekick than an equal partner.


Weir could have created a much stranger and more beautiful language for the Eridians. After all, they have perfect memory, and can think and do calculations incredibly quickly. No one is sure what kind of effect that would have on language, but whatever the effect, the result would surely be much more complex than the painfully simple system Weir gives them.

For example: if you had infinite memory, how many words would your language have? Adults generally use about 20K words, and can recognize about 20K more. On top of that sits an open-ended pile of proper names (every person, place, brand, and character you can think of), which no one counts and which is hard to define. Eridians might be able to handle millions of words. And most of those may actually be proper names — words used for just one object, rather than a class of objects. Why say “bedside table” again and again when you can name it “Steve” and remember that name forever? Their language may be more like the language in the famous Star Trek episode “Darmok,” full of obscure near-untranslatable names and references to a huge cultural database of memorized stories.


Or think of it this way. Every language sits on a tradeoff between ease of production and ease of comprehension. Computer languages are simple and unambiguous, not because they need to be elegant, but because computers (or, at least, compilers) are not smart enough to handle ambiguity. Human languages are complex and ambiguous because we’re smart enough to deal with it, and the ambiguity reduces the effort being asked of the speaker. A species that never forgets and parses perfectly should drift toward more ambiguity and complexity. Freed from both the cost of remembering and the cost of understanding, Eridian could balloon in vocabulary and slacken into ambiguity at the same time.


And yet, in Weir’s telling, it’s humans that have the difficult language, and Eridians that have, well, MIDI BASIC.

Perfect Pitch

Let’s return to the Eridian sound system. Pure tones, up to five at once. This is already vastly simpler than the range of sounds a human vocal tract can make, but it gets stranger the closer you look.


First, they use octaves. This isn’t wholly crazy; as Weir points out, some animals besides humans treat the octave as a psychologically salient grouping. But many — birds among them — do not, tending to hear absolute pitch rather than octave equivalence. So it’s a little odd that Eridians and humans happen to share this particular intuition, and it would be nice to have a reason.


Second, they use only one octave semantically. They can sing higher or lower, but those registers carry only emotion, never lexical meaning. Which is to say: Eridian uses pitch-height suprasegmentally, for affect and pragmatics, while reserving the within-octave chords for the actual words. That’s a nice parallel to how human language divides labor between segments and prosody. It’s also a remarkable coincidence, and Weir uses it without seeming to notice.


Third, and this is the one that should set off alarms: Grace transcribes Rocky’s speech with off-the-shelf MIDI. For that to work, Rocky’s pitches have to land on the human grid.


Now, MIDI assumes twelve pitch classes per octave, spaced in equal temperament, anchored to a reference pitch of A = 440 Hz. So the coincidence isn’t just “Eridians are musical.” It’s that an independently evolved species, on a different world, arrived at (1) twelve divisions of the octave, (2) equal spacing between them, and (3) our concert-pitch reference, closely enough that a synthesizer built for human keyboards can read them off without complaint.
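
To make the grid concrete: in the conventional twelve-tone equal-tempered tuning with A4 = 440 Hz (MIDI note 69), the frequency of note n is 440 · 2^((n − 69)/12). The sketch below, with hypothetical function names, computes that grid and measures how far an arbitrary pitch falls from the nearest grid line, in cents (hundredths of a semitone). Nothing here comes from the book; it only illustrates what landing on the human grid would require.

```python
import math

A4_HZ = 440.0   # conventional concert-pitch reference
A4_MIDI = 69    # MIDI note number conventionally assigned to A4

def midi_to_hz(note):
    """Frequency of a MIDI note number in 12-tone equal temperament."""
    return A4_HZ * 2 ** ((note - A4_MIDI) / 12)

def nearest_midi(freq_hz):
    """Return (nearest MIDI note, offset from it in cents) for a frequency in Hz."""
    exact = A4_MIDI + 12 * math.log2(freq_hz / A4_HZ)
    note = round(exact)
    return note, 100 * (exact - note)

print(midi_to_hz(69))          # 440.0
print(midi_to_hz(81))          # 880.0, one octave up
note, cents = nearest_midi(450.0)
print(note, round(cents, 1))   # 69 38.9
```

A tone at 450 Hz sits about 39 cents sharp of A4, which is audibly between keys. A chord built from pitches like that would not map cleanly onto any key of a human keyboard.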


Take any one of those three and the odds are already absurd. Take the middle one and it’s worse, because equal temperament isn’t a natural fact about sound — it’s a human compromise. Dividing the octave into twelve mathematically identical steps puts every interval except the octave slightly out of tune, and we accept that cost for one specific payoff: the freedom to change key, to modulate, without retuning. It allows all twelve keys to be played on one fixed instrument. It’s great, but the Eridians, confining themselves to a single octave and never modulating anywhere, simply do not have that problem.
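
The size of that compromise is easy to compute. The sketch below compares a few equal-tempered intervals with the simple whole-number ratios they approximate. The just-intonation ratios (3:2, 4:3, 5:4, 6:5) are the standard ones, but they are my choice of comparison, not something taken from the book, and the function names are hypothetical.

```python
import math

# (name, semitones in 12-tone equal temperament, just-intonation ratio)
INTERVALS = [
    ("perfect fifth", 7, 3 / 2),
    ("perfect fourth", 5, 4 / 3),
    ("major third", 4, 5 / 4),
    ("minor third", 3, 6 / 5),
    ("octave", 12, 2 / 1),
]

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

def tempering_error(semitones, just_ratio):
    """Equal-tempered interval size minus just interval size, in cents."""
    return 100 * semitones - cents(just_ratio)

for name, semitones, ratio in INTERVALS:
    print(f"{name:15s} {tempering_error(semitones, ratio):+6.2f} cents")
```

The fifth and fourth come out about 2 cents off, the major third about 14 cents sharp, the minor third about 16 cents flat, and only the octave is exact.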


Weir has given them our music, including the parts that only make sense as fixes for human problems.


And what does that allow for the vocabulary? Twelve pitches with chords of one to five tones give Eridian roughly sixteen hundred possible syllables. That’s not unreasonably small; it dwarfs Hawaiian’s inventory, though it’s well short of English’s. But recall what we said about that eidetic memory: potentially millions of words, a fat tail of proper names for individual objects. Sixteen hundred syllables cannot label millions of words without making most of those words long, many syllables each, and many of them almost identical, differing by a tone or two. So: long words, packed with fine distinctions, sung fast, in a system that may not even resolve fine distinctions cleanly.
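
The syllable count is a straightforward combinatorial sum, assuming (as the paragraph above does) that a syllable is any set of one to five distinct pitches drawn from twelve, with tone order and repetition irrelevant. The function name is hypothetical.

```python
from math import comb

PITCHES = 12    # pitch classes in the single semantic octave
MAX_TONES = 5   # most tones sounded at once in one syllable

def syllable_count(pitches=PITCHES, max_tones=MAX_TONES):
    """Number of distinct chords using 1..max_tones of the available pitches."""
    return sum(comb(pitches, k) for k in range(1, max_tones + 1))

print(syllable_count())  # 1585  (12 + 66 + 220 + 495 + 792)
```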


The lexicon Weir’s premise demands and the phonology Weir actually wrote are quietly at war, and he never stages the battle.

Adult Swim

But none of that matters, really, because Rocky couldn’t hear Grace at all.


Go to a swimming pool. Pick a crowded one, with kids screaming everywhere. Now duck your head underwater. Do the voices get louder? Quite the opposite: everything is muffled and peaceful. Ah, calm tranquility! Now imagine that the pool has a sheet of hard plastic over the surface. You will now drown, but you will do so in blessed silence.


This is what is happening with Rocky and Grace. Rocky’s atmosphere is 29 times as thick as Earth’s at sea level, while Grace’s ship runs at less than half an atmosphere. The mismatch between their two media is enormous, far larger even than the pool surface suggests. That difference in density creates a barrier that will block almost all sound transmission. On top of that, there is an extremely strong and rigid xenonite wall between them, which prevents Rocky’s atmosphere from exploding outward and instantly deep-frying Grace in superheated ammonia.


It’s not a matter of how well Rocky can hear. Obviously he can hear very well. But the medium of Earth’s air is so thin that its molecules literally cannot hammer hard enough against the xenonite, much less the ammonia, to carry any information at all through to Rocky’s ears. Meanwhile, Rocky’s voice will pound against the xenonite with great force, making his songs boom into Grace’s ears like deep otherworldly whalesong.

If Grace shouts, or perhaps places his hand against the xenonite when he speaks, Rocky might pick up some of his voice. But Rocky certainly can’t “see” him well, and simply would not be able to make out how many fingers he’s holding up.


Now, Grace is careful to note that Rocky doesn’t use echolocation (emitting high-pitched sounds, like bats), but instead employs passive sonar, picking up objects by listening to sounds emitted or scattered already in the environment. If Rocky did use echolocation, emitting extremely high-pitched, high-energy sounds that would carry well through his atmosphere, the xenonite barrier, Earth’s atmosphere, and back, he would have a better chance of “seeing” Grace. But regular sounds? No chance.

I’m afraid I can’t do that, Dave

Almost as jarring as hearing Rocky speak was hearing Grace’s computer speak.


First off, it’s very unclear when exactly this book is supposed to take place. Some things suggest it’s in the future (he mentions “space stations” with “robots” at one point), and others suggest it’s in the present (the ISS is the only space station actually mentioned) or even the past (somehow Excel is the best spreadsheet program, and SpaceX is never mentioned as a viable spacecraft option?).


So let’s suppose it’s set in 2020, when the book was likely written. Instead of COVID (which was clearly never a thing, given the fact that Grace has to be flown supersonically around the world to go to meetings, instead of using videoconferencing) we have astrophage. Fair enough.


Then what’s going on with the computer’s voice systems? I was on Amazon’s Alexa team between 2013 and 2018. I would have been embarrassed to ship such a bare-bones, poorly designed, awkward speech interface. It doesn’t understand Grace’s questions half the time, and when it doesn’t understand, it just repeats itself instead of saying “I don’t understand.” It can offer very little actual information, and it explains things not at all.


You could say, well, they were cutting corners. Fine. But the ship has essentially unlimited memory and computing power. And the voice interface system is the primary, indeed the only shown, interface with the coma / medical system. It is essential that this system be powerful, flexible, informative, and accurate.


The Hail Mary carries computing and storage on a scale Grace marvels at elsewhere in the book: effectively the sum of human knowledge, packed in for the voyage. It was built over years, by a mobilized planet, for the most important mission in history. And the conversational interface it gives its lone survivor is dumber than the voice assistant that was, at that very moment, sitting on Weir’s own desk telling him the weather.

I don’t know why Weir made this choice. But it’s just another case of breaking the contract with the reader.

Contractual Obligations

I keep coming back to Arrival, because it’s a good mirror for what went wrong here.

Arrival takes a linguistic liberty far wilder than anything in Project Hail Mary. It asks you to believe that learning an alien language unlatches your perception of time itself. This is Sapir-Whorf cranked past any defensible reading of the evidence. Sure, the language you use influences how you think, a little, at the margins. It cannot unlock whole new realms of time and space. (As far as we know.)


But that’s fine! And the reason is the contract with the reader. Arrival never pretends its big idea is rigorous. What it takes seriously is the work: the patient, fumbling, thrilling labor of building shared reference from nothing, of discovering that you can’t even ask “what is your purpose” until you’ve co-invented the words for “you,” “purpose,” and “question.” It signs a contract that says: first contact is a mountain, and watching someone climb it is the story. And it honors that contract in every scene.


Project Hail Mary signs the opposite contract: the science here is real, my narrator checks everything. And then it jogs over the mountain in a week. “After a week of honing our language skills, Rocky and I are ready to have a real conversation.” A week! The thing Arrival treats as the labor of a lifetime, Weir treats as a montage. That sentence is the whole problem in eighteen words: not that the linguistics is hard and he got it wrong, but that he never noticed it was hard at all.


But did he not notice? Or did he choose not to notice?


There are places where he lavishes his obvious, infectious delight in getting-it-right on orbital mechanics and relativity and the physics of spin. He spends pages on these things. But even in these cases, there are odd mistakes, oversights, and misrepresentations. These are cases where Weir ought to care enough to get it right, and apparently doesn’t. Which star is actually closest to the Earth? Is there carbon on the sun? How cold was the last ice age? How thick does a hull have to be to survive near-light-speed travel? And even more telling: how does the scientific community work? How do societies react to potential disaster? How do people grieve?


There’s a stranger, better novel standing right behind this one, the version where the care got spread evenly. I’d have loved that book. But there are cracks in the foundation, and they run through every wall of the house — the physics, the biology, the economics, the politics, the basic question of how anyone in this world makes a decision. So in the next post, let’s tour the rest of it.

  1. Would Star Wars be better if its linguistics were rigorous? Of course! It would be a much more immersive and self-consistent world. The names wouldn’t sound like they were trying to be Spoonerisms and failing. Or Sporkerisms, as Alison suggested.