Want to really understand how other people are feeling? Close your eyes and listen.

That’s the takeaway of a new study published in American Psychologist that explored the empathic accuracy of various forms of communication. The results are some of the first to demonstrate that the primary way we convey emotions may be through the voice – not facial expressions or body language, as previously thought.

“Humans are actually remarkably good at using many of their senses for conveying emotions, but emotion research historically is focused almost exclusively on the facial expressions,” said Michael Kraus, a social psychologist at Yale University and author of the study, to The Guardian.

The paper detailed several experiments. In the first, researchers asked online participants to view videos showing a group of friends teasing each other about a nickname. Participants were presented with the scene in one of three ways – audio only, audio and video, or video only – and were then asked to interpret what the friends were feeling by rating emotions like amusement, embarrassment, or happiness on a scale of 0 to 8. Surprisingly, those who only heard the interaction – but didn’t watch the video – were best able to interpret the emotions of the scene.


Another study involved undergraduate students gathering in a room to discuss their favorite TV shows, movies, food and beverages. One group had the conversation in a lighted room, the other in a darkened room. Similar to the first experiment, the people whose vision was limited by the darkness more accurately interpreted the emotions of others.

Finally, the researchers took audio from the first experiment in which friends were teasing each other and had participants listen to one of two versions: the actual dialog from the friends, or a computerized voice reading the exact same words. Although you might expect to glean a similar amount of emotional information from the words alone, participants who interpreted the scene by listening to the digital voice fared far worse at interpreting emotions.

“The difference between emotional information in voice-only communication by a computer versus a human voice was the largest across all studies,” Kraus said to Yale Insights. “It’s really how you speak—not just what you say—that matters for conveying emotion.”

It seems intuitive that more information – both audio and visual – would better equip you to read the minds of other people, but the opposite seems true.


One explanation has to do with the limits of our cognitive power. When we’re taking in complex audio and visual input, it takes our brains more effort to process information. It’s similar to how a computer slows down when you have a bunch of different programs running simultaneously. Visual information is particularly costly to process, as Art Markman notes for Psychology Today:

Quite a bit of the brain is taken up with understanding what is going on in our sensory world. For example, if you clasp your hands behind your head, most of the area taken up by your hands reflects the amount of the brain that is devoted to making sense of the information coming in through your eyes.

These same brain regions are also responsible for recalling visual memory. And that could explain why people tend to shut their eyes when trying to recall details, or solve complex tasks in general. A 2011 paper published in the journal Memory & Cognition illustrates this idea quite nicely.

For the study, participants were instructed to watch a bit of a TV show, and later were asked to recall details about what occurred in the episode. The researchers separated participants into four groups, asking each to recall the show while they either: stared at a blank computer screen, closed their eyes, watched a computer screen as it randomly displayed nonsense images, or stared at a blank computer screen while they heard spoken words in a strange language.

As in the recent study, the groups that received the least visual information – those that closed their eyes or stared at a blank computer screen – performed best. Interestingly, the group that stared at the screen displaying weird images fared worst at recalling visual details, while the group that heard random bits of a strange language did worst at recalling audio details from the show.

The other possible explanation has darker implications. People have a natural tendency to disguise their emotions, whether that means something as benign as forcing a smile when they’re feeling down at work or something as malicious as trying to manipulate someone into a shady business deal. Because our voices seem to be the primary way we communicate our emotions, the addition of visual cues like body language and facial expressions gives people a whole toolset for disguising their true emotions – a deliberately thoughtful tilt of the head, a raise of the eyebrows, or any of those body-language hacks written up in countless articles ever since that one TED talk.

Either way, the researchers suggest people pay more attention to what others are saying and how they’re saying it.

“There’s an opportunity here to boost your listening skills to work more effectively across cultures and demographic characteristics,” Kraus said. “Understanding other people’s intentions is foundational to success in the global and diverse business environment that characterizes both the present and the future.”