Hearing It Like I'm Seeing It

Photo by Volodymyr Hryshchenko on Unsplash

Imagine that it is 11:30 a.m., almost lunchtime. You are in a meeting with 20 other colleagues and your boss, and your stomach is grumbling. Your colleague sitting opposite you mouths the sentence, “What do you want for lunch?” while pointing to her watch. You mouth back, “Pizza ok?”. Your colleague nods.

Back at home, you are with your child. It is 7:30 p.m. Your child is watching the newest episode of Blue’s Clues on the television at full volume. He looks at you and you ask, “What do you want for dinner?”. Because your voice is drowned out by the television, your child looks at you blankly, not knowing what you said.

Why does this happen? How did your colleague know exactly what you said just from your mouthing it, but not your child? The simple answer is multisensory (visual-speech) processing.

Wait, did you say pea or pee? In essence, visual-speech processing involves integrating what you hear with what you see at the same time. A relatively large field of research stemming from the “McGurk Effect”*, it helps explain why you sometimes mishear a word based on how someone mouths it at the same time. Seeing the movement of the lips and mouth helps shape what you hear.

Developmentally, there are differences in the way you and your child integrate such information. Visual cues can help at several levels, such as:

  1. Telling you when the word starts (when did the mouth start to open?)
  2. Telling you what sounds the person is trying to make (did the lips round to make an “oo” sound?)
  3. Telling you the actual word that the speaker said based on what you expect (did my colleague say pairs of sunglasses or pears of sunglasses?)

This integration happens right in your brain. In a recent research study, two groups of children (aged 8-9 and 11-12) were compared with adults (aged 18-37) on an audiovisual task, known as a Speech-in-Noise perception task, to investigate what happens in the brain when matching mouth shapes and sounds together.

An example of a trial in the Speech-in-Noise perception task

What they found was quite interesting. Firstly, the younger children performed slightly worse and reacted more slowly than the older children and adults. Secondly, the brain waves linked to this integration between mouth shapes and the sounds heard (called the N400) contributed to the higher accuracy in adults. The older children also performed well, but the brain waves that contributed to their performance were different (the late positive complex, or LPC).

Wait… so will my child even know what I’m saying? Not to fret. Your child knows what you’re saying; they just use a different mechanism to do so. Compared to adults, children need to watch the entire word being articulated on the face more carefully to properly match the word they hear. The findings suggest that these brain mechanisms mature over time, so that as children grow older, they can match what they see to what they hear more efficiently. As time goes on, children also get better at predicting what you (or their friends) are trying to say when you need to be quiet and can only mouth your sentences to them.

This post was written by our intern Cameron and edited by our research fellow Rui Qi.

Reference: 

Kaganovich, N., & Ancel, E. (2019). Different neural processes underlie visual speech perception in school-age children and adults: An event-related potentials study. Journal of Experimental Child Psychology, 184, 98-122. https://doi.org/10.1016/j.jecp.2019.03.009

You might also be interested in this:
*About the McGurk Effect: McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264(5588), 746-748. https://doi.org/10.1038/264746a0