Probably the most brilliant contributor to the art of sound design is a guy called Walter Murch. He's something of a polymath in film, as he not only designed the sound for The Godfather trilogy and Apocalypse Now, but he also edited the picture.
Anyway, Murch put a lot of his thoughts about both editing and sound into a book called 'In the Blink of an Eye', which is still read by film students today. One passage that always stuck with me is an informal rule that Murch gave himself after he discovered something odd about the interplay between sound and audience immersion. After a lot of experimentation, Murch found that in any 'moment' of a film (say 3-7 seconds), an audience can only process 2-3 layers of sound playing at once.
For example, Michael Corleone goes to meet Moe Greene and they're walking through the lobby. We hear footsteps, the elevator ding, and the atmos of the hotel. If Murch had added the sound of the bell-boys' luggage or some guests arguing in the rooms, it would have been too overwhelming for us and the verisimilitude of the film would have been compromised.
I guess I'm mentioning Murch because this informal sound rule was always independent of speech from the actors, which he doesn't treat as a layer of sound. To me, this may be a very practical example of what these researchers seem to be finding: that there is indeed some parallel processing going on with speech and sound in our auditory cortex.
I'm often curious about the audio approach used in a bunch of 1960s films where multiple people are talking over each other. It's very deliberate, but I don't know the terminology. The Graduate is an example, I think in the party scenes near the beginning. In that example, it seems intended to convey the feeling of being overwhelmed. But there are films and TV shows where it's less about being overwhelmed and more about an immersive experience, to conjure a bustling environment like being at a busy family meal. I think M*A*S*H did it occasionally in the surgery scenes.
I always find it very noticeable when I see it. There's something jarring about it; the experience is somehow very different from a real multiparty conversation, possibly because it's impossible (for me) to focus my attention on any one speaker. That may be deliberate, of course, although often these scenes include key information in the overlapping dialogue.
> A 2006 study demonstrated that subitizing and counting are not restricted to visual perception, but also extend to tactile perception, when observers had to name the number of stimulated fingertips.[7] A 2008 study also demonstrated subitizing and counting in auditory perception.[8] Even though the existence of subitizing in tactile perception has been questioned,[9] this effect has been replicated many times and can be therefore considered as robust.[10][11][12] The subitizing effect has also been obtained in tactile perception with congenitally blind adults.[13] Together, these findings support the idea that subitizing is a general perceptual mechanism extending to auditory and tactile processing.
It has a pretty good graphics interpretation engine too. But it's not only the brain that's awesome. The cameras of the human body are also of quite good quality. I can't quite understand yet what the sensors on the tongue are made of. Nevertheless, it really makes this whole "existence on earth" thing feel quite real!
Sometimes I have to take a step back and not feel like I'm reading into hardware specs when I see titles like these. But then it becomes so fun to pretend that this is all just a hardware sleeve for our own spirits/souls [1].
When ambient sounds are too loud, I can't understand what's being said, while I can see that people around me are communicating easily. In the same way, lyrics are very hard for me to get.
Someday this research might help people with the same issues.
I started to have the same issue about 10 years ago. Suddenly bars and crowded places became far less enjoyable for me because I had such a hard time keeping up with conversations. I had my hearing checked and while it wasn't perfect, it was just normal for my age.
I remember reading that it had to do with how I processed language and sounds, and that there was no real cure yet as of a few years ago.
The last couple of years have made it far less of a problem, but as we start to return to normal (eventually) I'd love to figure it out. I tend to feel bad sitting through a conversation where I'm only hearing about a third of what others are saying.
Last year I became aware of how much difficulty I have processing speech. I think this is something I have always had. I could never understand lyrics in music, have trouble following discussions in meetings, and struggle understanding people with accents. I wonder if I have learned to just respond instinctively to “vibes” instead of actually properly processing the content of what is being said.
I suspect it is largely anxiety related. I’ve had moments where lyrics are crystal clear, after smoking weed, so I don’t think it is a hearing thing.
Not a doctor, but you can take audio/visual processing tests to get a better idea of what's going on. I took one during a barrage of ADHD related tests, it was interesting to say the least.
It could very much have an anxiety or ADHD element to it.
I feel like I'd been ok, previously. I've spent a lot of time in loud places for most of my life. It's definitely possible that I simply didn't recognize the issue.
Interesting point about music. I'm terrible at remembering the words to songs, but just fine with melody and rhythm.
>> There is no cure for APD but there are things that can help.
I'm fairly sure I have this on top of regular hearing impairment. I've actually been through the NHS process and seen the graph that shows dramatic rolloff in my hearing above about 6kHz, and have a pair of NHS hearing aids.
I've not worn them since the start of the pandemic because I've not been in a crowded space with lots of people talking.
(Interestingly, because videoconferencing technology copes badly with multiple speakers and very rarely has proper working spatial audio, suddenly people with normal hearing can't disambiguate speakers either and everyone ends up rigidly taking turns)
Cookie bite hearing loss is more or less fixable, not CAPD. If you have both (which OP sounded like they did, or at least sounded like it was worth investigating) then you can fix it up a lot with targeted open dome hearing aids.
I recall as a kid that I might hear a pop song on the radio for years without even -noticing- the lyrics (or the artist for that matter). It seems that, while I hear it all, what I'm listening to - is what I'm most attracted to. (I usually did listen to the funny lyrics of novelty singles - often backed by pretty silly music.)
I suspect that, for most of us, unless we get consciously involved, what we're most attracted to is the most pleasing or exciting or interesting 'channel'. (In the case of pop music, maybe we're more enchanted by the singer - or the 'message' - and the music be damned.)
In much non-vocal, 'purely' orchestral music, the melody is the attention getter, and all the other notes might be treated as 'just the support group'. Some conductors work hard to bring out the best parts of that support. (A lot of weaker melodies are strongly dependent on how they're harmonized.)
Even more: speech in one's native language is processed differently from speech in non-native languages.
I was a participant in an fMRI study that played German and Polish fricatives (s and sh sounds) and measured the responses of different parts of the brain: https://elib.uni-stuttgart.de/handle/11682/2612
When a native German speaker listens to German fricatives, the same parts of the brain flare up as when a native Polish speaker listens to Polish fricatives.
But when the other language is played, different parts of the brain get active.
Don't attach too much weight to this. First, it's just two phonemes, and very similar ones at that. Second, many different results have been obtained in similar studies. Third, "differently" is a rather trivial word: anything that gets associated with these sounds may or may not "light up" in an fMRI experiment. It's not even clear that the effects from that paper are relevant to language processing. In this case, it might be something as simple as familiarity. Precious little is known about language processing.
I found out about 3 years ago that I write better backwards than forwards (like Leonardo). It is partly mechanical (I'm left handed) but partly something to do with how I process information. My forwards handwriting gets really bad -- I find my backwards handwriting easier to read. I have no problem keeping my letters going in the right direction when I'm writing backwards -- I'm better at it than when writing forwards -- except for numbers.
Numbers are, for whatever reason, very hard to write in the right direction for me (honestly, forwards too), though reasoning about calculus and graphs is far easier backwards.
Anyway, I'm still trying to figure that out. I think it is that our minds are coded to pick out language, so that pattern matching isn't so hard, but less capable of decoding numbers and reasoning about them.
It's really hard to explain. Drawing graphs and keeping them straight in coordinate space is difficult until I flip the graph.
It's something about processing linear information: left to right, each unit is discrete and separated from everything else; right to left, I can reason about it as a smooth motion.
I thought the brain did everything in parallel? I guess the lesson here is that sound runs through several different processors.
In wind I sometimes hear spurious words. I suppose it's a hungry speech processor making wild associations with not much to go on.
My brother reports hearing music in noise. Not just a note or two; orchestral works just barely heard. So he has some 'music processor' running in there? He's not a musician; anything but.