
I telecommute every Wednesday and attend a lot of meetings in a global company virtually. If every attendee is telecommuting, it works well. If some are physically present, sotto voce comments are my nemesis. Folks in the room can hear them just fine, and frequently they are hilarious (judging by the laughter). For the distant attendee, the microphones do not work well enough yet. Once that is tidied up, I can imagine much more adoption of no-office corporations.


I'm not sure what the problem with microphones / voice transmission is in 2019.

Using Zoom and other such platforms (across different companies, computers, meeting rooms) the result is almost always a distorted, low bit, overly compressed sound. Plus crazy 2-5 sec latency in some cases.

One would expect this to be solved in an age when we stream 4K movies...


As someone who worked on videoconferencing for a while, there are four answers: multiple microphones, echo cancellation, background noise, and network issues.

If everyone in a conference room wore individual headsets it would be great. But when you have 3 different mics along the table all of which are picking up different audio signals (to make sure people at both ends can be heard), you need to subtract background noise of construction and hallway chit-chat and people typing on their laptops, and also subtract everything coming out of the speakers (which has different delays in each microphone and is distorted by both the speaker response and microphone response)... and then it's 2:30 and your office upload link is saturated because everyone's starting a new remote meeting uploading HD video simultaneously?

That's why.

But the main culprits behind the "distorted, overly compressed" sound are echo cancellation and then noise cancellation. It's insanely hard. It's actually not low-bit at all, it just winds up sounding like that in the end. (Play with a noise removal filter in Audacity and you'll realize it will produce similar-sounding audio.)
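To make the "subtract everything coming out of the speakers" step concrete, here is a minimal sketch using a normalized LMS (NLMS) adaptive filter, a textbook approach to echo cancellation. It is not taken from any particular product. The function name, the tap count, the step size, and the two-tap "room" are all assumptions made up for illustration.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return mic minus an adaptively estimated echo of far_end (NLMS)."""
    weights = [0.0] * taps   # current estimate of the speaker-to-mic echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate                       # what is left after subtraction
        norm = sum(h * h for h in history) + eps    # input power, for normalization
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual

def energy(samples):
    return sum(s * s for s in samples)

# Toy demo with a hypothetical room: the mic hears only the speaker signal,
# delayed by 2 and 3 samples and attenuated. Nobody is talking in the room.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    0.6 * (far_end[n - 2] if n >= 2 else 0.0)
    + 0.3 * (far_end[n - 3] if n >= 3 else 0.0)
    for n in range(len(far_end))
]
residual = nlms_echo_cancel(far_end, mic)
print("echo energy, last 200 samples:    ", energy(mic[-200:]))
print("residual energy, last 200 samples:", energy(residual[-200:]))
```

In this idealized case the residual shrinks to almost nothing once the filter converges. A real room is much harder: the echo path is hundreds of milliseconds long (thousands of taps), it changes when people move, and the filter has to avoid adapting while someone in the room is talking.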


The true solution to remote conferencing is, everybody stays at their desk. A group in a conference room loses so much for everybody - others have trouble with who's talking, faces are tiny and far away, noise cancellation in a big room is hellacious.

Stay at your desk and join the meeting. Everybody can share a document, look at any document share whenever they like, look straight at the camera with good focus, sound wonderful with a headset mic. Everybody knows who's talking instantly, with their name associated with the voice and face.

The 'conference room' idea is a terrible one, and should quietly die.


Everyone always talks about videoconferencing and cameras but do you really need to see faces?

At my company we just dropped the cameras. We can still share screen if needed but we didn't gain much of value from seeing faces and the occasional bandwidth issues mostly vanished. I don't think anyone misses the video.


> do you really need to see faces?

The research I've seen points to a huge, unambiguous, flashing YES.

The fact is that a majority of communication is nonverbal and emotional. And this is just as true in business meetings as it is in your personal relationships.

Video allows us to better understand when someone is done talking and we can speak without interrupting. It allows us to better understand if someone is being intentionally disrespectful or merely clueless. It lets us see if someone is deeply concerned or merely mildly interested. It lets us see if someone is being silent because they're zoning out or because they're furious but resigned.

All these cues (and hundreds more) allow conversations to 1) proceed far more smoothly and efficiently, packing more productivity into a single meeting, and 2) avoid misunderstandings which can both affect the material outcomes of meetings and damage interpersonal relationships.

Many people think video is unimportant, but that's because when it's lacking, we make assumptions about people on the other end for lack of evidence -- assumptions that can turn out to be wildly untrue. Because we have no immediate evidence to the contrary, we assume we're not missing anything. This is why research is so important -- it shows that plenty of normal human communication tasks often perform far better with video, and more frequently fail without it.


The cues only work if they're delivered in a timely manner, though. I would argue highly compressed and delayed video is actually worse and more frustrating than having no video at all.

This is one of the few arguments against remote work I actually agree with: streaming video technology still isn't good enough to replace in-person haptic communication.


As it's done now, yes, streaming video is problematic. I worked at Sococo, and back when we made our own solution we had good low-latency video. Careful bandwidth control and stream-sharing did a lot to make it more useful.


In your opinion, is there still space for a competing low-latency video teleconferencing application in the market? None of the apps I've tried get video right. I haven't heard of Sococo before, though. I'll definitely check them out.


Maybe, for some people, but definitely not me. Maybe I'm just weird, but I hate video conferencing so much. It makes me feel uneasy and watched. When I'm in audio-only mode I can express myself much better. And this is not just a "getting used to it" thing. I've been doing it for more than a year now.

I think video should be opt-in, but unfortunately I'm forced to turn on the camera at work.


If you know everyone well, sure. But often videoconferencing in larger institutions involves at least some strangers.


Does it really matter?

I've had call conferences with people I've never seen, and it really hasn't been a problem.


With a group in a conference room, it can be hard to tell who's talking, because you don't know their voices. Or there are several people, and you don't know which new voice is whose.


Oh yeah, conference rooms are annoying (for multiple reasons). We don't really do that, everyone participates using their own device, and the tools then show who's speaking.


... so? I work in an org that's spread across the US. I work with people on the opposite side of the country who I've never met in person and I have no idea what they look like. I also don't care.


You don't care who's talking?


I don't care what they look like. After talking to them once or twice on the phone, they're completely recognizable by voice alone, even on our terribly antiquated conferencing system.


I had a guy I've worked with for the past year finally put a profile picture on his account, and I saw his face for the first time. Was a bit of a shock, since it wasn't what I pictured from his voice. But it's no big deal that I pictured someone who looked a bit different.

I barely think about what people look like now at my job, even the people I used to work with every day in person (when we still had an office); they're mostly just distinctive voices to me now.


If the video works well, I think it can be useful, especially when there's only a few people per camera -- watching people's (well synchronized) faces while they speak helps with comprehension, and watching faces while you speak helps confirm if what you said was understood. If the video isn't working well, it's definitely distracting and not very useful.


I'm honestly flabbergasted that so many people on HN take video as a given. I don't even have a webcam at work, and I have no desire at all to get one.


Also, some of the conferencing tools take it for granted. My work laptop has a camera, and it always activates by default when we start a call, and I have to manually turn the video feed off every time. Browser asks for permission to use the camera, but if I deny it, the application refuses to work at all!


Except that then I'm surrounded at my desk trying to concentrate and work while everyone's talking on their headsets around me. Ugh.

No: if you want to be part of a meeting, everybody should be going to conference rooms or phone booths.


Everyone should have an office. If you go remote, that kinda becomes affordable.


No kinda about it. The cost savings is huge. I'm cheering for the bean counters to overrule the PHBs on this one.


A lot of people spend a large chunk of their days on the phone. Having to find a room for every call isn't practical. Companies should make an effort to separate roles that involve lots of phone time from roles that tend to value quiet. But unless a company is giving everyone an office it's probably not reasonable to expect people not to be talking around you.


One of the teams I work with has a rule that everyone is in a conference room together or everyone is on their own laptop.

It's a pretty good rule though it's probably harder to convince people to follow it if it's a single person who is somewhere else.


It would seem that much of this problem could be solved with microphone arrays and software. Bandwidth isn't that much of a limitation, but latency is.

I'm not sure of the state of the art in 3d positioned sound in a data format, but you could do that pretty easily.

I thought of a startup back in 2016 for selling auto-transcripting, voice-signature tracking, meeting logging microphone arrays for businesses. Never had the capital to do it.


Beamforming is actually already implemented in commercial products (both for microphones and speakers). You can look at the phones from a company called Invoxia. I know voice-signature tracking was also in R&D a few years ago, so it might be available too.


> much of this problem could be solved with microphone arrays and software

That's how it's done already. Beamforming mics have been around for a few years. They're definitely better, but still far from perfect. Because signal processing just isn't perfect. Incredibly smart engineers are doing their best on this stuff.
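For anyone wondering what beamforming means in practice, here is a minimal sketch of the simplest variant, delay-and-sum, assuming the per-microphone arrival delays are already known and are whole numbers of samples. The function name and the toy signal are made up for illustration. Commercial arrays have to estimate the delays and use more elaborate filters.

```python
def delay_and_sum(channels, delays):
    """Average the channels after advancing each one by its sample delay.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   how many samples later the talker arrives at each microphone.
    """
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]

# Toy demo: the same talker reaches three mics 0, 2 and 4 samples late.
source = [0, 1, 0, -1, 0, 1, 0, -1]
channels = [[0] * d + source for d in (0, 2, 4)]

steered = delay_and_sum(channels, [0, 2, 4])    # aimed at the talker
unsteered = delay_and_sum(channels, [0, 0, 0])  # no alignment at all

print("steered peak:  ", max(abs(s) for s in steered))
print("unsteered peak:", max(abs(s) for s in unsteered))
```

When the delays match the talker, the copies line up and add coherently. Sound arriving from other directions is misaligned and partly cancels, which is where the directional gain comes from. It is also why the result is only as good as the delay estimates and the room allow.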


Was just in a Zoom call that covered US, Europe and Japan. Didn't notice any delays and sound was clear. We were using screen sharing but not video.


This is absolutely incredible, unless it was a very presentational meeting. I can't remember ever having an overseas conversation without there being perceptible delay, regardless of transmission medium.

You need under 20ms latency to be truly imperceptible, and probably under 150ms to avoid scenarios where people are often talking over each other. I wonder what their routes look like to make this possible?


It's pretty much not possible. Their perception of delay is just not that sensitive.

A transcontinental link might just be fast enough for 150ms, but stack on that all the buffering, codec delay, etc. and it's nigh impossible to have imperceptible delay.
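A rough back-of-the-envelope version of that budget, as a sketch. Every number here is an assumption chosen for illustration: light in fiber at about two thirds of c, a hypothetical 10,000 km cable path, and made-up codec, jitter-buffer and processing delays. None of it is measured from any real service.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3  # assumption: light in fiber travels at roughly 2/3 c

def propagation_ms(distance_km):
    """One-way propagation delay over fiber, ignoring routing detours."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR) * 1000

def one_way_latency_ms(distance_km, codec_frame_ms, jitter_buffer_ms, processing_ms):
    """Propagation plus the delays added at the endpoints."""
    return (
        propagation_ms(distance_km)
        + codec_frame_ms
        + jitter_buffer_ms
        + processing_ms
    )

# Hypothetical figures: 10,000 km path, 20 ms codec frames,
# 60 ms jitter buffer, 30 ms of capture/playout/processing overhead.
wire = propagation_ms(10_000)
total = one_way_latency_ms(10_000, 20, 60, 30)
print(f"propagation alone: {wire:.0f} ms")
print(f"one-way total:     {total:.0f} ms")
```

With these assumed numbers the wire accounts for only about a third of the total, and the endpoints push the one-way figure past 150ms. Real paths are longer than the great-circle distance, so the propagation term is if anything optimistic.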


I'm not saying that there were no delays, just that they didn't stand out as interfering with the discussion.

This is a group that has worked together for a long time, 30 years for some of us. I'm guessing there is an element of knowing each other's speech patterns from talking face-to-face that will help with knowing when to contribute.


bandwidth != latency


Yes, in particular, you can't just buffer ahead with a voice call, which is something you can do with a Netflix stream.


They can suffer orthogonally, but when it comes to realtime, how much bandwidth you have compared to the data you want to carry absolutely matters for the latency you'll see.

If a 10K "hello" voice sample (that takes 2 secs to be spoken) needs to pass through a 1Kbps bandwidth line, then it would have lotsa latency.
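To put numbers on that example, here is a small sketch, assuming "10K" means 10,000 bytes and "1Kbps" means 1,000 bits per second (the original units are ambiguous, so both readings are assumptions).

```python
def serialization_delay_s(payload_bytes, link_bps):
    """Seconds needed just to push payload_bytes onto a link of link_bps bits/s."""
    return payload_bytes * 8 / link_bps

def required_bitrate_bps(payload_bytes, duration_s):
    """Link rate needed to send the payload as fast as it is being spoken."""
    return payload_bytes * 8 / duration_s

# Assumed reading of the example: 10,000 bytes of audio, 2 s of speech, 1,000 bit/s link.
delay = serialization_delay_s(10_000, 1_000)
needed = required_bitrate_bps(10_000, 2.0)
print(delay)   # 80.0 seconds to send 2 seconds of speech
print(needed)  # 40000.0 bit/s needed to keep up in real time
```

Once the link is faster than the stream's bitrate, extra bandwidth stops helping: the remaining delay comes from propagation, buffering and processing, not from the size of the pipe.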

In any case, we stream 4K movies with not much buffering (as evidenced by the ease of skipping ahead at random points). One would expect better audio latency based on that.

And we also stream 4K video games as video streams (or Google claims so).


To be fair, latency isn't a problem at all for streaming 4K movies.


The funny part is that mic-ing a room is a solved problem. It just doesn't look invisible enough for "enterprise" deployments.


Can you elaborate on that?


A typical boardroom running remote video chats in any corp. building I've been in usually gets outfitted by Cisco or someone, and they want something like this that ties everything together in a nice, neat little package with terrible microphones:

https://external-content.duckduckgo.com/iu/?u=http%3A%2F%2Fw...

A lot of pro-audio companies also make boardroom devices but there are huge compromises because of the form factor—usually companies want their audio to be as invisible as possible.

That and acoustic treatments. So many "teleconference" rooms are half glass. That's going to add extra noise. Hell, for a 20-person meeting you might do a lot better with a few SM57s slung from the ceiling or standing on the desk and a few baffles.

Pitching that would probably be disappointing, though.


It's a no-sell because in-person offices are still the main push and people want to be able to enjoy the slick meeting room thing. If remote work was pushed to the fore / was the main effort, I think they'd care less.


Big giant plus one on this whole comment



