Microphone volume is extremely quiet on built-in MacBook Pro microphone #239

Open
mutexlox-signal opened this issue Oct 18, 2024 · 14 comments

Comments

@mutexlox-signal
Contributor

When the default device on an MBP is the built-in microphone, and I configure cubeb with mono input from the default device, the volume drops to an imperceptibly low level (even when the sound is very loud), both for the input cubeb receives and the sound the System Settings pane registers.

Reproduction code using cubeb-rs:

//! libcubeb api/function test. Records from the microphone and plays to the speaker.
use cubeb::{Context, MonoFrame};
use std::thread;
use std::time::Duration;

const SAMPLE_FREQUENCY: u32 = 48_000;
const STREAM_FORMAT: cubeb::SampleFormat = cubeb::SampleFormat::S16NE;

type Frame = MonoFrame<i16>;

fn main() {
    let ctx = Context::init(Some(c"Cubeb recording example"), None).expect("Failed to create cubeb context");

    println!("using backend {}", ctx.backend_id());

    let params = cubeb::StreamParamsBuilder::new()
        .format(STREAM_FORMAT)
        .rate(SAMPLE_FREQUENCY)
        .channels(1)
        .layout(cubeb::ChannelLayout::MONO)
        .take();

    let mut builder = cubeb::StreamBuilder::<Frame>::new();
    builder
        .name("Cubeb recording (mono)")
        .default_output(&params)
        .default_input(&params)
        .latency(0x1000)
        .data_callback(move |input, output| {
            for (i, x) in input.iter().enumerate() {
                output[i] = *x
            }
            output.len() as isize
        })
        .state_callback(|state| {
            println!("stream {:?}", state);
        });

    let stream = builder.init(&ctx).expect("Failed to create cubeb stream");

    stream.start().unwrap();
    thread::sleep(Duration::from_millis(5000));
    stream.stop().unwrap();
}

I created a repository to simplify reproduction, at https://github.com/mutexlox-signal/test-coreaudio-rs

To reproduce:

$ git clone https://github.com/mozilla/cubeb-rs.git
$ git clone https://github.com/mutexlox-signal/test-coreaudio-rs.git
$ cd test-coreaudio-rs
# select the built-in MacBook Pro microphone as default in system settings
# Talk, observing the indicated sound levels in system settings.
$ cargo run
# Sound may be very quiet, and system settings may show very little sound input
$
# talk again after program finishes -- sound levels in system settings should
# be normal.
# select any other microphone as the default in system settings
$ cargo run
# Sound should be normal -- audible in headphones/speakers and indicated at
# normal volumes in system settings.
@mutexlox-signal
Contributor Author

(For the sake of simplicity, I have the output going to the default output device, which in this case is my headphones, so there should not be any echo cancellation effects)

@padenot
Collaborator

padenot commented Oct 21, 2024

@Pehrsons should know about this.

@mutexlox-signal, which MBP are you using here? There are lots of bugs and other weirdness that depend on the macOS version and the hardware in use, so it would help to get more info. The backend attempts to work around most bugs, but maybe there are more.

@mutexlox-signal
Contributor Author

It happens on both a Mac14,9 and a Mac15,8 (both on Sonoma, 14.6.1), but it does not happen with the older audiounit C++ backend, so I suspect it's not a new bug in the hardware?

@padenot
Collaborator

padenot commented Oct 21, 2024

I'm not sure what you're experiencing, so I'll write a few different things here in the hope of providing enough context to characterize the issue.

To make things clear, when you're running this program, is the output low, the input low, or both?

The Rust backend can use a very different API underneath than the older C++ backend, called VoiceProcessingIO (often abbreviated VPIO). VPIO is, as the name says, optimized for speech-type input/output, and allows processing the audio to make it much better for speech (echo cancellation, noise suppression, automatic gain adjustment, etc., you name it).

It's great but has lots of bugs, and here's the logic to use it or not: https://github.com/mozilla/cubeb-coreaudio-rs/blob/trailblazer/src/backend/mod.rs#L3351-L3358. It depends on the macOS version, the device hooked up, and whether VOICE is passed in. In some cases it has to be force-enabled, in others force-disabled. I'll let you read the code in that area and see what applies to you.

Additionally, we're still attempting to understand what is going on in https://bugzilla.mozilla.org/show_bug.cgi?id=1896938 (Firefox's bug tracker), but so far we haven't been successful. As a cubeb API user, you can test various things, such as passing in VOICE for your stream or other bits in the input processing API.

@Pehrsons is the person working on this but is off this week.

@mutexlox-signal
Contributor Author

To make things clear, when you're running this program, is the output low, the input low, or both?

Both, but when I select a different default device in system settings (e.g. my webcam's mic) both input and output are normal.

If I recall correctly, I saw the log:

Input device {} is on the VPIO force list because it is built in, and its volume is known to be very low without VPIO whenever VPIO is hooked up to it elsewhere

for this device.

That bugzilla bug does, indeed, seem very relevant, and perhaps even likely the same.

@mutexlox-signal
Contributor Author

Correction: other unrelated audio dips (or possibly pauses) momentarily when the stream starts and ends, but while the streams are going the output from other applications is a normal volume; it appears that it's just the input from the built-in microphone that's low

@mutexlox-signal
Contributor Author

mutexlox-signal commented Oct 21, 2024

Some arbitrary changes/experiments:

  • If the input format is Float32NE, the behavior is similar, but not identical: I hear staticky pops, but the observed input levels in System Settings remain low-to-zero.
  • If I add the VOICE preference, it makes no difference.
  • Creating the stream with STEREO and 2 channels fails with InvalidParameter.
  • Changing the latency from 0x1000 to 480 makes no difference.
  • Explicitly requesting the audiounit backend when creating the cubeb ctx does fix the problem (for S16NE).
  • Empirically, observing the maximum input level the data callback sees with audiounit-rust, I see a value of 567. With audiounit (in the same environment with similar sound levels), I see a value of 26,914. (For S16NE)
  • For Float32NE, with audiounit-rust, I see a maximum input of 32,767. With audiounit, the static is worse, and the maximum input level is 32,766.
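
For anyone wanting to repeat the peak measurement above, here is a minimal sketch of how the maximum input level could be tracked for S16NE samples, and how two peaks can be compared in decibels. The helper names `peak_level` and `level_difference_db` are illustrative and are not part of cubeb; inside a real data callback the sample values would first have to be pulled out of the input frames.

```rust
/// Largest absolute sample value in a buffer of signed 16-bit samples.
/// Returns 0 for an empty buffer. `unsigned_abs` avoids overflow on i16::MIN.
fn peak_level(samples: &[i16]) -> u16 {
    samples.iter().map(|s| s.unsigned_abs()).max().unwrap_or(0)
}

/// Level difference between two peaks in decibels: 20 * log10(reference / measured).
/// A `measured` value of 0 yields infinity, so callers may want to guard against silence.
fn level_difference_db(reference: u16, measured: u16) -> f64 {
    20.0 * (f64::from(reference) / f64::from(measured)).log10()
}

fn main() {
    // Made-up buffers whose peaks match the two values reported above.
    let quiet = [12_i16, -567, 301, -44];
    let loud = [1500_i16, -26_914, 8000, -120];

    let quiet_peak = peak_level(&quiet);
    let loud_peak = peak_level(&loud);
    println!(
        "peaks: {} vs {}, difference {:.1} dB",
        quiet_peak,
        loud_peak,
        level_difference_db(loud_peak, quiet_peak)
    );
}
```

With the two S16NE peaks reported above (567 and 26,914), the difference works out to roughly 33.5 dB.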

@mutexlox-signal
Contributor Author

mutexlox-signal commented Oct 21, 2024

Following the suggestion in comment 22 in the bugzilla thread, I tried:

    stream.set_input_processing_params(
        cubeb_core::InputProcessingParams::ECHO_CANCELLATION
            | cubeb_core::InputProcessingParams::NOISE_SUPPRESSION
            | cubeb_core::InputProcessingParams::AUTOMATIC_GAIN_CONTROL,
    );

This seems to have made the audio as captured by cubeb audible, but System Settings does still show very low levels of input, so I suspect simultaneous use of the microphone by another application would still be affected.

@mutexlox-signal
Contributor Author

And in case it helps: Logs without the input processing params and with the rust backend:

$ cargo run
   Compiling test-coreaudio-rs v0.1.0 (/Users/miriam/test-coreaudio-rs)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.12s
     Running `target/debug/test-coreaudio-rs`
using backend audiounit-rust
mod.rs:237: Using the system default device
mod.rs:237: Using the system default device
mod.rs:2857: Use global latency 512 instead of the requested latency 4096.
mod.rs:3274: Input device 98 is on the VPIO force list because it is built in, and its volume is known to be very low without VPIO whenever VPIO is hooked up to it elsewhere.
mod.rs:3292: Evaluating device pair against VPIO block list
mod.rs:3297: Input uid="BuiltInMicrophoneDevice", model_uid="Digital Mic", transport_type="bltn", source="imic", source_name="MacBook Pro Microphone", name="MacBook Pro Microphone", manufacturer="Apple Inc."
mod.rs:3297: Output uid="BlackHole2ch_UID", model_uid="BlackHole2ch_ModelUID", transport_type="virt", source="", source_name="", name="BlackHole 2ch", manufacturer="Existential Audio Inc."
mod.rs:3342: Device pair is not blocked
mod.rs:3372: Input device ID: 98 (aggregate: false)
mod.rs:3386: Output device ID: 74 (aggregate: false)
mod.rs:2417: Creating shared voiceprocessing storage.
mod.rs:2231: Just created shared element #0. Took 1.0011874s.
mod.rs:3554: (0x154f08390) Initializing input by device info: device_info { id: 98, flags: device_flags(DEV_INPUT | DEV_SELECTED_DEFAULT) }
device_property.rs:268: Filtering input streams [99] for device 98. Next device is Some(127).
device_property.rs:282: Input stream filtering for device 98 retained [99].
mod.rs:3274: Input device 98 is on the VPIO force list because it is built in, and its volume is known to be very low without VPIO whenever VPIO is hooked up to it elsewhere.
mod.rs:3572: (0x154f08390) Opening input side: rate 48000, channels 1, format S16LE, layout FRONT_CENTER, prefs (empty), latency in frames 512, voice processing true.
mod.rs:3610: (0x154f08390) Input hardware description: AudioStreamBasicDescription { mSampleRate: 44100.0, mFormatID: 1819304813, mFormatFlags: 9, mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4, mChannelsPerFrame: 1, mBitsPerChannel: 32, mRese
mod.rs:1501: The buffer frame size of AudioUnit 0x81d0c07c for INPUT is already 512
mod.rs:3741: (0x154f08390) Input audiounit init with device 98 successfully.
mod.rs:3770: (0x154f08390) Initialize output by device info: device_info { id: 74, flags: device_flags(DEV_OUTPUT | DEV_SELECTED_DEFAULT) }
mod.rs:3776: (0x154f08390) Opening output side: rate 48000, channels 1, format S16LE, layout FRONT_CENTER, prefs (empty), latency in frames 512, voice processing true.
mod.rs:3815: (0x154f08390) Output hardware description: AudioStreamBasicDescription { mSampleRate: 44100.0, mFormatID: 1819304813, mFormatFlags: 41, mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4, mChannelsPerFrame: 2, mBitsPerChannel: 32, mRe
mod.rs:3913: (0x154f08390 Using output device channel layout [FrontCenter]
mod.rs:3930: Incompatible channel layouts detected, setting up remixer
mixer.rs:183: Creating a mixer with input channel count: 1, input layout: FRONT_CENTER,out channel count: 2, output channels: [FrontCenter]
mixer.rs:76: Create an integer type(i16) mixer
mod.rs:1501: The buffer frame size of AudioUnit 0x81d0c07c for OUTPUT is already 512
mod.rs:4008: (0x154f08390) Output audiounit init with device 74 successfully.
cubeb_resampler_internal.h:569:Resampling input (44100) and output (44100) to target rate of 48000Hz
device_property.rs:268: Filtering input streams [99] for device 98. Next device is Some(127).
device_property.rs:282: Input stream filtering for device 98 retained [99].
mod.rs:2901: (0x154f08390) Cubeb stream init successful.
mod.rs:405: set_input_processing_params on unit 0x81d0c07c - set agc: 0
mod.rs:446: set_input_processing_params on unit 0x81d0c07c - set bypass: 1
stream Started
mod.rs:4939: Cubeb stream (0x154f08390) started successfully.
mod.rs:834: Dropping 7 frames in input buffer.
buffer_manager.rs:261: Underrun during input data pull: (needed: 441, available: 434)
mod.rs:405: set_input_processing_params on unit 0x81d0c07c - set agc: 1
mod.rs:446: set_input_processing_params on unit 0x81d0c07c - set bypass: 0
stream Stopped
mod.rs:4954: Cubeb stream (0x154f08390) stopped successfully.
mod.rs:2265: Recycling shared element #0. Nr of live elements now 0.
mod.rs:2308: Clearing shared voiceprocessing unit storage in 10s if still at generation 1.
mod.rs:4879: Cubeb stream (0x154f08390) destroyed successful.
mod.rs:2277: Cleared 1 shared element. Took 0.6464373s.

and with the input processing params:

$ cargo run
   Compiling test-coreaudio-rs v0.1.0 (/Users/miriam/test-coreaudio-rs)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
     Running `target/debug/test-coreaudio-rs`
using backend audiounit-rust
mod.rs:237: Using the system default device
mod.rs:237: Using the system default device
mod.rs:2857: Use global latency 512 instead of the requested latency 4096.
mod.rs:3274: Input device 98 is on the VPIO force list because it is built in, and its volume is known to be very low without VPIO whenever VPIO is hooked up to it elsewhere.
mod.rs:3292: Evaluating device pair against VPIO block list
mod.rs:3297: Input uid="BuiltInMicrophoneDevice", model_uid="Digital Mic", transport_type="bltn", source="imic", source_name="MacBook Pro Microphone", name="MacBook Pro Microphone", manufacturer="Apple Inc."
mod.rs:3297: Output uid="BlackHole2ch_UID", model_uid="BlackHole2ch_ModelUID", transport_type="virt", source="", source_name="", name="BlackHole 2ch", manufacturer="Existential Audio Inc."
mod.rs:3342: Device pair is not blocked
mod.rs:3372: Input device ID: 98 (aggregate: false)
mod.rs:3386: Output device ID: 74 (aggregate: false)
mod.rs:2417: Creating shared voiceprocessing storage.
mod.rs:2231: Just created shared element #0. Took 0.9912193s.
mod.rs:3554: (0x137304a80) Initializing input by device info: device_info { id: 98, flags: device_flags(DEV_INPUT | DEV_SELECTED_DEFAULT) }
device_property.rs:268: Filtering input streams [99] for device 98. Next device is Some(127).
device_property.rs:282: Input stream filtering for device 98 retained [99].
mod.rs:3274: Input device 98 is on the VPIO force list because it is built in, and its volume is known to be very low without VPIO whenever VPIO is hooked up to it elsewhere.
mod.rs:3572: (0x137304a80) Opening input side: rate 48000, channels 1, format S16LE, layout FRONT_CENTER, prefs (empty), latency in frames 512, voice processing true.
mod.rs:3610: (0x137304a80) Input hardware description: AudioStreamBasicDescription { mSampleRate: 44100.0, mFormatID: 1819304813, mFormatFlags: 9, mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4, mChannelsPerFrame: 1, mBitsPerChannel: 32, mRese
mod.rs:1501: The buffer frame size of AudioUnit 0x81d3107c for INPUT is already 512
mod.rs:3741: (0x137304a80) Input audiounit init with device 98 successfully.
mod.rs:3770: (0x137304a80) Initialize output by device info: device_info { id: 74, flags: device_flags(DEV_OUTPUT | DEV_SELECTED_DEFAULT) }
mod.rs:3776: (0x137304a80) Opening output side: rate 48000, channels 1, format S16LE, layout FRONT_CENTER, prefs (empty), latency in frames 512, voice processing true.
mod.rs:3815: (0x137304a80) Output hardware description: AudioStreamBasicDescription { mSampleRate: 44100.0, mFormatID: 1819304813, mFormatFlags: 41, mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4, mChannelsPerFrame: 2, mBitsPerChannel: 32, mRe
mod.rs:3913: (0x137304a80 Using output device channel layout [FrontCenter]
mod.rs:3930: Incompatible channel layouts detected, setting up remixer
mixer.rs:183: Creating a mixer with input channel count: 1, input layout: FRONT_CENTER,out channel count: 2, output channels: [FrontCenter]
mixer.rs:76: Create an integer type(i16) mixer
mod.rs:1501: The buffer frame size of AudioUnit 0x81d3107c for OUTPUT is already 512
mod.rs:4008: (0x137304a80) Output audiounit init with device 74 successfully.
cubeb_resampler_internal.h:569:Resampling input (44100) and output (44100) to target rate of 48000Hz
device_property.rs:268: Filtering input streams [99] for device 98. Next device is Some(127).
device_property.rs:282: Input stream filtering for device 98 retained [99].
mod.rs:2901: (0x137304a80) Cubeb stream init successful.
mod.rs:5130: Cubeb stream (0x137304a80) set input processing params ECHO_CANCELLATION | NOISE_SUPPRESSION | AUTOMATIC_GAIN_CONTROL.
stream Started
mod.rs:4939: Cubeb stream (0x137304a80) started successfully.
mod.rs:834: Dropping 7 frames in input buffer.
buffer_manager.rs:261: Underrun during input data pull: (needed: 441, available: 434)
stream Stopped
mod.rs:4954: Cubeb stream (0x137304a80) stopped successfully.
mod.rs:2265: Recycling shared element #0. Nr of live elements now 0.
mod.rs:2308: Clearing shared voiceprocessing unit storage in 10s if still at generation 1.
mod.rs:4879: Cubeb stream (0x137304a80) destroyed successful.
mod.rs:2277: Cleared 1 shared element. Took 0.6440077s.

and with the older C++ backend:

$ cargo run
   Compiling test-coreaudio-rs v0.1.0 (/Users/miriam/test-coreaudio-rs)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.12s
     Running `target/debug/test-coreaudio-rs`
using backend audiounit
cubeb_audiounit.cpp:1724:New aggregate device 127
cubeb_audiounit.cpp:1751:Add devices input 98 and output 74 into aggregate device 127
cubeb_audiounit.cpp:2399:(0x131f08360) Opening input side: rate 48000, channels 1, format 0, latency in frames 512.
cubeb_audiounit.cpp:2412:(0x131f08360) Input device sampling rate: 48000.00
cubeb_audiounit.cpp:2320:(0x131f08360) No need to update input buffer size already 512 frames
cubeb_audiounit.cpp:2477:(0x131f08360) Input audiounit init successfully.
cubeb_audiounit.cpp:2494:(0x131f08360) Opening output side: rate 48000, channels 1, format 0, latency in frames 512.
cubeb_audiounit.cpp:2521:(0x131f08360) Output device sampling rate: 48000.00
cubeb_audiounit.cpp:2528:Incompatible channel layouts detected, setting up remixer
cubeb_mixer.cpp:124:Treating layout as mono
cubeb_audiounit.cpp:2320:(0x131f08360) No need to update output buffer size already 512 frames
cubeb_audiounit.cpp:2581:(0x131f08360) Output audiounit init successfully.
cubeb_resampler_internal.h:521:Input and output sample-rate match, target rate of 48000Hz
cubeb_audiounit.cpp:2868:(0x131f08360) Cubeb stream init successful.
stream Started
cubeb_audiounit.cpp:2986:Cubeb stream (0x131f08360) started successfully.
cubeb_audiounit.cpp:634:(0x131f08360) output shutdown.
cubeb_audiounit.cpp:634:(0x131f08360) output shutdown.
cubeb_audiounit.cpp:634:(0x131f08360) output shutdown.
stream Stopped
cubeb_audiounit.cpp:3014:Cubeb stream (0x131f08360) stopped successfully.
stream Stopped
cubeb_audiounit.cpp:3014:Cubeb stream (0x131f08360) stopped successfully.
cubeb_audiounit.cpp:2041:Destroyed aggregate device 127
cubeb_audiounit.cpp:2947:Cubeb stream (0x131f08360) destroyed successful.

@mutexlox-signal
Contributor Author

Intriguingly, the issue does not reproduce with audiounit (the C++ one) and a manual request of the VOICE StreamPrefs.

@Pehrsons
Contributor

As you have found, this is an Apple issue with the VoiceProcessingIO audio unit. If the built-in mic is hooked up to VPIO, its volume will be very low in any concurrent non-VPIO uses, even in other processes. Try simultaneously using the built-in mic in Safari and Chrome or System Settings.
This seems true for at least those fairly modern MacBook Pros with a mic array under the hood.

Because of this we introduced a force-list forcing VPIO for builtin mics even if the VOICE pref isn't passed. With VPIO but without any applied processing params, we put the VPIO unit in bypass mode. You'd imagine then that it behaves like the HAL unit.

It doesn't.

But it seemed to work ok until the M3 MBPs came out. With them and VPIO in bypass we started getting reports of low volume again. Another Apple issue I am still pondering how best to fix. Maybe we should drop the force-list. But then 1 VOICE and 1 non-VOICE stream on the builtin device won't work as expected (Google Meet in Firefox results in this). My best bet so far is to disallow bypass when using VPIO, but that's a number of API and Gecko changes away from materializing (see https://bugzilla.mozilla.org/show_bug.cgi?id=1914046).
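
As a rough mental model of the selection behavior described in this thread, the sketch below encodes it as a small decision function. It is a simplification under stated assumptions and not the backend's actual code: the type and field names are hypothetical, the macOS-version checks are left out, and treating the block list as overriding the force list is an assumption based only on the order of the log lines posted earlier.

```rust
/// Which audio unit a stream ends up on in this simplified model.
#[derive(Debug, PartialEq, Eq)]
enum Unit {
    /// Plain HAL unit, as the C++ backend uses when VOICE is not requested.
    Hal,
    /// VoiceProcessingIO; `bypass` is true when no processing params are applied.
    Vpio { bypass: bool },
}

/// Hypothetical inputs; names are illustrative, not taken from cubeb-coreaudio-rs.
struct StreamRequest {
    voice_pref: bool,            // the VOICE stream pref was passed
    input_is_builtin: bool,      // input is the built-in mic (on the force list)
    pair_on_block_list: bool,    // input/output pair is on the VPIO block list
    processing_params_set: bool, // AEC / NS / AGC params were applied
}

fn choose_unit(req: &StreamRequest) -> Unit {
    // VPIO is wanted when VOICE is requested, or forced for built-in mics.
    let wants_vpio = req.voice_pref || req.input_is_builtin;
    // Assumption: a blocked device pair falls back to the HAL unit.
    if !wants_vpio || req.pair_on_block_list {
        return Unit::Hal;
    }
    // With VPIO but without processing params, the unit runs in bypass mode.
    Unit::Vpio { bypass: !req.processing_params_set }
}

fn main() {
    // The reproduction program: built-in mic, no VOICE, no processing params.
    let repro = StreamRequest {
        voice_pref: false,
        input_is_builtin: true,
        pair_on_block_list: false,
        processing_params_set: false,
    };
    println!("{:?}", choose_unit(&repro));
}
```

In this model the reproduction program lands on VPIO in bypass mode, which is the combination reported as quiet on M3 machines, while requesting VOICE and applying processing params gives VPIO with bypass off.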

If you're developing another client I think your best bet is to always request VOICE and set AEC+NS+AGC processing params. That should result in the least issues while getting Apple's excellent processing.

Or do like the C++ backend and stick with the HAL unit by removing the force-list (you'd have to fork, or perhaps we could add another pref) and never pass VOICE.

@mutexlox-signal
Contributor Author

Got it, I see! Seems like you've spent many frustrating days on this issue. I appreciate the info, and I'll just set AEC+NS+AGC where available.

@padenot
Collaborator

padenot commented Oct 25, 2024

days

more like weeks or months, but yeah.

@mutexlox-signal
Contributor Author

Yeah that sounds right actually. 🙃

3 participants