Inconsistent playback of mic push/release sounds. #2912

Open
opened 2026-02-20 22:14:48 -05:00 by deekerman · 9 comments
Owner

Originally created by @bmmcginty on GitHub (Oct 21, 2024).

Description

I have sounds enabled for key down and key up of my mic. If you press and release the PTT key, waiting even as much as 1 second between press/release and release/re-press, the key up/key down sound will eventually "miss a beat", or fail to sound. This can happen multiple times consecutively, and can take up to 30 seconds to show up. It can also happen when keying or unkeying, which can make it difficult to know when I'm transmitting for certain.
I amusing the default mic up/down sounds that are provided with the client (on.ogg and off.ogg).
I am using the PTT lock setting, threshold 300 MS. This issue occurs regardless of that lock setting, though, and i have tested same. Possibly related to #6359.

Steps to reproduce

Press the PTT key, hold for 1/4 second, and release. Wait 1/4 second, and repeat.

Mumble version

1.5.634

Mumble component

Client

OS

Windows

Reproducible?

Yes

Additional information

No response

Relevant log output

No response

Screenshots

No response

Originally created by @bmmcginty on GitHub (Oct 21, 2024). ### Description I have sounds enabled for key down and key up of my mic. If you press and release the PTT key, waiting even as much as 1 second between press/release and release/re-press, the key up/key down sound will eventually "miss a beat", or fail to sound. This can happen multiple times consecutively, and can take up to 30 seconds to show up. It can also happen when keying or unkeying, which can make it difficult to know when I'm transmitting for certain. I amusing the default mic up/down sounds that are provided with the client (on.ogg and off.ogg). I am using the PTT lock setting, threshold 300 MS. This issue occurs regardless of that lock setting, though, and i have tested same. Possibly related to #6359. ### Steps to reproduce Press the PTT key, hold for 1/4 second, and release. Wait 1/4 second, and repeat. ### Mumble version 1.5.634 ### Mumble component Client ### OS Windows ### Reproducible? Yes ### Additional information _No response_ ### Relevant log output _No response_ ### Screenshots _No response_
Author
Owner

@Hartmnt commented on GitHub (Oct 21, 2024):

This sounds exactly like #6359 Are your sound cue files .wav files? If so, try to convert it to another format e.g. .ogg

@Hartmnt commented on GitHub (Oct 21, 2024): This sounds exactly like #6359 Are your sound cue files ``.wav`` files? If so, try to convert it to another format e.g. ``.ogg``
Author
Owner

@bmmcginty commented on GitHub (Oct 21, 2024):

These _are .ogg files; they are the default files provided upon install.

@bmmcginty commented on GitHub (Oct 21, 2024): These _are .ogg files; they are the default files provided upon install.
Author
Owner

@Hartmnt commented on GitHub (Oct 21, 2024):

These _are .ogg files; they are the default files provided upon install.

Oof, well then solving this is going to be difficult, because I can not reproduce this with the default sounds.

@Hartmnt commented on GitHub (Oct 21, 2024): > These _are .ogg files; they are the default files provided upon install. Oof, well then solving this is going to be difficult, because I can not reproduce this with the default sounds.
Author
Owner

@tspivey commented on GitHub (Oct 23, 2024):

I also have this issue.

I might've found it. I'm not too familiar with QT or C++, so could be missing something.
Starting from AudioInput::encodeAudioFrame:

  1. When the cue stops, its buffer is invalidated and freed in AudioOutput::handleInvalidatedBuffer.
  2. The next time the audio cue is played, AudioInput::encodeAudioFrame calls ao->removeToken(m_activeAudioCue);.
  3. That function calls AudioOutput::removeBuffer on the buffer, which emits bufferInvalidated.
  4. The new sample is played in AudioInput::encodeAudioFrame and added to the playback queue.
  5. The code responsible for bufferInvalidated runs, and if the new sample gets the address of the old one, removes it.
@tspivey commented on GitHub (Oct 23, 2024): I also have this issue. I might've found it. I'm not too familiar with QT or C++, so could be missing something. Starting from `AudioInput::encodeAudioFrame`: 1. When the cue stops, its buffer is invalidated and freed in `AudioOutput::handleInvalidatedBuffer`. 2. The next time the audio cue is played, `AudioInput::encodeAudioFrame` calls `ao->removeToken(m_activeAudioCue);`. 3. That function calls `AudioOutput::removeBuffer` on the buffer, which emits `bufferInvalidated`. 4. The new sample is played in `AudioInput::encodeAudioFrame` and added to the playback queue. 5. The code responsible for `bufferInvalidated` runs, and if the new sample gets the address of the old one, removes it.
Author
Owner

@Hartmnt commented on GitHub (Oct 23, 2024):

5. The code responsible for bufferInvalidated runs, and if the new sample gets the address of the old one, removes it.

Oh good lord... So we have two handleInvalidateBuffercalls with the same address racing against the system specific memory allocator which is re-allocating the address after the first but before the second call to handleInvalidateBuffer? That sounds possible but extremely hard to trigger. Especially since OP stated you could have a delay of up to a second and still trigger the problem sometimes. I would imagine the queued calles to handleInvalidateBuffer would finish relatively quickly.

But this is at least a possible lead.

@Hartmnt commented on GitHub (Oct 23, 2024): > 5\. The code responsible for `bufferInvalidated` runs, and if the new sample gets the address of the old one, removes it. Oh good lord... So we have two ``handleInvalidateBuffer``calls with the same address racing against the system specific memory allocator which is re-allocating the address after the first but before the second call to ``handleInvalidateBuffer``? That sounds _possible_ but extremely hard to trigger. Especially since OP stated you could have a delay of up to a second and still trigger the problem sometimes. I would _imagine_ the queued calles to ``handleInvalidateBuffer`` would finish relatively quickly. But this is at least a possible lead.
Author
Owner

@tspivey commented on GitHub (Oct 23, 2024):

This isn't hard to trigger at all. I added some debugging and got it on the first try.

<W>2024-10-23 10:17:01.284 Playing :/on.ogg, ptr 0x25c853fdaa0
<W>2024-10-23 10:17:01.383 Invalidated, erasing 0x25c853fdaa0
<W>2024-10-23 10:17:02.023 Removing token, m_buffer=0x25c853fdaa0
<W>2024-10-23 10:17:02.024 Playing :/off.ogg, ptr 0x25c853fdaa0
<W>2024-10-23 10:17:02.026 Invalidated, erasing 0x25c853fdaa0

And this one after a few tries (the first line is when on.ogg fininshes playing):

<W>2024-10-23 10:23:22.823 Invalidated, erasing 0x1a04d9af7d0
<W>2024-10-23 10:23:24.353 Removing token, m_buffer=0x1a04d9af7d0
<W>2024-10-23 10:23:24.354 Playing :/off.ogg, ptr 0x1a04d9af7d0
<W>2024-10-23 10:23:24.356 Invalidated, erasing 0x1a04d9af7d0

Looking at the addresses and timestamps, this seems to be what's happening.

@tspivey commented on GitHub (Oct 23, 2024): This isn't hard to trigger at all. I added some debugging and got it on the first try. ``` <W>2024-10-23 10:17:01.284 Playing :/on.ogg, ptr 0x25c853fdaa0 <W>2024-10-23 10:17:01.383 Invalidated, erasing 0x25c853fdaa0 <W>2024-10-23 10:17:02.023 Removing token, m_buffer=0x25c853fdaa0 <W>2024-10-23 10:17:02.024 Playing :/off.ogg, ptr 0x25c853fdaa0 <W>2024-10-23 10:17:02.026 Invalidated, erasing 0x25c853fdaa0 ``` And this one after a few tries (the first line is when on.ogg fininshes playing): ``` <W>2024-10-23 10:23:22.823 Invalidated, erasing 0x1a04d9af7d0 <W>2024-10-23 10:23:24.353 Removing token, m_buffer=0x1a04d9af7d0 <W>2024-10-23 10:23:24.354 Playing :/off.ogg, ptr 0x1a04d9af7d0 <W>2024-10-23 10:23:24.356 Invalidated, erasing 0x1a04d9af7d0 ``` Looking at the addresses and timestamps, this seems to be what's happening.
Author
Owner

@Hartmnt commented on GitHub (Oct 23, 2024):

@tspivey Nice work confirming this!

Indeed, it makes sense now. The time between the calls is irrelevant because we keep the pointer in AudioInput::m_activeAudioCue. This also perfectly explains, why I can not reproduce this: It entirely depends on the system memory allocator reusing the same address in consecutive calls and Linux (or the specific std implementation I am using) seems to be unaffected by this.

So now we just have to figure out how to fix this 🥲

@Hartmnt commented on GitHub (Oct 23, 2024): @tspivey Nice work confirming this! Indeed, it makes sense now. The time between the calls is irrelevant because we keep the pointer in ``AudioInput::m_activeAudioCue``. This also perfectly explains, why I can not reproduce this: It entirely depends on the system memory allocator reusing the same address in consecutive calls and Linux (or the specific std implementation I am using) seems to be unaffected by this. So now we just have to figure out how to fix this :smiling_face_with_tear:
Author
Owner

@Hartmnt commented on GitHub (Oct 24, 2024):

@tspivey I have come up with something that should reduce the number of times this problem occurs: github.com/hartmnt/mumble@c460e8ca4f

However, this will still race against a sample that is just finishing playing after the check was performed. So I don't consider this a proper fix for this problem. But I wanted to share this anyway for now.

@Hartmnt commented on GitHub (Oct 24, 2024): @tspivey I have come up with something that should reduce the number of times this problem occurs: https://github.com/hartmnt/mumble/commit/c460e8ca4fb6425fdc128d76b4a4080759a26dd4 However, this will still race against a sample that is just finishing playing after the check was performed. So I don't consider this a proper fix for this problem. But I wanted to share this anyway for now.
Author
Owner

@Krzmbrzl commented on GitHub (Jan 12, 2025):

So now we just have to figure out how to fix this

Conceptually, the fix is simple: Instead of storing a raw pointer in m_activeCue, we store a std::weak_ptr<AudioOutputBuffer>. In order for this to be possible, qmOutputs must also the buffers as std::shared_ptr<AudioOutputBuffer>. Then, m_activeCue could check whether the referenced object has been deleted already or not (that's what weak_ptr allows you to do). Insert this check into the current logic of when to stop audio buffers and you're mostly good.
The final puzzle piece is that the invalidateBuffer signal must also pass weak_ptrs instead of raw pointers so handleInvalidatedBuffer can do the same kind of checking.

The problem is: This would require introducing smart pointers into the audio processing code. While I would consider this to be a good thing in general, this will likely require a non-trivial refactoring effort. And all that because of the audio cue handling (afaict no other part of the code ever takes and stores a pointer to an audio output buffer). That seems excessive.
On the other hand, I can't come up with another mechanism that would allow us to get to know whether a given sample has finished playing or not. My first thought was to emit an event, but this would again be asynchronous, so we still have a race condition as stopping the supposedly still running sample and the event telling you it already stopped are in no way synchronized and can therefore happen in any order. 🤔

@Krzmbrzl commented on GitHub (Jan 12, 2025): > So now we just have to figure out how to fix this Conceptually, the fix is simple: Instead of storing a raw pointer in `m_activeCue`, we store a `std::weak_ptr<AudioOutputBuffer>`. In order for this to be possible, `qmOutputs` must also the buffers as `std::shared_ptr<AudioOutputBuffer>`. Then, `m_activeCue` could check whether the referenced object has been deleted already or not (that's what `weak_ptr` allows you to do). Insert this check into the current logic of when to stop audio buffers and you're mostly good. The final puzzle piece is that the `invalidateBuffer` signal must also pass `weak_ptr`s instead of raw pointers so `handleInvalidatedBuffer` can do the same kind of checking. The problem is: This would require introducing smart pointers into the audio processing code. While I would consider this to be a good thing in general, this will likely require a non-trivial refactoring effort. And all that because of the audio cue handling (afaict no other part of the code ever takes and stores a pointer to an audio output buffer). That seems excessive. On the other hand, I can't come up with another mechanism that would allow us to get to know whether a given sample has finished playing or not. My first thought was to emit an event, but this would again be asynchronous, so we still have a race condition as stopping the supposedly still running sample and the event telling you it already stopped are in no way synchronized and can therefore happen in any order. :thinking:
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/mumble-mumble-voip#2912
No description provided.