fix: prevent meeting mode from triggering on speech alone

Meeting mode was triggering on any detected speech (VAD), which caused
false starts when the user spoke near the mic without being in a call.
It now requires system audio energy above the threshold as the primary
trigger. Speech counts only as a fallback, and only when the system
audio stream carries any data at all (energy above 0.0001 or a
non-empty system queue), which catches low-volume Electron/WebRTC apps.
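
The decision described above can be sketched as a pure function, which
makes the rule easy to check in isolation. This is only an illustration:
TriggerInputs, should_trigger and ANY_AUDIO_FLOOR are hypothetical names
that do not exist in AudioProcessor, and the f32 energy type is an
assumption. Only the 0.0001 floor and the boolean logic come from the
diff.

```rust
/// Hypothetical snapshot of the values the trigger check reads.
/// In the real code these live on `AudioProcessor` or are locals.
#[derive(Debug, Clone, Copy)]
struct TriggerInputs {
    /// VAD result for the microphone stream.
    is_speech_active: bool,
    /// System audio energy is above the (unspecified) main threshold.
    system_active: bool,
    /// Peak system audio energy in the current window (type assumed).
    max_system_energy: f32,
    /// The system audio queue currently holds at least one chunk.
    sys_queue_has_data: bool,
}

/// Floor taken from the diff: anything above it counts as "some" system audio.
const ANY_AUDIO_FLOOR: f32 = 0.0001;

/// Returns true when auto-start should fire for the given mode.
fn should_trigger(recording_mode: &str, inp: TriggerInputs) -> bool {
    let system_has_any_audio =
        inp.max_system_energy > ANY_AUDIO_FLOOR || inp.sys_queue_has_data;

    if recording_mode == "voice" {
        // Voice mode: speech alone is enough.
        inp.is_speech_active
    } else {
        // Meeting mode: system audio above threshold is the primary trigger;
        // speech only counts when the system stream carries any audio at all.
        inp.system_active || (inp.is_speech_active && system_has_any_audio)
    }
}
```

With these definitions, speech with a silent, empty system stream
triggers in "voice" mode but not in "meeting" mode, while the same
speech with max_system_energy = 0.001 does trigger in "meeting" mode.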

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
michael.borak
2026-02-06 10:36:54 +01:00
parent 76ebe9e40c
commit b911eefe72

@@ -256,19 +256,27 @@ impl AudioProcessor {
         // MODE-SPECIFIC TRIGGER LOGIC:
         // "voice"   -> Trigger if user speaks (VAD)
-        // "meeting" -> Trigger if system audio energy detected OR speech detected.
-        //              Some apps (e.g. Electron-based Nextcloud Talk) may route audio
-        //              in ways that ScreenCaptureKit doesn't always capture immediately.
-        //              Allowing speech as a fallback trigger ensures calls are recorded.
+        // "meeting" -> Trigger ONLY if system audio energy detected.
+        //              Speech alone is NOT enough (prevents false triggers when
+        //              user talks near the mic without being in a call).
+        //              However, speech + ANY system audio (even below threshold)
+        //              counts as a trigger, to catch Electron apps (e.g. Nextcloud Talk)
+        //              that may produce low-level system audio.
+        let sys_queue_has_data = if let Ok(q) = self.system_queue.lock() { !q.is_empty() } else { false };
+        let system_has_any_audio = max_system_energy > 0.0001 || sys_queue_has_data;
         let trigger = if self.recording_mode == "voice" {
             self.is_speech_active
         } else {
-            system_active || self.is_speech_active
+            // Primary: system audio above threshold
+            // Fallback: speech detected AND system audio stream has ANY data
+            // (catches low-volume call audio from Electron/WebRTC apps)
+            system_active || (self.is_speech_active && system_has_any_audio)
         };
         if trigger {
             // Trigger Detected!
-            println!("Auto-Start: Call detected (SysEnergy: {}, Speech: {}). Flushing pre-roll...", max_system_energy, self.is_speech_active);
+            println!("Auto-Start: Call detected (SysEnergy: {}, Speech: {}, SysHasData: {}). Flushing pre-roll...", max_system_energy, self.is_speech_active, system_has_any_audio);
             self.waiting_for_speech = false;
             // Flush Ring Buffer (Orderly: from ring_pos to end, then 0 to ring_pos)