fix: prevent meeting mode from triggering on speech alone

Meeting mode previously triggered on any detected speech (VAD), which caused false starts when the user spoke near the mic without being in a call. It now requires system audio energy above the threshold as the primary trigger. Speech is used only as a fallback, and only when the system audio stream carries any data at all (energy above 0.0001 or a non-empty system queue), which catches low-volume Electron/WebRTC apps.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 10:36:54 +01:00
parent 76ebe9e40c
commit b911eefe72
1 changed file with 14 additions and 6 deletions
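The trigger rule described in the commit message can be sketched in isolation as a pure function. This is a minimal illustration, not the code in `audio_processor.rs`: the function name `should_trigger`, the constant `ANY_AUDIO_FLOOR`, and the `f32` energy type are assumptions made for the example. Only the mode strings, the `0.0001` floor, and the boolean expression come from the diff.

```rust
/// Assumed name for the "any system audio" floor; the diff uses the literal 0.0001.
const ANY_AUDIO_FLOOR: f32 = 0.0001;

/// Hypothetical standalone version of the trigger decision.
///
/// * `recording_mode`     - "voice" or anything else (treated as meeting mode)
/// * `system_active`      - system audio energy is above the main threshold
/// * `is_speech_active`   - VAD currently reports speech on the microphone
/// * `max_system_energy`  - peak system audio energy (type assumed to be f32)
/// * `sys_queue_has_data` - the system audio queue is non-empty
pub fn should_trigger(
    recording_mode: &str,
    system_active: bool,
    is_speech_active: bool,
    max_system_energy: f32,
    sys_queue_has_data: bool,
) -> bool {
    // "Any" system audio: energy barely above zero, or buffered samples waiting.
    let system_has_any_audio = max_system_energy > ANY_AUDIO_FLOOR || sys_queue_has_data;

    if recording_mode == "voice" {
        // Voice mode: speech alone is the trigger.
        is_speech_active
    } else {
        // Meeting mode. Primary: system audio above threshold.
        // Fallback: speech, but only alongside some sign of system audio.
        system_active || (is_speech_active && system_has_any_audio)
    }
}
```

Under these assumptions, `should_trigger("meeting", false, true, 0.0, false)` is false (speech alone near the mic), while the same call with `sys_queue_has_data` set to true is true (speech plus low-level call audio).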
--- a/src-tauri/src/audio_processor.rs
+++ b/src-tauri/src/audio_processor.rs
@@ -256,19 +256,27 @@ impl AudioProcessor {

            // MODE-SPECIFIC TRIGGER LOGIC:
            // "voice"   -> Trigger if user speaks (VAD)
-            // "meeting" -> Trigger if system audio energy detected OR speech detected.
-            //              Some apps (e.g. Electron-based Nextcloud Talk) may route audio
-            //              in ways that ScreenCaptureKit doesn't always capture immediately.
-            //              Allowing speech as a fallback trigger ensures calls are recorded.
+            // "meeting" -> Trigger ONLY if system audio energy detected.
+            //              Speech alone is NOT enough (prevents false triggers when
+            //              user talks near the mic without being in a call).
+            //              However, speech + ANY system audio (even below threshold)
+            //              counts as a trigger, to catch Electron apps (e.g. Nextcloud Talk)
+            //              that may produce low-level system audio.
+            let sys_queue_has_data = if let Ok(q) = self.system_queue.lock() { !q.is_empty() } else { false };
+            let system_has_any_audio = max_system_energy > 0.0001 || sys_queue_has_data;
+
            let trigger = if self.recording_mode == "voice" {
                self.is_speech_active
            } else {
-                system_active || self.is_speech_active
+                // Primary: system audio above threshold
+                // Fallback: speech detected AND system audio stream has ANY data
+                //           (catches low-volume call audio from Electron/WebRTC apps)
+                system_active || (self.is_speech_active && system_has_any_audio)
            };

            if trigger {
                // Trigger Detected!
-                println!("Auto-Start: Call detected (SysEnergy: {}, Speech: {}). Flushing pre-roll...", max_system_energy, self.is_speech_active);
+                println!("Auto-Start: Call detected (SysEnergy: {}, Speech: {}, SysHasData: {}). Flushing pre-roll...", max_system_energy, self.is_speech_active, system_has_any_audio);
                self.waiting_for_speech = false;

                // Flush Ring Buffer (Orderly: from ring_pos to end, then 0 to ring_pos)