7 Commits

Author SHA1 Message Date
michael.borak
69dc6b8fac chore: bump version to 1.2.1 for release 2026-01-24 14:33:35 +01:00
michael.borak
291f4950e8 docs: add 3CX to supported applications list 2026-01-24 14:29:00 +01:00
michael.borak
9a65f42f51 feat: refine meeting auto-start, silence timeout (25s) and improve transcription logging 2026-01-24 14:16:55 +01:00
michael.borak
a3e4fa4ec7 Release 1.2.0: Remove backup encryption and switch to JSON-only backups with history support 2026-01-24 13:10:18 +01:00
michael.borak
897f2ec0c2 fix(recorder): Resolve infinite loop stale closure & reset status text on discard 2026-01-24 01:47:47 +01:00
michael.borak
e24b448c6c docs: Update README for v1.2.0 (Native Audio & Auto-Loop) 2026-01-24 01:40:11 +01:00
michael.borak
4e9a1fd038 feat(v1.2.0): Final Release - Native Audio, Smart VAD, Auto-Loop & Quality Fixes
- Implemented standard 48kHz audio pipeline to fix sample rate mismatch/distortion
- Added Native System Audio (ScreenCaptureKit) support
- Implemented Smart VAD (Voice Activity Detection) with Auto-Start on valid audio
- Added Auto-Loop: Automatically re-arms recording after stop
- Added Empty Guard: Prevents transcribing silent recordings (< 20s empty)
- Increased Pre-Roll buffer to 3.0s to prevent cut-off speech
- Fixed clipping with clamped audio mixing
2026-01-24 01:35:09 +01:00
13 changed files with 834 additions and 417 deletions

185
README.md

@@ -1,165 +1,89 @@
# Hearbit AI 🦉🎙️
**Hearbit AI** is your professional meeting assistant for macOS. It records both your microphone and system audio (e.g., Teams, Zoom), transcribes it with high precision using Infomaniak's Whisper API, and generates intelligent, structured summaries.
**Hearbit AI** is your professional meeting assistant for macOS. It records both your microphone and system audio (e.g., Teams, Zoom, 3CX, Talk), transcribes it with high precision using Infomaniak's Whisper API, and generates intelligent, structured summaries.
![App Icon](src-tauri/icons/128x128@2x.png)
## ✨ Features
## ✨ New in v1.2.0
* **🎙️ Dual-Channel Recording**: seamlessly capture your voice and meeting audio from apps like Microsoft Teams, Zoom, or Google Meet.
* **📁 Import Audio Files**: Upload existing recordings (MP3, MP4, WAV, M4A, FLAC, OGG, AAC, WMA) for transcription and summarization.
* **⏱️ Long Meeting Support**: Record meetings up to 2+ hours with automatic MP3 conversion and chunking.
* **🎵 Smart Auto-Stop**:
* **Universal Auto-Stop**: Automatically stops recording after **20 seconds of silence** in ALL modes (Voice Memo & Meeting).
* **Noise Filtering**: Enhanced VAD (Voice Activity Detection) ignores background noise and keyboard typing, only triggering on clear speech.
* **📅 Microsoft 365 Integration**:
* **Upcoming Meetings**: View your daily schedule and join with **one click**.
* **Meeting Details**: View full agenda and **invited attendee status** (Accepted/Declined).
* **💾 Persistent History**: Automatically saves all transcripts and summaries to disk. Search and review past meetings anytime.
* **✉️ Email Summaries**: Send professional, formatted HTML summaries (with preview) directly to attendees via your own SMTP server.
* **🧠 Powered by Infomaniak AI**:
* **Precision Transcription**: Standard-compliant formatting with **second-by-second timestamps**.
* **Smart Summaries**: Uses **Smart Templates** to automatically select the best format (Business Protocol vs. 1:1) based on meeting content.
* **🔇 Smart VAD**: Automatically filters out silence and stops recording when you stop talking.
* **🎨 White-Labeling**: Upload your **custom company logo** in Settings to brand the application.
* **🔒 Privacy-First**: Data is processed securely via your own Infomaniak API keys.
* **🎧 Native System Audio**: No more BlackHole driver needed! Captures Teams, Zoom, 3CX, and Talk audio directly and securely via macOS ScreenCaptureKit.
* **🔁 Auto-Loop (Standby Mode)**: The app automatically "re-arms" after a call finishes. Just leave it running—it will wake up, record your next call, and go back to sleep.
* **⚡ Smart VAD & Pre-Roll**:
* **3-Second Pre-Roll**: Catches the start of the sentence even if you speak before the trigger.
* **Noise Filtering**: Ignores typing and background noise.
* **🛡️ Empty Audio Guard**: Automatically discards silent recordings (e.g., false triggers) to save API costs and prevent errors.
* **✨ 48kHz Crystal Clear Audio**: Optimized audio pipeline prevents "robot voice" distortion.
* **💾 Daily Security Backups**: Automatically saves your entire history as a standard JSON file every 24 hours (unencrypted for easy recovery).
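
The 48 kHz item above corresponds to how the recorder chooses an input configuration: prefer a range that contains 48 kHz, otherwise fall back to the maximum rate of the range with the most channels. The following is only a sketch of that selection under stated assumptions. `RateRange`, `PREFERRED_HZ`, and `pick_rate` are hypothetical names standing in for the audio backend's supported-config ranges and are not part of the app's API.

```rust
/// Hypothetical stand-in for one supported input configuration range.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RateRange {
    pub min_hz: u32,
    pub max_hz: u32,
    pub channels: u16,
}

/// Preferred rate, matching the 48 kHz pipeline described above.
pub const PREFERRED_HZ: u32 = 48_000;

/// Returns (channels, sample_rate).
/// First choice: the first range that contains 48 kHz.
/// Fallback: the maximum rate of the range with the most channels.
pub fn pick_rate(ranges: &[RateRange]) -> Option<(u16, u32)> {
    ranges
        .iter()
        .find(|r| r.min_hz <= PREFERRED_HZ && r.max_hz >= PREFERRED_HZ)
        .map(|r| (r.channels, PREFERRED_HZ))
        .or_else(|| {
            ranges
                .iter()
                .max_by_key(|r| r.channels)
                .map(|r| (r.channels, r.max_hz))
        })
}
```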
---
## 🚀 Key Features
## 🚀 Getting Started
### Required
* **macOS** (tested on macOS Monterey and later)
* **BlackHole 2ch Driver** ([Download here](https://existential.audio/blackhole/))
* **MANDATORY** for system audio capture (MS Teams, Zoom, etc.)
* Without this, you can only record microphone input
* **ffmpeg** for audio processing
```bash
brew install ffmpeg
```
* **Infomaniak AI Account**: You need an API Key and Product ID from the [Infomaniak Developer Portal](https://manager.infomaniak.com/).
### 2. Installation
1. Download the latest `.dmg` file from the [Releases page](#).
2. Open the `.dmg` and drag **Hearbit AI** to your Applications folder.
3. Launch the app.
---
## 🎧 Recording System Audio (Teams, Zoom, etc.)
We've made this easy! **Note: You must have the BlackHole driver installed.**
1. **Create "Hearbit Audio" Device**:
* Open the app and select **Meeting** mode.
* If you don't have the device yet, click the **"🪄 Create Hearbit Audio Device"** button.
* This creates a specialized "Multi-Output Device" that routes audio to both your headphones/speakers AND the app.
2. **Configure Teams / Zoom / Webex**:
* **Speaker / Output**: Change this to **Hearbit Audio**.
* *Why?* This ensures the audio goes to the recording app *and* your ears.
* **Microphone / Input**: Leave this as your normal microphone (e.g., MacBook Pro Mic).
* *Note:* Do **not** select Hearbit Audio as your microphone in Teams.
3. **Start Recording**:
* In Hearbit AI, ensure **Hearbit Audio** is selected as the input.
* **🎙️ Dual-Channel Recording**: Seamlessly capture your voice and meeting audio.
* **📁 Import Audio Files**: Upload existing recordings (MP3, WAV, M4A, etc.).
* **⏱️ Long Meeting Support**: Handles meetings 2+ hours with automatic chunking.
* **📅 Microsoft 365 Integration**: View upcoming meetings and join with one click.
* **💾 Persistent History**: Automatically saves all transcripts and summaries locally.
* **✉️ Email Summaries**: Send formatted HTML summaries via your own SMTP server.
* **🎨 White-Labeling**: Upload your custom company logo.
---
## 🛠️ Usage Guide
1. **Configuration**:
* Click the **Settings** (gear icon).
* Enter your **Infomaniak API Key** and **Product ID**.
### 1. Installation
1. Download the latest `.dmg` file from the [Releases page](#).
2. Open the `.dmg` and drag **Hearbit AI** to your Applications folder.
3. **Permission Check**: On first launch, grant "Screen Recording" permission (required for capturing System Audio).
2. **Connect M365 (Optional)**:
* Copy the **Application (client) ID**.
* Click the **Meetings** tab.
* Enter your **Client ID** and click "Connect".
* Proceed with MS login.
* View your upcoming meetings.
### 2. Configuration
1. Click **Settings** (gear icon).
2. Enter your **Infomaniak API Key** and **Product ID**.
3. (Optional) Configure **SMTP** for email sending and **Microsoft 365** for calendar integration.
3. **Recording**:
* Choose your **Template** (e.g., "Meeting Protocol").
* Select your **Input Device**.
* Click **Start Recording**.
### 3. Recording a Meeting
1. **Select Mode**: Choose "Meeting" (captures Mic + System) or "Voice Memo" (Mic only).
2. **Auto-Start Logic**:
- **Meeting Mode**: Triggers only when the call actually starts (system audio detected).
- **Voice Memo**: Triggers immediately when you start speaking.
3. **Standby**: Click "Standby (Auto-Start)". The app waits silently.
4. **Join Call**: Join your Teams/Zoom call.
5. **Trigger**: Hearbit starts recording automatically based on the selected mode.
6. **Finish**: When the call ends (silence > 25s), Hearbit stops, transcribes, summarizes, and **goes back to Standby** for the next call.
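
The auto-start and finish rules in the steps above can be sketched as two small predicates. This is an illustration under assumptions, not the app's code: the `Mode` enum, `should_start`, and `should_stop` are hypothetical names (the diff passes the mode as a string), and counting silence in samples is an assumption. The 0.005 threshold and the 25-second timeout are taken from this change set.

```rust
/// Hypothetical enum for the two recording modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Voice,
    Meeting,
}

/// Peak system-audio level above which a call is considered active.
pub const SYSTEM_ENERGY_THRESHOLD: f32 = 0.005;
/// Silence duration after which recording stops.
pub const SILENCE_TIMEOUT_SECS: u64 = 25;

/// Voice Memo starts on detected speech; Meeting starts only on system audio.
pub fn should_start(mode: Mode, speech_active: bool, system_peak: f32) -> bool {
    match mode {
        Mode::Voice => speech_active,
        Mode::Meeting => system_peak > SYSTEM_ENERGY_THRESHOLD,
    }
}

/// Stop once more than 25 seconds' worth of samples passed without activity.
pub fn should_stop(samples_since_activity: u64, sample_rate: u32) -> bool {
    samples_since_activity > SILENCE_TIMEOUT_SECS * sample_rate as u64
}
```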
4. **Processing**:
* Click **Stop** when finished.
* The app will transcribe the audio (with timestamps!) and generate a summary based on your selected template.
* You will be automatically taken to the **Transcription** tab to review the results.
### 4. Optimal Setup (MS Teams / Zoom / 3CX)
For the best experience without changing any software settings:
* **Hearbit App**: Select your **real microphone** (e.g., "MacBook Mic" or Headset).
* **Meeting Software**: Use your standard output (Speakers/Headset).
* *How it works*: Hearbit captures your voice via mic and the other side via macOS System Audio Capture automatically.
---
*Note: If you choose "Hearbit Audio" (Aggregate Device) in the app, you MUST set your Teams/Zoom/3CX speaker output to "Hearbit Audio" as well.*
## 🎨 Custom Branding (White-Labeling)
You can replace the default Livtec logo with your own company branding:
1. Go to **Settings** (gear icon) → **Branding**.
2. Click **Upload Logo**.
3. Select your file (PNG, JPG, SVG).
4. The content changes immediately across the app.
5. *Tip*: Use a transparent PNG for best results.
---
## 📧 Advanced Email Templates
The email system supports **full HTML & JavaScript** templates. This allows for dynamic dashboards, charts, and interactive reports.
**How to use:**
1. Go to **Settings** → **Email**.
2. Create a new template.
3. Use `{{summary}}` as a placeholder for the raw AI JSON output.
4. In your HTML/Script, parse it:
```javascript
const reportData = {{summary}};
// Now you can use reportData.todos, reportData.updates, etc.
```
5. Use `{{date}}` for the current date and `{{subject}}` for the meeting title.
*Example*: Create a "Daily Standup Dashboard" that visualizes Blocker/Updates/Todos in a grid layout.
### 5. Customizing Prompts
You can create custom AI templates in Settings -> Prompts. Example:
* **"Sales Call"**: Focus on budget, timeline, and decision makers.
* **"Daily Standup"**: Extract blockers and next steps.
* **"General Protocol"**: Standard meeting minutes.
---
## ❓ Troubleshooting
### "Hearbit AI is damaged and can't be opened"
This is a standard macOS warning for apps not signed with an Apple Developer Certificate. To fix it:
If macOS blocks the app because it's not notarized:
1. Open **Terminal**.
2. Run the following command:
```bash
sudo xattr -cr /Applications/Hearbit\ AI.app
```
3. Enter your password.
4. Open the app again.
2. Run: `sudo xattr -cr /Applications/Hearbit\ AI.app`
3. Enter your password and try again.
### Long Meetings (> 1 hour)
### Audio cuts off at the start?
v1.2.0 includes a **3-second buffer**. The Meeting mode now uses a more sensitive trigger (0.005 energy) to catch even quiet participants.
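
To illustrate why a pre-roll buffer keeps the first words of a sentence, here is a minimal ring-buffer sketch. It assumes a hypothetical `PreRoll` type; the app keeps its buffer inside `AudioProcessor` and may differ in detail, for example in how it treats slots that were never written. The capacity formula (sample rate × seconds × channels, interleaved) follows the diff.

```rust
/// Hypothetical pre-roll buffer holding the most recent interleaved samples.
pub struct PreRoll {
    buf: Vec<f32>,
    pos: usize,
    filled: bool,
}

impl PreRoll {
    /// Capacity = sample_rate * seconds * channels (interleaved samples).
    pub fn new(sample_rate: u32, channels: u16, seconds: f32) -> Self {
        let size = (sample_rate as f32 * seconds) as usize * channels as usize;
        Self { buf: vec![0.0; size.max(1)], pos: 0, filled: false }
    }

    pub fn capacity(&self) -> usize {
        self.buf.len()
    }

    /// Overwrites the oldest sample once the buffer is full.
    pub fn push(&mut self, sample: f32) {
        self.buf[self.pos] = sample;
        self.pos = (self.pos + 1) % self.buf.len();
        if self.pos == 0 {
            self.filled = true;
        }
    }

    /// Returns buffered samples oldest-first: from `pos` to the end, then 0 to `pos`.
    pub fn flush(&self) -> Vec<f32> {
        if self.filled {
            let mut out = self.buf[self.pos..].to_vec();
            out.extend_from_slice(&self.buf[..self.pos]);
            out
        } else {
            // Assumption: slots that were never written are skipped.
            self.buf[..self.pos].to_vec()
        }
    }
}
```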
**Automatic Handling**: The app automatically handles long recordings:
- **MP3 Conversion**: All recordings are converted to MP3 (64kbps) for 10x compression
- **Chunking**: Files ≥18 MB are automatically split into 10-minute segments
- **Processing**: Each segment is transcribed separately and merged with timestamps
**Example**: A 2-hour meeting:
1. Records as WAV (~120 MB)
2. Converts to MP3 (~12 MB)
3. Stays under limit → No chunking needed!
**Very long meetings** (e.g., all-day workshops):
- Automatically chunks into segments
- Shows progress: "Processing chunk 1/15..."
- Merges all transcriptions seamlessly
### No Audio / Can't Hear Meeting Participants
### "Batch processing failed"
This means the audio was empty or too short. Check the **Logs** tab for detailed error messages from Infomaniak. The most common cause is selecting the wrong input device or a lack of Screen Recording permissions.
---
## 👨‍💻 Development
Built with **Tauri**, **React**, and **TypeScript**.
Built with **Tauri v2**, **React**, and **TypeScript**.
### Setup
```bash
@@ -175,7 +99,6 @@ npm run tauri dev
```bash
npm run tauri build
```
*The build artifact will be located in `src-tauri/target/release/bundle/dmg/*`*
---

22
RELEASE_NOTES_1.2.0.md Normal file

@@ -0,0 +1,22 @@
# Release Notes - Hearbit AI v1.2.0
## 🚀 What's New
### Native System Audio (ScreenCaptureKit)
We have completely rebuilt the audio engine!
- **No more drivers:** You no longer need to install BlackHole.
- **Works everywhere:** Whether Teams, Zoom, Webex, Nextcloud Talk, or 3CX, the app now listens in natively.
- **Permission:** On first launch, the app asks for the "Screen Recording" permission. This is the modern Apple standard for audio capture.
### Smart VAD (Intelligent Voice Detection)
- **Ignores music:** The app now distinguishes precisely between human speech and music.
- **Waiting-room filter:** Music in the Teams waiting room is no longer recorded. Recording starts only when someone is actually speaking.
### UI Improvements
- **New setup flow:** The complicated audio setup has been removed.
- **Free choice:** Use any microphone you like.
## 🛠️ Technical Changes
- Updated to the `screencapturekit` framework (requires macOS 12.3+).
- Removed the BlackHole dependency.
- Audio mixing now happens directly in the app.

4
package-lock.json generated

@@ -1,12 +1,12 @@
{
"name": "hearbit-ai",
"version": "0.1.0",
"version": "1.1.1",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "hearbit-ai",
"version": "0.1.0",
"version": "1.1.1",
"dependencies": {
"@tailwindcss/postcss": "^4.1.18",
"@tauri-apps/api": "^2",


@@ -1,7 +1,7 @@
{
"name": "hearbit-ai",
"private": true,
"version": "1.1.1",
"version": "1.2.1",
"type": "module",
"scripts": {
"dev": "vite",

81
src-tauri/Cargo.lock generated

@@ -347,6 +347,12 @@ dependencies = [
"wyz",
]
[[package]]
name = "block"
version = "0.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0d8c1fef690941d3e7788d328517591fecc684c084084702d6ff1641e993699a"
[[package]]
name = "block-buffer"
version = "0.10.4"
@@ -1739,7 +1745,7 @@ checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100"
[[package]]
name = "hearbit-ai"
version = "0.1.2"
version = "1.2.1"
dependencies = [
"base64 0.22.1",
"chrono",
@@ -1749,6 +1755,8 @@ dependencies = [
"oauth2",
"reqwest 0.11.27",
"rubato",
"screencapturekit",
"screencapturekit-sys",
"serde",
"serde_json",
"tauri",
@@ -2425,6 +2433,15 @@ dependencies = [
"libc",
]
[[package]]
name = "malloc_buf"
version = "0.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "62bb907fe88d54d8d9ce32a3cceab4218ed2f6b7d35617cafe9adf84e43919cb"
dependencies = [
"libc",
]
[[package]]
name = "markup5ever"
version = "0.14.1"
@@ -2717,6 +2734,27 @@ dependencies = [
"url",
]
[[package]]
name = "objc"
version = "0.2.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "915b1b472bc21c53464d6c8461c9d3af805ba1ef837e1cac254428f4a77177b1"
dependencies = [
"malloc_buf",
"objc_exception",
]
[[package]]
name = "objc-foundation"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1add1b659e36c9607c7aab864a76c7a4c2760cd0cd2e120f3fb8b952c7e22bf9"
dependencies = [
"block",
"objc",
"objc_id",
]
[[package]]
name = "objc2"
version = "0.6.3"
@@ -2979,6 +3017,24 @@ dependencies = [
"objc2-security",
]
[[package]]
name = "objc_exception"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ad970fb455818ad6cba4c122ad012fae53ae8b4795f86378bce65e4f6bab2ca4"
dependencies = [
"cc",
]
[[package]]
name = "objc_id"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c92d4ddb4bd7b50d730c215ff871754d0da6b2178849f8a2a2ab69712d0c073b"
dependencies = [
"objc",
]
[[package]]
name = "object"
version = "0.32.2"
@@ -4114,6 +4170,29 @@ version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94143f37725109f92c262ed2cf5e59bce7498c01bcc1502d7b9afe439a4e9f49"
[[package]]
name = "screencapturekit"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1a5eeeb57ac94960cfe5ff4c402be6585ae4c8d29a2cf41b276048c2e849d64e"
dependencies = [
"screencapturekit-sys",
]
[[package]]
name = "screencapturekit-sys"
version = "0.2.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "22411b57f7d49e7fe08025198813ee6fd65e1ee5eff4ebc7880c12c82bde4c60"
dependencies = [
"block",
"dispatch",
"objc",
"objc-foundation",
"objc_id",
"once_cell",
]
[[package]]
name = "sct"
version = "0.7.1"


@@ -1,6 +1,6 @@
[package]
name = "hearbit-ai"
version = "0.1.2"
version = "1.2.1"
description = "A Tauri App"
authors = ["you"]
edition = "2021"
@@ -38,3 +38,5 @@ lettre = { version = "0.11", features = ["tokio1", "tokio1-native-tls", "builder
tauri-plugin-log = "2.0.0"
tauri-plugin-shell = "2.3.4"
base64 = "0.22"
screencapturekit = "0.2.0"
screencapturekit-sys = "0.2.8"


@@ -1,5 +1,6 @@
use std::sync::{Arc, Mutex};
use tauri::{AppHandle, Emitter};
use crate::emit_log;
use cpal::Sample;
use hound::WavWriter;
use rubato::{Resampler, FastFixedIn, PolynomialDegree};
@@ -39,6 +40,12 @@ pub struct AudioProcessor {
// Event Emission
app_handle: Option<AppHandle>,
last_event_time: std::time::Instant,
// System Audio Queue for Mixing
pub system_queue: Arc<Mutex<std::collections::VecDeque<f32>>>,
// Recording Mode (voice or meeting)
recording_mode: String,
}
impl AudioProcessor {
@@ -47,7 +54,8 @@ impl AudioProcessor {
channel_count: u16,
writer: Arc<Mutex<WavWriter<std::io::BufWriter<std::fs::File>>>>,
app_handle: AppHandle,
wait_for_speech: bool
wait_for_speech: bool,
recording_mode: String,
) -> Result<Self, String> {
let vad_sample_rate = 16000;
let vad_chunk_size = 512;
@@ -68,8 +76,8 @@ impl AudioProcessor {
1
).map_err(|e| format!("Failed to init Resampler: {:?}", e))?;
// Pre-roll buffer (1.0 seconds) * Channels (interleaved store)
let ring_curr_seconds = 1.0;
// Pre-roll buffer (3.0 seconds) * Channels (interleaved store)
let ring_curr_seconds = 3.0;
// WavWriter writes interleaved, so we store interleaved.
let ring_size = (sample_rate as f32 * ring_curr_seconds) as usize * channel_count as usize;
@@ -96,10 +104,56 @@ impl AudioProcessor {
total_processed_samples: 0,
app_handle: Some(app_handle),
last_event_time: std::time::Instant::now(),
system_queue: Arc::new(Mutex::new(std::collections::VecDeque::new())),
recording_mode,
})
}
pub fn process(&mut self, data: &[f32]) {
pub fn process(&mut self, input_data: &[f32]) {
// MIXING LOGIC:
// We have `input_data` (Microphone). We check `system_queue` for System Audio.
// System Audio is hardcoded to 2 channels (Stereo) in sc_audio.rs.
// Microphone `self.channel_count` can be 1 (Mono) or 2 (Stereo).
let mic_channels = self.channel_count as usize;
let mut mixed_data = input_data.to_vec();
let mut max_system_energy = 0.0;
let gain_mic = 1.0;
let gain_sys = 0.8; // Slightly lower system audio to prioritize speaker
if let Ok(mut queue) = self.system_queue.lock() {
let frames = mixed_data.len() / mic_channels;
for f in 0..frames {
// system_queue is always stereo (L, R, L, R...)
if let (Some(l), Some(r)) = (queue.pop_front(), queue.pop_front()) {
let abs_l = l.abs();
let abs_r = r.abs();
let current_sys_max = if abs_l > abs_r { abs_l } else { abs_r };
if current_sys_max > max_system_energy {
max_system_energy = current_sys_max;
}
if mic_channels == 1 {
// Mic is Mono: Mix System L+R down to Mono
let sys_mono = (l + r) / 2.0;
let mixed = (mixed_data[f] * gain_mic) + (sys_mono * gain_sys);
mixed_data[f] = mixed.max(-1.0).min(1.0);
} else {
// Mic is Stereo: Mix L-to-L and R-to-R
let f_start = f * 2;
let mixed_l = (mixed_data[f_start] * gain_mic) + (l * gain_sys);
let mixed_r = (mixed_data[f_start + 1] * gain_mic) + (r * gain_sys);
mixed_data[f_start] = mixed_l.max(-1.0).min(1.0);
mixed_data[f_start + 1] = mixed_r.max(-1.0).min(1.0);
}
}
}
}
let data = &mixed_data;
// 1. Add to Ring Buffer (Interleaved data - Record EVERYTHING)
for &sample in data {
self.ring_buffer[self.ring_pos] = sample;
@@ -108,8 +162,7 @@ impl AudioProcessor {
// 2. Prepare VAD Signal (Mono Mixdown)
// FRESH START LOGIC (v0.2.0):
// We expect standard Stereo Input (BlackHole 2ch).
// No magic 3-channel aggregate.
// We expect standard Stereo Input.
let channels = self.channel_count as usize;
let frame_count = data.len() / channels;
@@ -146,7 +199,6 @@ impl AudioProcessor {
self.vad_buffer.extend_from_slice(&waves_out[0][0..out_len]);
}
}
// Update output buffer usage... logic is tricky with drain.
}
// 4. Process VAD
@@ -155,21 +207,15 @@ impl AudioProcessor {
// Run Detection
let probability = self.vad.predict(vad_chunk.clone());
// Calculate RMS for this chunk to use as fallback/hybrid detection
let sq_sum: f32 = vad_chunk.iter().map(|x| x * x).sum();
let rms = (sq_sum / vad_chunk.len() as f32).sqrt();
let system_is_active = max_system_energy > 0.005; // Lowered to match trigger
let is_speech = probability > 0.9;
// Hybrid VAD: Probability > 0.9 OR RMS > 0.025
// INCREASED THRESHOLDS (v1.1.1):
// Reduced sensitivity to avoid background noise triggering recording.
let is_speech = probability > 0.9 || rms > 0.025;
if is_speech {
if is_speech || system_is_active {
self.is_speech_active = true;
self.last_speech_time = self.total_processed_samples;
}
// Emit VAD event periodically (every 500ms is enough for non-diagnostic mode)
// Emit VAD event periodically
if self.last_event_time.elapsed().as_millis() > 500 {
if let Some(app) = &self.app_handle {
#[derive(Clone, serde::Serialize)]
@@ -183,11 +229,6 @@ impl AudioProcessor {
});
}
self.last_event_time = std::time::Instant::now();
// IMPORTANT: We reset is_speech_active after emitting,
// so we don't latch it forever if the user stops talking.
// However, the main loop sets it to true if current chunk is speech.
// This logic is a bit of a "latch for X ms".
self.is_speech_active = false;
}
}
@@ -195,9 +236,32 @@ impl AudioProcessor {
// 4. Update Hangover and Check Write condition
if self.waiting_for_speech {
if self.is_speech_active {
// TRIGGER CONDITION:
// 1. VAD says speech (Someone is talking)
// 2. AND System Audio has energy (Meaning audio is coming from the PC, i.e., Call started)
// Threshold 0.005 is roughly -46 dBFS; it should cover ringtones/speech easily but ignore silence/hiss.
let system_active = max_system_energy > 0.005;
// Periodically log energy to help debug why meeting mode might not start
if self.last_event_time.elapsed().as_millis() > 2000 && self.recording_mode == "meeting" {
if let Some(app) = &self.app_handle {
emit_log(app, "DEBUG", &format!("Waiting for Meeting... Current System Energy: {:.4} (Threshold: 0.005)", max_system_energy));
}
}
// MODE-SPECIFIC TRIGGER LOGIC:
// "voice" -> Trigger if user speaks (is_speech_active)
// "meeting" -> Trigger ONLY if system audio energy detected (Call starting)
let trigger = if self.recording_mode == "voice" {
self.is_speech_active
} else {
system_active
};
if trigger {
// Trigger Detected!
println!("Auto-Start: Speech detected. Flushing pre-roll...");
println!("Auto-Start: Call detected (SysEnergy: {}). Flushing pre-roll...", max_system_energy);
self.waiting_for_speech = false;
// Flush Ring Buffer (Orderly: from ring_pos to end, then 0 to ring_pos)
@@ -229,7 +293,13 @@ impl AudioProcessor {
// Standard Recording Logic (Active or Hangover)
let time_since_speech = self.total_processed_samples.saturating_sub(self.last_speech_time);
if self.is_speech_active || time_since_speech < self.hangover_samples {
// We write to file if:
// 1. VAD thinks someone is speaking (Mic or System)
// 2. OR System audio energy is currently above threshold (Ensures calls are captured)
// 3. OR we are within the hangover period
let system_is_active = max_system_energy > 0.005;
if self.is_speech_active || system_is_active || time_since_speech < self.hangover_samples {
let mut guard = self.writer.lock().unwrap();
for &sample in data {
let amplitude = i16::MAX as f32;


@@ -15,11 +15,13 @@ mod audio_processor;
use audio_processor::AudioProcessor;
mod auth;
mod email;
mod sc_audio;
// State to hold the active recording stream
struct AppState {
recording_stream: Mutex<Option<cpal::Stream>>,
recording_file_path: Mutex<Option<String>>,
system_capture: Mutex<Option<sc_audio::SystemAudioCapture>>,
}
#[derive(serde::Serialize)]
@@ -35,7 +37,7 @@ struct LogEvent {
timestamp: String,
}
fn emit_log(app: &AppHandle, level: &str, message: &str) {
pub(crate) fn emit_log(app: &AppHandle, level: &str, message: &str) {
let log = LogEvent {
level: level.to_string(),
message: message.to_string(),
@@ -71,8 +73,8 @@ fn get_input_devices() -> Result<Vec<AudioDevice>, String> {
#[tauri::command]
fn start_recording(app: AppHandle, state: State<'_, AppState>, device_id: String, save_path: Option<String>, custom_filename: Option<String>, wait_for_speech: Option<bool>) -> Result<(), String> {
emit_log(&app, "INFO", &format!("Starting recording on device: {}", device_id));
async fn start_recording(app: AppHandle, state: State<'_, AppState>, device_id: String, save_path: Option<String>, custom_filename: Option<String>, wait_for_speech: Option<bool>, mode: String) -> Result<(), String> {
emit_log(&app, "INFO", &format!("Starting recording [Mode: {}] on device: {}", mode, device_id));
let host = cpal::default_host();
// Find device by name (using name as ID)
@@ -85,13 +87,23 @@ fn start_recording(app: AppHandle, state: State<'_, AppState>, device_id: String
// Select the configuration with the MAXIMUM number of channels
// This is crucial for "Hearbit Audio" (Aggregate) which lists 3 channels but might default to 2.
// We want the raw 3 channels to separate Mic (Ch0) from System (Ch1+2).
let supported_configs = device.supported_input_configs().map_err(|e| e.to_string())?;
let config = supported_configs
.max_by_key(|c| c.channels())
.map(|c| c.with_max_sample_rate())
// Select Audio Configuration
// We prioritize 48 kHz because System Audio (ScreenCaptureKit) works best at 48 kHz.
let supported_configs: Vec<_> = device.supported_input_configs().map_err(|e| e.to_string())?.collect();
// Try to find 48kHz specifically
// Note: cpal::SampleRate is likely a type alias for u32 here, so we pass 48000 directly.
let config = supported_configs.iter()
.find(|c| c.min_sample_rate() <= 48000 && c.max_sample_rate() >= 48000)
.map(|c| c.with_sample_rate(48000))
.or_else(|| {
// Fallback: Max sample rate
supported_configs.iter()
.max_by_key(|c| c.channels())
.map(|c| c.with_max_sample_rate())
})
.ok_or("No supported input configurations found")?;
emit_log(&app, "INFO", &format!("Selected Audio Config: {} Channels, {} Hz", config.channels(), config.sample_rate()));
let spec = hound::WavSpec {
@@ -131,10 +143,10 @@ fn start_recording(app: AppHandle, state: State<'_, AppState>, device_id: String
// We pass the writer to it.
let should_wait = wait_for_speech.unwrap_or(false);
if should_wait {
emit_log(&app, "INFO", "Recording started in WAITING mode (buffer-only until speech).");
emit_log(&app, "INFO", &format!("Recording started in WAITING mode (Trigger: {}).", if mode == "voice" { "Speech" } else { "System Audio" }));
}
let processor = AudioProcessor::new(config.sample_rate(), config.channels(), writer.clone(), app.clone(), should_wait)
let processor = AudioProcessor::new(config.sample_rate(), config.channels(), writer.clone(), app.clone(), should_wait, mode)
.map_err(|e| format!("Failed to create AudioProcessor: {}", e))?;
// Wrap processor in Arc<Mutex> so we can share/move it into callback
@@ -145,6 +157,43 @@ fn start_recording(app: AppHandle, state: State<'_, AppState>, device_id: String
let processor = Arc::new(Mutex::new(processor));
let processor_clone = processor.clone();
// --- SYSTEM AUDIO CAPTURE START ---
// Prevent Doubling: If user selected an aggregate device (Hearbit Audio/BlackHole),
// it ALREADY contains system audio. In that case, we don't need internal SCK capture.
let is_aggregate = device_id.contains("Hearbit") || device_id.contains("BlackHole");
if is_aggregate {
emit_log(&app, "INFO", "Aggregate device detected. Disabling internal System Audio Capture to prevent doubling.");
} else {
let mut sys_capture = sc_audio::SystemAudioCapture::new(config.sample_rate());
// Get the queue to share with the capture callback
let queue_clone = {
let p = processor.lock().unwrap();
p.system_queue.clone() // Access the pub field we added
};
let sys_callback = move |data: &[f32]| {
// Push to queue
if let Ok(mut q) = queue_clone.lock() {
q.extend(data.iter());
// Limit queue size to avoid memory leaks if main process loop is slow
while q.len() > 48000 * 5 { // cap at 240,000 samples (~2.5 s of interleaved stereo at 48 kHz)
q.pop_front();
}
}
};
match sys_capture.start(sys_callback).await {
Ok(_) => emit_log(&app, "INFO", "System Audio Capture started."),
Err(e) => emit_log(&app, "WARN", &format!("System Audio Capture failed (Permissions?): {}", e)),
}
*state.system_capture.lock().unwrap() = Some(sys_capture);
}
// --- SYSTEM AUDIO CAPTURE END ---
let app_handle = app.clone();
let err_fn = move |err| {
eprintln!("an error occurred on stream: {}", err);
@@ -206,6 +255,13 @@ fn stop_recording(app: AppHandle, state: State<'_, AppState>) -> Result<String,
// Drop stream to stop recording
{
let mut stream_guard = state.recording_stream.lock().unwrap();
// Also stop System Capture
let mut sys_guard = state.system_capture.lock().unwrap();
if let Some(sys) = sys_guard.as_mut() {
sys.stop();
}
*sys_guard = None;
if stream_guard.is_none() {
return Err("Not recording".to_string());
}
@@ -508,8 +564,9 @@ async fn poll_transcription(app: &AppHandle, client: &reqwest::Client, api_key:
return Err(format!("Download failed: {}", dl_res.status()));
}
} else if status == "failed" || status == "error" {
emit_log(app, "ERROR", &format!("Batch processing failed: {:?}", json));
return Err(format!("Batch processing failed: {:?}", json));
let err_msg = format!("Batch processing failed [Status: {}]. Full Response: {:?}", status, json);
emit_log(app, "ERROR", &err_msg);
return Err(err_msg);
}
// If 'processing' or 'pending', continue loop
}
@@ -804,6 +861,12 @@ fn create_hearbit_audio_device(app: AppHandle) -> Result<String, String> {
}
}
#[tauri::command]
async fn check_screen_recording_permission() -> bool {
sc_audio::check_permissions().await
}
#[tauri::command]
async fn save_text_file(app: AppHandle, path: String, content: String) -> Result<(), String> {
emit_log(&app, "INFO", &format!("Saving text file to: {}", path));
@@ -891,6 +954,7 @@ pub fn run() {
.manage(AppState {
recording_stream: Mutex::new(None),
recording_file_path: Mutex::new(None),
system_capture: Mutex::new(None),
})
.invoke_handler(tauri::generate_handler![
greet,
@@ -904,6 +968,7 @@ pub fn run() {
get_available_models,
open_audio_midi_setup,
create_hearbit_audio_device,
check_screen_recording_permission,
auth::start_auth_flow,
auth::get_calendar_events,
save_text_file,
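
The system-audio callback in `start_recording` keeps its queue bounded so that a slow consumer cannot grow memory without limit. A minimal sketch of that idea follows. `queue_capacity` and `push_bounded` are hypothetical helper names, not functions from this commit. The capacity is computed from sample rate, channel count, and duration, because the queue holds interleaved samples and a fixed sample count therefore covers less time for stereo than for mono.

```rust
use std::collections::VecDeque;

/// Number of interleaved samples needed to hold `seconds` of audio.
pub fn queue_capacity(sample_rate: u32, channels: u16, seconds: u32) -> usize {
    sample_rate as usize * channels as usize * seconds as usize
}

/// Appends `data` and discards the oldest samples beyond `max_len`.
pub fn push_bounded(queue: &mut VecDeque<f32>, data: &[f32], max_len: usize) {
    queue.extend(data.iter().copied());
    let excess = queue.len().saturating_sub(max_len);
    // Dropping the drain iterator removes the drained range.
    drop(queue.drain(..excess));
}
```

For example, `queue_capacity(48_000, 2, 5)` yields 480,000 samples for five seconds of stereo audio at 48 kHz.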

103
src-tauri/src/sc_audio.rs Normal file

@@ -0,0 +1,103 @@
use screencapturekit_sys::{
os_types::rc::Id,
shareable_content::UnsafeSCShareableContent,
content_filter::{UnsafeContentFilter, UnsafeInitParams},
stream_configuration::UnsafeStreamConfiguration,
stream::UnsafeSCStream,
stream_error_handler::UnsafeSCStreamError,
stream_output_handler::UnsafeSCStreamOutput,
cm_sample_buffer_ref::CMSampleBufferRef,
};
pub struct SystemAudioCapture {
stream: Option<Id<UnsafeSCStream>>,
sample_rate: u32,
}
struct AudioOutputWrapper {
callback: Box<dyn Fn(&[f32]) + Send + Sync>,
}
impl UnsafeSCStreamOutput for AudioOutputWrapper {
fn did_output_sample_buffer(&self, sample: Id<CMSampleBufferRef>, of_type: u8) {
if of_type == 1 { // Audio
let buffers = sample.get_av_audio_buffer_list();
for buffer in buffers {
// Buffer data is u8, we usually get F32 from SCK if configured.
// Assuming f32 (Floating Point) based on our config.
// We need to convert [u8] to [f32].
let data_u8 = buffer.data;
let data_f32: &[f32] = unsafe {
std::slice::from_raw_parts(
data_u8.as_ptr() as *const f32,
data_u8.len() / 4,
)
};
(self.callback)(data_f32);
}
}
}
}
struct ErrorHandler;
impl UnsafeSCStreamError for ErrorHandler {
fn handle_error(&self) {
// eprintln!("Stream Error");
}
}
pub async fn check_permissions() -> bool {
UnsafeSCShareableContent::get().is_ok()
}
impl SystemAudioCapture {
pub fn new(sample_rate: u32) -> Self {
Self { stream: None, sample_rate }
}
pub async fn start<F>(&mut self, callback: F) -> Result<(), String>
where F: Fn(&[f32]) + Send + Sync + 'static {
let content = UnsafeSCShareableContent::get().map_err(|_| "Failed to get content".to_string())?;
let displays = content.displays();
let display = displays.first().ok_or("No display found")?;
let filter_init = UnsafeInitParams::Display(display.clone());
let filter = UnsafeContentFilter::init(filter_init);
// 'pixel_format' is an OSType (FourCharCode), which is not imported here.
// Instead of setting it, start from the default configuration and override only the fields we need.
let mut config = UnsafeStreamConfiguration::default();
config.width = 100;
config.height = 100;
config.captures_audio = 1;
config.sample_rate = self.sample_rate;
config.channel_count = 2;
config.excludes_current_process_audio = 0;
let output_wrapper = AudioOutputWrapper {
callback: Box::new(callback),
};
// Convert config to Id<UnsafeStreamConfigurationRef> using Into
let stream = UnsafeSCStream::init(filter, config.into(), ErrorHandler);
stream.add_stream_output(output_wrapper, 1); // 1 = Audio
stream.start_capture().map_err(|_| "Failed to start capture".to_string())?;
self.stream = Some(stream);
Ok(())
}
pub fn stop(&mut self) {
if let Some(stream) = &self.stream {
stream.stop_capture();
}
self.stream = None;
}
}
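
The output handler above reinterprets the raw byte buffer as 32-bit float samples. The following is a minimal sketch of the same idea, written in TypeScript to match the other examples on this page rather than in Rust. The helper name `bytesToFloat32` is hypothetical, and little-endian sample layout is an assumption (it holds for Intel and Apple Silicon Macs). Reading through a `DataView` avoids relying on the buffer being 4-byte aligned, and trailing bytes that do not form a full sample are ignored.

```typescript
// Hypothetical helper, not part of the diff: decode raw bytes into f32 samples.
function bytesToFloat32(bytes: Uint8Array): Float32Array {
  // Only complete 4-byte groups form a sample; leftover bytes are dropped.
  const sampleCount = Math.floor(bytes.byteLength / 4);
  // DataView reads work at any byte offset, so no alignment is required.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    samples[i] = view.getFloat32(i * 4, true); // true = little-endian (assumed)
  }
  return samples;
}
```

The Rust version avoids the copy by casting the pointer, which is only sound if the buffer is suitably aligned and really holds f32 data; the copying variant trades a little speed for not depending on that.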


@@ -1,7 +1,7 @@
{
"$schema": "https://schema.tauri.app/config/2",
"productName": "Hearbit AI",
"version": "1.1.1",
"version": "1.2.1",
"identifier": "com.hearbit-ai.desktop",
"build": {
"beforeDevCommand": "npm run dev",


@@ -1,4 +1,5 @@
import { useState } from 'react';
import { useState, useEffect, useCallback } from 'react';
import { invoke } from '@tauri-apps/api/core';
import { Settings as SettingsIcon } from "lucide-react";
import Settings, { SmtpConfig, AzureConfig } from "./components/Settings";
import Recorder from "./components/Recorder";
@@ -60,6 +61,11 @@ function App() {
return localStorage.getItem('hearbit_selected_model') || 'mixtral';
});
// Daily Backup State
const [dailyBackupEnabled, setDailyBackupEnabled] = useState(() => localStorage.getItem('hearbit_daily_backup_enabled') === 'true');
const [dailyBackupPath, setDailyBackupPath] = useState(() => localStorage.getItem('hearbit_daily_backup_path') || '');
const [lastBackupDate, setLastBackupDate] = useState(() => localStorage.getItem('hearbit_last_backup_date') || '');
const handleModelChange = (model: string) => {
setSelectedModel(model);
localStorage.setItem('hearbit_selected_model', model);
@@ -227,6 +233,7 @@ Thanks!`
return saved ? JSON.parse(saved) : defaultEmailTemplates;
});
const handleSaveSettings = (
newApiKey: string,
newProductId: string,
@@ -234,7 +241,9 @@ Thanks!`
newSavePath: string,
newSmtp: SmtpConfig,
newAzure: AzureConfig,
newEmailTemplates: EmailTemplate[]
newEmailTemplates: EmailTemplate[],
newDailyBackupEnabled: boolean,
newDailyBackupPath: string
) => {
setApiKey(newApiKey);
setProductId(newProductId);
@@ -244,14 +253,20 @@ Thanks!`
setAzureConfig(newAzure);
setEmailTemplates(newEmailTemplates);
localStorage.setItem('infomaniak_api_key', newApiKey);
localStorage.setItem('infomaniak_product_id', newProductId);
localStorage.setItem('infomaniak_prompts', JSON.stringify(newPrompts));
localStorage.setItem('infomaniak_save_path', newSavePath);
setDailyBackupEnabled(newDailyBackupEnabled);
setDailyBackupPath(newDailyBackupPath);
localStorage.setItem('hearbit_api_key', newApiKey);
localStorage.setItem('hearbit_product_id', newProductId);
localStorage.setItem('hearbit_prompts', JSON.stringify(newPrompts));
localStorage.setItem('hearbit_save_path', newSavePath);
localStorage.setItem('hearbit_smtp_config', JSON.stringify(newSmtp));
localStorage.setItem('hearbit_azure_config', JSON.stringify(newAzure));
localStorage.setItem('hearbit_email_templates', JSON.stringify(newEmailTemplates));
localStorage.setItem('hearbit_daily_backup_enabled', String(newDailyBackupEnabled));
localStorage.setItem('hearbit_daily_backup_path', newDailyBackupPath);
setView(lastTab);
};
@@ -332,6 +347,80 @@ Thanks!`
setView('transcription'); // Switch to Transcription view to see content
};
const performBackup = useCallback(async (isAuto = false) => {
try {
if (isAuto && !dailyBackupEnabled) return;
const dataToBackup = {
apiKey,
productId,
prompts,
savePath,
smtp: smtpConfig,
azure: azureConfig,
emailTemplates,
history, // Including history!
// Also include daily backup settings so they persist on restore
dailyBackup: {
enabled: dailyBackupEnabled,
path: dailyBackupPath,
}
};
// Always save as JSON (no encryption)
const content = JSON.stringify(dataToBackup, null, 2);
const dateStr = new Date().toISOString().slice(0, 10);
const fileName = `hearbit_backup_${isAuto ? 'auto_' : ''}${dateStr}.json`;
// Determine path: use specific daily backup path, or general savePath
const targetDir = (isAuto ? dailyBackupPath : savePath) || savePath;
if (!targetDir) {
if (!isAuto) addToast('No backup path configured.', 'error');
return;
}
const fullPath = `${targetDir}/${fileName}`;
await invoke('save_text_file', { path: fullPath, content });
if (isAuto) {
const now = new Date().toISOString();
setLastBackupDate(now);
localStorage.setItem('hearbit_last_backup_date', now);
console.log("Auto-backup completed:", fullPath);
} else {
addToast(`Backup saved to ${fullPath}`, 'success');
}
} catch (e) {
console.error("Backup failed:", e);
if (!isAuto) addToast(`Backup failed: ${e}`, 'error');
}
}, [apiKey, productId, prompts, savePath, smtpConfig, azureConfig, emailTemplates, history, dailyBackupEnabled, dailyBackupPath]);
// Check for Daily Backup on Mount / State Change
useEffect(() => {
if (!dailyBackupEnabled) return;
const check = async () => {
const today = new Date().toISOString().slice(0, 10);
const last = lastBackupDate ? lastBackupDate.slice(0, 10) : '';
if (last !== today) {
// Perform backup
await performBackup(true);
}
};
const timer = setTimeout(() => {
check();
}, 5000); // Check 5s after load to allow state to settle
return () => clearTimeout(timer);
}, [dailyBackupEnabled, lastBackupDate, performBackup]);
return (
@@ -474,6 +563,18 @@ Thanks!`
smtpConfig={smtpConfig}
azureConfig={azureConfig}
emailTemplates={emailTemplates}
// Pass new backup props
dailyBackupEnabled={dailyBackupEnabled}
dailyBackupPath={dailyBackupPath}
lastBackupDate={lastBackupDate}
// Pass history and update callback
history={history}
onHistoryUpdate={(newHistory) => {
setHistory(newHistory);
localStorage.setItem('infomaniak_history', JSON.stringify(newHistory));
}}
/>
)}
</div>
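
The daily backup logic in this file boils down to three small decisions: is a backup due today, what is the file called, and where does it go. A minimal sketch of those decisions as pure functions follows. The function names (`utcDay`, `isBackupDue`, `backupFileName`, `backupTargetDir`) are hypothetical and do not appear in the diff; the expressions inside mirror `performBackup` and the daily check effect.

```typescript
// Hypothetical helpers for illustration; names are not part of the diff.

/** UTC calendar day (YYYY-MM-DD) of a Date, as produced by toISOString(). */
function utcDay(date: Date): string {
  return date.toISOString().slice(0, 10);
}

/** True when the stored last-backup timestamp is not from the current UTC day. */
function isBackupDue(lastBackupIso: string, now: Date): boolean {
  const last = lastBackupIso ? lastBackupIso.slice(0, 10) : '';
  return last !== utcDay(now);
}

/** File name in the same shape performBackup uses. */
function backupFileName(now: Date, isAuto: boolean): string {
  return `hearbit_backup_${isAuto ? 'auto_' : ''}${utcDay(now)}.json`;
}

/** Automatic runs prefer the daily backup path and fall back to savePath. */
function backupTargetDir(isAuto: boolean, dailyBackupPath: string, savePath: string): string {
  return (isAuto ? dailyBackupPath : savePath) || savePath;
}
```

Because `toISOString()` always reports UTC, the "day" used here rolls over at UTC midnight rather than local midnight.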


@@ -60,9 +60,9 @@ const Recorder: React.FC<RecorderProps> = ({
const [isStopping, setIsStopping] = useState(false); // New lock state
const [isPaused, setIsPaused] = useState(false);
const [isWaiting, setIsWaiting] = useState(false); // New state for Auto-Start
const [hasSpeechDetected, setHasSpeechDetected] = useState(false); // New tracking state
const [autoStartEnabled, setAutoStartEnabled] = useState(false); // Toggle state
const [status, setStatus] = useState<string>('Ready to record');
const [selectedDevice, setSelectedDevice] = useState<string>('');
const [selectedPromptId, setSelectedPromptId] = useState<string>('');
@@ -73,11 +73,8 @@ const Recorder: React.FC<RecorderProps> = ({
const [lastSpeechTime, setLastSpeechTime] = useState<number>(Date.now());
const [silenceDuration, setSilenceDuration] = useState(0);
// Filtered devices based on mode
const filteredDevices = devices.filter(d => {
const isVirtual = d.name.toLowerCase().includes('hearbit') || d.name.toLowerCase().includes('blackhole');
return recordingMode === 'meeting' ? isVirtual : !isVirtual;
});
// Show all devices for both modes now (System Audio is captured natively)
const filteredDevices = devices;
useEffect(() => {
loadDevices();
@@ -126,15 +123,18 @@ const Recorder: React.FC<RecorderProps> = ({
const aggregateDev = aliasedDevs.find(d => d.name === 'Hearbit Audio');
const virtualDev = aliasedDevs.find(d => d.name.includes('Hearbit Virtual'));
if (aggregateDev) {
setRecordingMode('meeting');
setSelectedDevice(aggregateDev.id);
} else if (virtualDev) {
setRecordingMode('meeting');
setSelectedDevice(virtualDev.id);
} else {
setRecordingMode('voice');
if (aliasedDevs.length > 0) setSelectedDevice(aliasedDevs[0].id);
if (recordingMode === 'meeting') {
if (aggregateDev) {
setSelectedDevice(aggregateDev.id);
} else if (virtualDev) {
setSelectedDevice(virtualDev.id);
} else if (aliasedDevs.length > 0) {
setSelectedDevice(aliasedDevs[0].id);
}
} else if (aliasedDevs.length > 0) {
// Voice mode: just pick first non-virtual if possible, otherwise first
const physicalMic = aliasedDevs.find(d => !d.name.includes('Hearbit') && !d.name.includes('BlackHole'));
setSelectedDevice(physicalMic ? physicalMic.id : aliasedDevs[0].id);
}
}
} catch (e) {
@@ -163,13 +163,15 @@ const Recorder: React.FC<RecorderProps> = ({
deviceId: targetDeviceId,
savePath: savePath || null,
customFilename: props.recordingSubject || null,
waitForSpeech: autoStartEnabled // Pass the toggle state
waitForSpeech: autoStartEnabled, // Pass the toggle state
mode: recordingMode
});
setIsRecording(true);
setIsPaused(false);
setTranscription('');
setSummary('');
setHasSpeechDetected(false); // Reset check for new session
if (autoStartEnabled) {
setIsWaiting(true);
@@ -189,9 +191,10 @@ const Recorder: React.FC<RecorderProps> = ({
}
};
// Refs for interval access to avoid dependency cycles
// Refs for interval access to avoid dependency cycles and stale closures
const lastSpeechTimeRef = useRef<number>(Date.now());
const isStoppingRef = useRef(false);
const autoStartEnabledRef = useRef(autoStartEnabled);
// Update refs when state changes
useEffect(() => {
@@ -202,6 +205,10 @@ const Recorder: React.FC<RecorderProps> = ({
isStoppingRef.current = isStopping;
}, [isStopping]);
useEffect(() => {
autoStartEnabledRef.current = autoStartEnabled;
}, [autoStartEnabled]);
// 1. Event Listeners Effect (Run ONCE when recording starts)
useEffect(() => {
let unlistenVAD: () => void;
@@ -215,15 +222,16 @@ const Recorder: React.FC<RecorderProps> = ({
unlistenVAD = await listen<{ is_speech: boolean, probability: number }>('vad-event', (event) => {
if (event.payload.is_speech) {
setLastSpeechTime(Date.now());
lastSpeechTimeRef.current = Date.now(); // Update ref immediately
lastSpeechTimeRef.current = Date.now();
setSilenceDuration(0);
setHasSpeechDetected(true); // Track positive speech
}
});
// Auto-Start Trigger Listener
unlistenTrigger = await listen('auto-recording-triggered', () => {
console.log("Auto-Start Triggered from Backend!");
// Only trigger if we are actually waiting
setHasSpeechDetected(true); // Trigger counts as speech
setIsWaiting((prev) => {
if (prev) {
addToast("Audio detected! Recording started.", 'success', 4000);
@@ -264,7 +272,7 @@ const Recorder: React.FC<RecorderProps> = ({
// AUTO STOP Logic
// Use Ref to get LATEST visibility instantly
if (isVisibleRef.current && timeSinceSpeech > 20 && !isStoppingRef.current) {
if (isVisibleRef.current && timeSinceSpeech > 25 && !isStoppingRef.current) {
console.log("Auto-stopping due to silence...");
isStoppingRef.current = true;
addToast('Auto-stopped due to silence', 'info');
@@ -341,134 +349,163 @@ const Recorder: React.FC<RecorderProps> = ({
setIsRecording(false);
setIsPaused(false);
setIsWaiting(false); // Reset waiting state
setTranscription('');
setSummary('');
setHasSpeechDetected(false); // Reset speech-detection state
setStatus('Saving recording...');
const filePath = await invoke<string>('stop_recording');
// Wait a moment for file flush (safety)
await new Promise(r => setTimeout(r, 500));
// NEW: Check if speech was actually detected during the session
// If we only recorded silence until the 25s Auto-Stop, we shouldn't transcribe.
// IMPORTANT: Check applies to BOTH 'voice' and 'meeting' modes to prevent "Batch Null" errors on false triggers.
if (!hasSpeechDetected) {
console.log("No speech detected during recording. Skipping transcription.");
addToast("Recording discarded (No speech/audio detected)", 'info');
setStatus('Ready to record'); // Reset status text
// Confirm recording saved
addToast(`Recording saved locally: ${filePath.split('/').pop()}`, 'success', 3000);
setStatus('Converting to MP3...');
// If auto-start is on, we just loop back (in finally block).
// But we skip the expensive/failing API call.
} else {
// Small delay to show the "saved" message
await new Promise(r => setTimeout(r, 500));
// Wait a moment for file flush (safety)
await new Promise(r => setTimeout(r, 500));
// Convert WAV to MP3 for smaller size
const mp3Path = await invoke<string>('convert_to_mp3', { wavPath: filePath });
// Confirm recording saved
addToast(`Recording saved locally: ${filePath.split('/').pop()}`, 'success', 3000);
setStatus('Converting to MP3...');
// Get file size to check if chunking needed
interface AudioMetadata { duration: number; size: number; format: string; }
const metadata = await invoke<AudioMetadata>('get_audio_metadata', { filePath: mp3Path });
const sizeMB = metadata.size / (1024 * 1024);
// Small delay to show the "saved" message
await new Promise(r => setTimeout(r, 500));
let transText = '';
// Convert WAV to MP3 for smaller size
const mp3Path = await invoke<string>('convert_to_mp3', { wavPath: filePath });
// Check if chunking needed (only for Meeting mode and large files)
if (recordingMode === 'meeting' && sizeMB >= 18) {
// CHUNKING PATH for large meetings
setStatus(`Large file (${sizeMB.toFixed(1)}MB). Splitting into chunks...`);
const chunks = await invoke<string[]>('chunk_audio', {
filePath: mp3Path,
chunkMinutes: 10
});
// Get file size to check if chunking needed
interface AudioMetadata { duration: number; size: number; format: string; }
const metadata = await invoke<AudioMetadata>('get_audio_metadata', { filePath: mp3Path });
const sizeMB = metadata.size / (1024 * 1024);
addToast(`Processing ${chunks.length} chunks...`, 'info', 4000);
let transText = '';
let allTranscriptions: string[] = [];
// Check if chunking needed (only for Meeting mode and large files)
if (recordingMode === 'meeting' && sizeMB >= 18) {
// CHUNKING PATH for large meetings
setStatus(`Large file (${sizeMB.toFixed(1)}MB). Splitting into chunks...`);
const chunks = await invoke<string[]>('chunk_audio', {
filePath: mp3Path,
chunkMinutes: 10
});
for (let i = 0; i < chunks.length; i++) {
setStatus(`Transcribing chunk ${i + 1}/${chunks.length}...`);
const chunkText = await invoke<string>('transcribe_audio', {
filePath: chunks[i],
addToast(`Processing ${chunks.length} chunks...`, 'info', 4000);
let allTranscriptions: string[] = [];
for (let i = 0; i < chunks.length; i++) {
setStatus(`Transcribing chunk ${i + 1}/${chunks.length}...`);
const chunkText = await invoke<string>('transcribe_audio', {
filePath: chunks[i],
apiKey,
productId
});
allTranscriptions.push(chunkText);
}
// Merge transcriptions
transText = allTranscriptions.join('\n\n--- Next Segment ---\n\n');
addToast('All chunks transcribed successfully!', 'success', 3000);
} else {
// NORMAL PATH for small files
setStatus('Transcribing (Infomaniak Whisper)...');
transText = await invoke<string>('transcribe_audio', {
filePath: mp3Path,
apiKey,
productId
});
allTranscriptions.push(chunkText);
}
// Merge transcriptions
transText = allTranscriptions.join('\n\n--- Next Segment ---\n\n');
addToast('All chunks transcribed successfully!', 'success', 3000);
} else {
// NORMAL PATH for small files
setStatus('Transcribing (Infomaniak Whisper)...');
transText = await invoke<string>('transcribe_audio', {
filePath: mp3Path,
apiKey,
productId
});
}
setTranscription(transText);
setTranscription(transText);
// Check if transcription is empty or just whitespace
if (!transText || transText.trim().length === 0) {
setStatus('Done (No speech detected)');
setTranscription('(No speech detected. Check your microphone settings.)');
setTimeout(() => setStatus('Ready to record'), 3000);
// allow finally block to restart loop
} else {
// Logic continues...
// Check if transcription is empty or just whitespace
if (!transText || transText.trim().length === 0) {
setStatus('Done (No speech detected)');
setTranscription('(No speech detected. Check your microphone settings.)');
setTimeout(() => setStatus('Ready to record'), 3000);
return;
}
// Find selected prompt content - SMART SELECTION
let activePrompt = prompts.find(p => p.id === selectedPromptId);
// Find selected prompt content - SMART SELECTION
let activePrompt = prompts.find(p => p.id === selectedPromptId);
// Smart Auto-Select based on keywords
const lowerText = transText.toLowerCase();
let bestMatchId = selectedPromptId;
let maxMatches = 0;
// Smart Auto-Select based on keywords
const lowerText = transText.toLowerCase();
let bestMatchId = selectedPromptId;
let maxMatches = 0;
for (const p of prompts) {
if (!p.keywords) continue;
let matches = 0;
for (const kw of p.keywords) {
if (lowerText.includes(kw.toLowerCase())) {
matches++;
for (const p of prompts) {
if (!p.keywords) continue;
let matches = 0;
for (const kw of p.keywords) {
if (lowerText.includes(kw.toLowerCase())) {
matches++;
}
}
if (matches > maxMatches) {
maxMatches = matches;
bestMatchId = p.id;
}
}
}
if (matches > maxMatches) {
maxMatches = matches;
bestMatchId = p.id;
if (bestMatchId !== selectedPromptId) {
const newPrompt = prompts.find(p => p.id === bestMatchId);
if (newPrompt) {
console.log(`Smart Select: Switched to '${newPrompt.name}' with ${maxMatches} matches.`);
setStatus(`Smart Select: Using "${newPrompt.name}"...`);
addToast(`Smart Select: Switched to "${newPrompt.name}"`, 'success', 4000);
activePrompt = newPrompt;
}
}
const promptContent = activePrompt ? activePrompt.content : "Summarize this.";
setStatus(`Summarizing (${selectedModel})...`);
const sumText = await invoke<string>('summarize_text', {
text: transText,
apiKey,
productId,
prompt: promptContent,
model: selectedModel
});
setSummary(sumText);
// Auto-save to history
onSaveToHistory(transText, sumText);
setStatus('Done!');
addToast('Transcription & Summary complete!', 'success', 4000);
onRecordingComplete(); // Auto-switch tab
setTimeout(() => setStatus('Ready to record'), 3000);
}
}
if (bestMatchId !== selectedPromptId) {
const newPrompt = prompts.find(p => p.id === bestMatchId);
if (newPrompt) {
console.log(`Smart Select: Switched to '${newPrompt.name}' with ${maxMatches} matches.`);
setStatus(`Smart Select: Using "${newPrompt.name}"...`);
addToast(`Smart Select: Switched to "${newPrompt.name}"`, 'success', 4000);
activePrompt = newPrompt;
// Optional: Update UI selection? setSelectedPromptId(bestMatchId);
// Let's verify with user preference? For now, we override as "Magic".
}
}
const promptContent = activePrompt ? activePrompt.content : "Summarize this.";
setStatus(`Summarizing (${selectedModel})...`);
const sumText = await invoke<string>('summarize_text', {
text: transText,
apiKey,
productId,
prompt: promptContent,
model: selectedModel
});
setSummary(sumText);
// Auto-save to history
onSaveToHistory(transText, sumText);
setStatus('Done!');
addToast('Transcription & Summary complete!', 'success', 4000);
onRecordingComplete(); // Auto-switch tab
setTimeout(() => setStatus('Ready to record'), 3000);
} catch (e) {
console.error(e);
setStatus(`Error: ${e}`);
addToast(`Error processing: ${e}`, 'error');
} finally {
setIsStopping(false);
// AUTO-RESTART LOGIC
// Use REF to get the latest state (fix for "starts again even if I uncheck")
if (autoStartEnabledRef.current) {
console.log("Auto-Start enabled: Restarting listener loop...");
// Short delay to ensure backend cleanup
setTimeout(() => {
// Double check ref before restarting
if (autoStartEnabledRef.current) {
startRecording();
}
}, 1000);
}
}
};
@@ -634,12 +671,20 @@ const Recorder: React.FC<RecorderProps> = ({
</div>
<div className="flex flex-col gap-2 mt-2 w-full">
{recordingMode === 'meeting' && filteredDevices.length === 0 && (
{recordingMode === 'meeting' && (
<button
onClick={onOpenSettings}
onClick={async () => {
const allowed = await invoke<boolean>('check_screen_recording_permission');
if (allowed) {
addToast('System Audio Permission: GRANTED ✅', 'success');
} else {
addToast('System Audio Permission: MISSING ❌. Please enable in System Settings -> Privacy -> Screen Recording', 'error', 5000);
// Open Settings?
}
}}
className="text-xs bg-primary/10 text-primary hover:bg-primary/20 w-full text-center border border-primary/20 rounded p-2 mb-2 font-semibold"
>
🪄 Create "Hearbit Audio" Device
🔒 Check Audio Permission
</button>
)}
<button
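
The "Smart Select" step in this file picks the prompt whose keywords occur most often in the transcript and keeps the user's selection when nothing matches. A minimal sketch of that rule as a standalone function follows. The `Prompt` interface is an assumption inferred from the fields the diff reads (`id`, `name`, `content`, `keywords`), and `selectPromptByKeywords` is a hypothetical name.

```typescript
// Assumed shape, inferred from the fields used in the diff.
interface Prompt {
  id: string;
  name: string;
  content: string;
  keywords?: string[];
}

// Hypothetical helper: returns the id of the best-matching prompt and its match count.
function selectPromptByKeywords(
  prompts: Prompt[],
  transcript: string,
  selectedId: string
): { id: string; matches: number } {
  const lowerText = transcript.toLowerCase();
  let bestId = selectedId; // Fall back to the user's selection.
  let maxMatches = 0;
  for (const p of prompts) {
    if (!p.keywords) continue;
    // Count how many of this prompt's keywords occur in the transcript.
    const matches = p.keywords.filter(kw => lowerText.includes(kw.toLowerCase())).length;
    // Strict comparison: on a tie, the earlier prompt wins.
    if (matches > maxMatches) {
      maxMatches = matches;
      bestId = p.id;
    }
  }
  return { id: bestId, matches: maxMatches };
}
```

Matching is by substring, so a short keyword can also match inside longer words.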


@@ -1,9 +1,7 @@
import React, { useState, useEffect } from 'react';
import { Save, FolderOpen, Lock, Upload, Download, Eye, EyeOff, Mail, FileText, ScrollText } from 'lucide-react';
import { Save, FolderOpen, Lock, Upload, Download, Mail, FileText, ScrollText } from 'lucide-react';
import { save, open } from '@tauri-apps/plugin-dialog';
// Removed writeTextFile as we use invoke 'save_text_file'
import { invoke } from '@tauri-apps/api/core';
import { encryptData, decryptData } from '../utils/backup';
import EmailTemplateEditor from './EmailTemplateEditor';
import logo from '../assets/logo.png';
@@ -17,6 +15,10 @@ interface SettingsProps {
emailTemplates: EmailTemplate[];
smtpConfig: SmtpConfig;
azureConfig: AzureConfig;
dailyBackupEnabled: boolean;
dailyBackupPath: string;
lastBackupDate: string;
history: any[];
onSave: (
apiKey: string,
productId: string,
@@ -24,8 +26,11 @@ interface SettingsProps {
savePath: string,
smtp: SmtpConfig,
azure: AzureConfig,
emailTemplates: EmailTemplate[]
emailTemplates: EmailTemplate[],
dailyBackupEnabled: boolean,
dailyBackupPath: string
) => void;
onHistoryUpdate: (history: any[]) => void;
onClose: () => void;
}
@@ -52,14 +57,10 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
const [localEmailTemplates, setLocalEmailTemplates] = useState<EmailTemplate[]>(props.emailTemplates); // New state
const [localSmtp, setLocalSmtp] = useState<SmtpConfig>(props.smtpConfig);
const [localAzure, setLocalAzure] = useState<AzureConfig>(props.azureConfig);
const [localDailyBackupEnabled, setLocalDailyBackupEnabled] = useState(props.dailyBackupEnabled);
const [localDailyBackupPath, setLocalDailyBackupPath] = useState(props.dailyBackupPath);
const [statusIdx, setStatusIdx] = useState<string | null>(null);
// Backup & Restore State
const [backupPassword, setBackupPassword] = useState('');
const [showPassword, setShowPassword] = useState(false);
const [isImportModalOpen, setIsImportModalOpen] = useState(false);
const [importFileContent, setImportFileContent] = useState<string | null>(null);
// Email Template Editor State
const [editingTemplate, setEditingTemplate] = useState<EmailTemplate | null>(null);
const [isEmailEditorOpen, setIsEmailEditorOpen] = useState(false);
@@ -133,7 +134,17 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
};
const handleSave = () => {
onSave(localApiKey, localProductId, localPrompts, localSavePath, localSmtp, localAzure, localEmailTemplates);
onSave(
localApiKey,
localProductId,
localPrompts,
localSavePath,
localSmtp,
localAzure,
localEmailTemplates,
localDailyBackupEnabled,
localDailyBackupPath
);
onClose();
};
@@ -154,10 +165,6 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
};
const handleExport = async () => {
if (!backupPassword) {
setStatusIdx('Error: Password required to encrypt backup.');
return;
}
try {
const data = {
apiKey: localApiKey,
@@ -165,21 +172,28 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
prompts: localPrompts,
savePath: localSavePath,
smtp: localSmtp,
azure: localAzure
azure: localAzure,
emailTemplates: localEmailTemplates,
history: props.history,
dailyBackup: {
enabled: localDailyBackupEnabled,
path: localDailyBackupPath,
}
};
const encrypted = await encryptData(data, backupPassword);
// Always save as JSON (no encryption)
const content = JSON.stringify(data, null, 2);
const filePath = await save({
defaultPath: `hearbit_backup_${new Date().toISOString().slice(0, 10)}.conf`,
defaultPath: `hearbit_backup_${new Date().toISOString().slice(0, 10)}.json`,
filters: [{
name: 'Hearbit Config',
extensions: ['conf']
extensions: ['json']
}]
});
if (filePath) {
// Use backend invoke to write file (bypasses fs scope issues)
await invoke('save_text_file', { path: filePath, content: encrypted });
await invoke('save_text_file', { path: filePath, content });
setStatusIdx(`Configuration exported to: ${filePath}`);
}
} catch (e) {
@@ -199,41 +213,39 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
const reader = new FileReader();
reader.onload = (event) => {
if (event.target?.result) {
setImportFileContent(event.target.result as string);
setIsImportModalOpen(true);
setBackupPassword('');
const content = event.target.result as string;
// Directly import without password modal since we don't use encryption
try {
const data = JSON.parse(content);
if (data.apiKey) setLocalApiKey(data.apiKey);
if (data.productId) setLocalProductId(data.productId);
if (data.prompts) setLocalPrompts(data.prompts);
if (data.emailTemplates) setLocalEmailTemplates(data.emailTemplates);
if (data.savePath) setLocalSavePath(data.savePath);
if (data.smtp) setLocalSmtp(data.smtp);
if (data.azure) setLocalAzure(data.azure);
if (data.dailyBackup) {
if (data.dailyBackup.enabled !== undefined) setLocalDailyBackupEnabled(data.dailyBackup.enabled);
if (data.dailyBackup.path) setLocalDailyBackupPath(data.dailyBackup.path);
}
// Import history!
if (data.history && Array.isArray(data.history)) {
props.onHistoryUpdate(data.history);
}
setStatusIdx('Configuration imported! Click Save to apply.');
} catch (error) {
console.error(error);
setStatusIdx(`Import failed: ${error}`);
}
}
};
reader.readAsText(file);
e.target.value = '';
};
const confirmImport = async () => {
if (!backupPassword) {
setStatusIdx('Error: Password required to decrypt.');
return;
}
if (!importFileContent) return;
try {
const data = await decryptData(importFileContent, backupPassword);
if (data.apiKey) setLocalApiKey(data.apiKey);
if (data.productId) setLocalProductId(data.productId);
if (data.prompts) setLocalPrompts(data.prompts);
if (data.emailTemplates) setLocalEmailTemplates(data.emailTemplates);
if (data.savePath) setLocalSavePath(data.savePath);
if (data.smtp) setLocalSmtp(data.smtp);
if (data.azure) setLocalAzure(data.azure);
setStatusIdx('Configuration imported! Click Save to apply.');
setIsImportModalOpen(false);
setImportFileContent(null);
} catch (e) {
console.error(e);
setStatusIdx('Import failed: Wrong password or corrupted file.');
}
};
const handleCreateDevice = async () => {
try {
@@ -257,49 +269,6 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
return (
<div className="flex flex-col h-full bg-background font-mono text-sm relative">
{/* Import Password Modal */}
{isImportModalOpen && (
<div className="absolute inset-0 z-50 bg-black/50 flex items-center justify-center p-4">
<div className="bg-background border border-border rounded-lg shadow-xl p-6 w-full max-w-sm space-y-4">
<div className="flex items-center gap-2 text-foreground font-semibold">
<Lock size={16} /> Import Configuration
</div>
<p className="text-muted-foreground text-xs">
Enter the password used to encrypt this backup file.
</p>
<div className="relative">
<input
type={showPassword ? "text" : "password"}
value={backupPassword}
onChange={(e) => setBackupPassword(e.target.value)}
placeholder="Backup Password"
className="w-full p-2 pr-8 rounded border border-border bg-secondary text-foreground focus:ring-2 focus:ring-primary outline-none"
/>
<button
onClick={() => setShowPassword(!showPassword)}
className="absolute right-2 top-2.5 text-muted-foreground hover:text-foreground"
>
{showPassword ? <EyeOff size={14} /> : <Eye size={14} />}
</button>
</div>
<div className="flex justify-end gap-2 pt-2">
<button
onClick={() => setIsImportModalOpen(false)}
className="px-3 py-1.5 text-xs font-medium rounded border border-border hover:bg-secondary text-foreground transition-colors"
>
Cancel
</button>
<button
onClick={confirmImport}
disabled={!backupPassword}
className="px-3 py-1.5 text-xs font-medium rounded bg-primary text-primary-foreground hover:bg-primary/90 transition-colors disabled:opacity-50"
>
Decrypt & Import
</button>
</div>
</div>
</div>
)}
{/* Email Template Editor Modal */}
<EmailTemplateEditor
isOpen={isEmailEditorOpen}
@@ -471,7 +440,7 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
onClick={() => removePrompt(prompt.id)}
className="absolute top-2 right-2 text-muted-foreground hover:text-destructive opacity-0 group-hover:opacity-100 transition-opacity text-xs flex items-center gap-1"
>
<EyeOff size={14} /> Remove
Remove
</button>
<input
type="text"
@@ -619,31 +588,13 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
{activeTab === 'backup' && (
<div className="space-y-6 max-w-xl">
{/* Manual Configuration Backup */}
<div className="space-y-4">
<h3 className="text-foreground font-semibold border-b border-border pb-2">Configuration Backup</h3>
<h3 className="text-foreground font-semibold border-b border-border pb-2">Manual Configuration Backup</h3>
<p className="text-xs text-muted-foreground">
Securely export your settings, including API keys and prompts. You must set a password to encrypt the backup file.
Export all your settings, including API keys, prompts, email templates, and history as JSON files.
</p>
<div className="relative">
<label className="block text-xs font-semibold text-muted-foreground mb-1 uppercase tracking-wide">
Encryption Password
</label>
<input
type={showPassword ? "text" : "password"}
value={backupPassword}
onChange={(e) => setBackupPassword(e.target.value)}
placeholder="Enter a strong password"
className="w-full p-2 pr-8 rounded border border-border bg-secondary text-foreground focus:ring-2 focus:ring-primary outline-none text-sm"
/>
<button
onClick={() => setShowPassword(!showPassword)}
className="absolute right-2 top-8 text-muted-foreground hover:text-foreground"
>
{showPassword ? <EyeOff size={14} /> : <Eye size={14} />}
</button>
</div>
<div className="flex gap-4 pt-2">
<button
onClick={handleExport}
@@ -660,12 +611,68 @@ const Settings: React.FC<SettingsProps> = ({ apiKey, productId, prompts, savePat
<input
type="file"
id="import-file-input"
accept=".conf"
accept=".json"
className="hidden"
onChange={handleFileSelect}
/>
</div>
</div>
{/* Daily Automated Backup */}
<div className="space-y-4 border-t border-border pt-6">
<h3 className="text-foreground font-semibold border-b border-border pb-2">Daily Automated Backup</h3>
<p className="text-xs text-muted-foreground">
Automatically back up your configuration once per day to prevent data loss. Backups are saved as JSON files.
</p>
<div className="flex items-center gap-2">
<input
type="checkbox"
id="enable-daily-backup"
checked={localDailyBackupEnabled}
onChange={(e) => setLocalDailyBackupEnabled(e.target.checked)}
className="w-4 h-4"
/>
<label htmlFor="enable-daily-backup" className="text-sm text-foreground cursor-pointer">
Enable daily automated backup
</label>
</div>
{localDailyBackupEnabled && (
<div>
<label className="block text-xs font-semibold text-muted-foreground mb-1 uppercase tracking-wide">
Backup Location
</label>
<div className="flex gap-2">
<input
type="text"
value={localDailyBackupPath}
onChange={(e) => setLocalDailyBackupPath(e.target.value)}
placeholder="Leave empty to use recordings folder"
className="flex-1 p-2 rounded border border-border bg-secondary text-foreground focus:ring-2 focus:ring-primary outline-none text-sm"
/>
<button
onClick={async () => {
try {
const selected = await open({ directory: true, multiple: false });
if (selected && typeof selected === 'string') {
setLocalDailyBackupPath(selected);
}
} catch (e) {
console.error(e);
}
}}
className="p-2 aspect-square flex items-center justify-center bg-secondary hover:bg-secondary/80 border border-border rounded text-foreground transition-all active:scale-95"
>
<FolderOpen size={16} />
</button>
</div>
<p className="text-[10px] text-muted-foreground mt-1">
Last backup: {props.lastBackupDate || 'Never'}
</p>
</div>
)}
</div>
</div>
)}
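
With encryption removed, the import path now calls `JSON.parse` on whatever file the user selects and applies any truthy field it finds. A minimal sketch of a stricter parse step follows, assuming one wants to type-check fields before applying them. The `BackupData` interface and `parseBackup` function are hypothetical; only a subset of the backup fields is shown.

```typescript
// Hypothetical subset of the backup format written by handleExport / performBackup.
interface BackupData {
  apiKey?: string;
  productId?: string;
  savePath?: string;
  history?: unknown[];
  dailyBackup?: { enabled?: boolean; path?: string };
}

// Hypothetical helper: parse a backup file and keep only fields of the expected type.
function parseBackup(content: string): BackupData {
  const raw: unknown = JSON.parse(content); // Throws on invalid JSON.
  if (typeof raw !== 'object' || raw === null || Array.isArray(raw)) {
    throw new Error('Backup file must contain a JSON object');
  }
  const obj = raw as Record<string, unknown>;
  const result: BackupData = {};
  if (typeof obj.apiKey === 'string') result.apiKey = obj.apiKey;
  if (typeof obj.productId === 'string') result.productId = obj.productId;
  if (typeof obj.savePath === 'string') result.savePath = obj.savePath;
  if (Array.isArray(obj.history)) result.history = obj.history;
  const daily = obj.dailyBackup;
  if (typeof daily === 'object' && daily !== null) {
    const d = daily as Record<string, unknown>;
    const dailyBackup: { enabled?: boolean; path?: string } = {};
    if (typeof d.enabled === 'boolean') dailyBackup.enabled = d.enabled;
    if (typeof d.path === 'string') dailyBackup.path = d.path;
    result.dailyBackup = dailyBackup;
  }
  return result;
}
```

Fields with an unexpected type are silently skipped here; reporting them to the user would be an equally reasonable choice.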