🎙️ BisonNotes AI – Complete User Guide
Welcome to BisonNotes AI! This comprehensive guide will walk you through every aspect of using the app, from basic recording to advanced AI configuration.
🆕 Recent Updates — Version 1.5
- Mistral AI Transcription: New cloud transcription engine using Voxtral Mini with speaker diarization support ($0.003/min)
- Share Extension: Import audio files directly from Voice Memos, Files, and other apps via the iOS share sheet
- Combine Recordings: Merge two separate recordings into a single continuous audio file
- PDF Export Redesign: Professional three-pane header with metadata, local map, and regional map views; pagination with page numbers; dedicated tasks and reminders sections
- On-Device AI Default: On-Device AI is now the default for new installs on supported devices (6GB+ RAM)
- Updated AI Models: Gemini 3 Pro/Flash Preview, Mistral Large 25.12, Claude Sonnet 4.5 (auto-migrated from Sonnet 4)
- Share Extension Wake-Up: Darwin notification integration ensures the main app imports shared files immediately, even when backgrounded
📱 Getting Started
First Launch Setup
- Install the App: Download BisonNotes AI from the App Store
- Simple Settings Welcome Screen: Upon first launch, you’ll see a streamlined setup screen with three main options:
🎯 Initial Setup Options
- OpenAI (Cloud):
- Cloud-based transcription and AI summaries
- Requires OpenAI API key (enter during setup)
- Most powerful and capable option
- Pay-per-use pricing
- Best for: High-quality results, advanced features
- On-Device AI:
- Private, on-device AI processing
- No data leaves your device
- Best for recordings under 60 minutes
- Requires download of AI summary models (2-3GB each) and On Device transcription model (150-520MB)
- May be less accurate than cloud services
- After selecting, you’ll configure On Device transcription and download AI models
- Device requirements:
- Transcription: iOS 17.0+, 4GB+ RAM (most modern devices)
- AI Summary: iPhone 15 Pro or iPhone 16+, iOS 18.1+
- Advanced & Other Options:
- Skip initial setup and configure later
- Access to all available engines:
- OpenAI Compatible – Use LiteLLM, llama.cpp, or similar proxies
- Google AI Studio – Advanced Gemini AI processing
- AWS Bedrock – Enterprise-grade Claude AI
- Mistral AI – Advanced AI processing with Mistral models
- Selecting this option immediately opens the advanced settings page
💡 Tip: The simple settings page automatically detects your current configuration. If you’ve configured something in advanced settings that doesn’t match the simple options, it will automatically show “Advanced & Other Options”.
- Location Permission: The app will ask for location access if you enabled location tracking:
- “Allow While Using App”: Recommended – captures location during recording
- “Don’t Allow”: You can still manually add locations later
- Automatic Migration: On first launch, the app will automatically scan for any existing audio files and migrate them into the database
Your First Recording
- Start Recording: Tap the large microphone button on the main screen
- Microphone Permission: On your first recording, iOS will ask for microphone access:
- Tap “OK”: Required for the app to function
- If denied, you can re-enable it in Settings → Privacy & Security → Microphone
- Recording Status: You’ll see:
- Red recording indicator
- Live timer showing duration
- Location indicator (if location services enabled)
- Stop Recording: Tap the stop button to end recording
- Background Recording: The app continues recording even when minimized or phone is locked
Generate Your First Transcript
- Access Recording: After stopping your recording, you’ll see it in the main recordings list
- Start Transcription:
- Tap on your recording to open the detail view
- Tap the “Generate Transcript” button
- The app will process your audio using your selected AI engine
- Transcription Progress: You’ll see a progress indicator showing:
- Processing status
- Time remaining estimate
- You can continue using the app while it processes in the background
- View Results: Once complete, you’ll see the full transcript with:
- Editable text
- Time stamps (if supported by your AI engine)
- Confidence indicators
Generate Your First Summary
- Prerequisites: You must have a transcript before generating a summary
- Access Summary Options: In the recording detail view, tap “Generate Summary”
- AI Processing: The app will analyze your transcript and create:
- Enhanced Summary: Main content overview
- Action Items: Extracted tasks with priority levels
- Reminders: Time-sensitive items with urgency indicators
- Alternative Titles: AI-generated recording names
- Review Results: The summary view shows:
- Expandable sections for each content type
- Visual priority and confidence indicators
- Interactive maps (if location data available)
- Integration options for tasks and reminders
iCloud Sync Setup
🔄 When Does This Appear? After generating your first successful summary, the app will prompt you about iCloud syncing.
- iCloud Prompt: You’ll see a dialog asking about iCloud sync for summaries
- Choose Your Option:
- “Enable iCloud Sync”: Summaries sync across all your devices
- Requires iCloud to be enabled on your device
- Uses your iCloud storage quota
- Provides backup and cross-device access
- “Keep Local Only”: Summaries stay on this device only
- No cloud storage used
- Better for privacy-sensitive content
- Can be changed later in Settings
- Configuration: If you enable iCloud, the app will automatically configure CloudKit sync
Managing and Deleting Recordings
- Access Recording Options:
- Long press on any recording in the main list
- Or tap the recording and look for the “…” menu
- Deletion Options: When you tap “Delete”, you’ll see comprehensive options:
🗑️ What Gets Deleted?
- Audio File Only: Keeps transcript and summary, removes audio
- Best for: Saving storage while keeping the processed content
- Note: You can’t regenerate transcript or play audio after this
- Everything: Removes audio file, transcript, and summary
- Complete removal from device and iCloud (if syncing)
- Cannot be undone
- Summary Only: Keeps audio and transcript, removes AI-generated summary
- Useful if you want to regenerate summary with different AI engine
- Can regenerate summary anytime
⚠️ Important: Deletion is permanent. Make sure you have backups if needed.
- Confirmation: The app will ask you to confirm the deletion to prevent accidents
- Background Cleanup: After deletion, the app automatically:
- Removes files from device storage
- Updates iCloud sync (if enabled)
- Cleans up any orphaned data
- Updates the recordings list
🎙️ Recording Features
iPhone Action Button Integration
📱 Available On: iPhone 15 Pro, iPhone 15 Pro Max, iPhone 16 Pro, iPhone 16 Pro Max, and future iPhone Pro models with Action Button
BisonNotes AI supports the iPhone Action Button, allowing you to quickly start recording without opening the app first. This is perfect for capturing thoughts, meetings, or voice notes instantly.
How to Configure the Action Button
- Open Settings: On your iPhone, open the Settings app
- Navigate to Action Button: Scroll down and tap “Action Button”
- Select Shortcut: Choose “Shortcut” as the Action Button function
- Choose BisonNotes AI:
- Tap “Choose a Shortcut”
- Search for “Start Recording” or “BisonNotes AI”
- Select “Start Recording” from BisonNotes AI
- Done: Press the Action Button to test – it should launch BisonNotes AI and start recording automatically!
What Happens When You Press the Action Button
- App Opens: BisonNotes AI launches automatically (even if the app was closed)
- Switches to Recordings Tab: The app automatically navigates to the main recording screen
- Recording Starts Immediately: Recording begins automatically without needing to tap the microphone button
- Background Recording: Once started, recording continues even if you switch to another app or lock your phone
💡 Pro Tip: The Action Button works even when your phone is locked! Press the Action Button, and BisonNotes AI will launch and start recording. This makes it perfect for quick voice notes without unlocking your phone.
Location Tracking
- Automatic: GPS location is captured with each recording
- Manual: Add or edit location later in summary view
- Privacy: Location tracking can be disabled in settings
Import Existing Audio
- Tap “Import Audio Files” on the main screen
- Select audio files from your device
- Files are automatically added to your recordings library
Import via Share Extension
📎 Share from Other Apps: You can import audio files directly from Voice Memos, Files, and other apps using the iOS share sheet — no need to export and re-import manually.
- Open the Source App: Open Voice Memos, Files, or any app containing the audio file you want to import
- Tap Share: Use the standard iOS share button and select “BisonNotes AI” from the share sheet
- Automatic Import: The file is saved to a shared container and BisonNotes AI opens automatically to import it
- Background Import: If BisonNotes AI is already running in the background, it will detect the new file immediately via a Darwin notification and import it without you needing to switch apps
📋 Supported File Types
- Audio: M4A, MP3, WAV, CAF, AIFF, AIF
- Documents: TXT, MD, PDF, DOC, DOCX
Combining Recordings
🔗 When to Combine: Use this feature to merge two separate recordings into one continuous file. This is especially useful if your recording was interrupted (e.g., microphone disconnected) and you want to create a single combined recording.
- Access Recordings List:
- Go to the “Recordings” tab
- Tap “Select” button in the top right
- Select Two Recordings:
- Tap the checkbox next to the first recording you want to combine
- Tap the checkbox next to the second recording
- You can only select two recordings at a time
- Combine Recordings:
- Once two recordings are selected, a “Combine” button appears
- Tap “Combine” to open the combination interface
- Choose Recording Order:
📋 Order Selection
- The app automatically recommends which recording should be first based on recording dates
- You’ll see a “Recommended” badge on the suggested first recording
- Tap the “First” recording card to swap the order if needed
- The preview shows the total combined duration
- Important Requirements:
⚠️ Before Combining
Recordings with transcripts or summaries cannot be combined. You must delete any existing transcripts and/or summaries from both recordings before combining them.
- If either recording has a transcript, you’ll see a message explaining which recording has a transcript
- If either recording has a summary, you’ll see a message explaining which recording has a summary
- Delete the transcripts/summaries from both recordings, then try combining again
Why? Transcripts and summaries are tied to specific audio files. When combining recordings, you’ll need to regenerate transcripts and summaries for the new combined file.
- Complete the Combination:
- Review the combined duration preview
- Tap “Combine Recordings” to merge the files
- The app will create a new combined recording file
- The original recordings remain unchanged
- After Combining:
- The new combined recording appears in your recordings list
- You can generate a new transcript for the combined recording
- You can generate a new summary for the combined recording
- The original two recordings remain available if you need them
💡 Tips for Combining Recordings
- Best Use Case: Combining recordings that were split due to microphone disconnection or app interruption
- Order Matters: Make sure the recordings are in the correct chronological order
- Storage: The combined file will be the sum of both original file sizes
- Processing: After combining, you’ll need to generate new transcripts and summaries for the combined recording
🤖 AI Engine Configuration
Overview
BisonNotes AI supports multiple AI engines for transcription and summarization. Each has different capabilities and requirements.
1. On-Device AI
Type: On-device AI processing using local LLM models
Cost: Free
Privacy: 100% local
Internet: Required only for initial model download
Requirements:
- Transcription: iOS 17.0+, 4GB+ RAM (most modern iPhones and iPads)
- AI Summary: iPhone 15 Pro, iPhone 16 or newer, iOS 18.1+
- Storage: 2-3GB for AI models, 150-520MB for transcription model
Setup:
- Select “On-Device AI” in simple settings
- Download On Device transcription model (Higher Quality or Faster Processing)
- Download one or more AI summary models
- Uses On Device transcription (requires model download)
- Uses downloaded LLM models for AI summaries
Available Models:
- Recommended Models (by device RAM):
- 8GB+ RAM: Granite 4.0 H Tiny (4.3 GB) – Recommended for best quality
- 6GB+ RAM: Granite 4.0 Micro (2.1 GB) – Recommended for fast processing
- 6GB+ RAM: Gemma 3n E2B (3.0 GB) – Good quality, smaller size
- 8GB+ RAM: Gemma 3n E4B (4.5 GB) – Best overall quality
- 6GB+ RAM: Ministral 3B (2.1 GB) – Best for tasks and reminders
- Experimental Models (enable in settings):
- 4GB+ RAM: LFM 2.5 1.2B (731 MB) – Fast, minimal summaries (summary only)
- 4GB+ RAM: Qwen3 1.7B (1.1 GB) – Latest Qwen3 model (summary only)
- 8GB+ RAM: Qwen3 4B (2.7 GB) – Excellent detail extraction
Best for: Privacy-conscious users, offline use, recordings under 60 minutes
Limitations:
- Best for recordings under 60 minutes
- Requires 2-3GB storage per model
- May be less accurate than cloud services
2. OpenAI Integration
Type: Cloud-based AI
Cost: Pay-per-use (very affordable)
Privacy: Data sent to OpenAI
Internet: Required
Best for: High-quality results, advanced features
3. Google AI Studio
Type: Cloud-based AI
Cost: Free tier available, then pay-per-use
Privacy: Data sent to Google
Internet: Required
Available Models:
- Gemini 2.5 Flash (Default): Fast and efficient for most tasks
- Gemini 2.5 Flash Lite: Lightweight variant for quick processing
- Gemini 3 Pro Preview: Advanced reasoning and analysis (Preview)
- Gemini 3 Flash Preview: Fast next-generation model (Preview)
Best for: Balanced performance and cost, with free tier for getting started
4. OpenAI API Compatible
Type: OpenAI-compatible API endpoint
Cost: Varies by provider
Privacy: Depends on provider
Internet: Required (unless local server)
Supported Providers: This single engine option works with multiple providers:
- LiteLLM: Open-source proxy for multiple providers
- llama.cpp: High-performance local LLM inference server
- Nebius: Cloud provider with OpenAI-compatible API
- Groq: Fast inference with OpenAI-compatible API
- Custom Servers: Any OpenAI-compatible endpoint
Setup:
- Go to Settings → AI Settings → OpenAI API Compatible
- Enter your API key (from your chosen provider, or use “no-key” for local servers)
- Set base URL to your provider’s endpoint:
- Groq: https://api.groq.com/v1
- Nebius: Your Nebius endpoint URL
- LiteLLM: Your LiteLLM server URL
- llama.cpp: Your llama.cpp server URL (default: http://localhost:8080)
- Select model (use your provider’s model name)
- Test the connection
Best for: Using LiteLLM, llama.cpp, Nebius, Groq, or other OpenAI-compatible services
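Before entering an endpoint in the app, it can help to confirm from a terminal that it answers an OpenAI-style chat completion request. Here is a minimal sketch assuming a local llama.cpp server on port 8080; the base URL, API key, and model name are placeholders to swap for your provider’s values:

```bash
# Swap the base URL, API key, and model name for your provider's values.
# For a local llama.cpp server, "no-key" (or any placeholder) is fine.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
        "model": "your-model-name",
        "messages": [{"role": "user", "content": "Reply with OK if you can read this."}]
      }'
```

If this returns a JSON response containing a message, the same base URL and key should work in the app.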
5. Mistral AI
Type: Cloud-based AI processing and transcription
Cost: Pay-per-use (summarization varies by model; transcription $0.003/min)
Privacy: Data sent to Mistral AI
Internet: Required
Summarization Models:
- Mistral Large (25.12): Most capable model, 128K context window (Premium tier)
- Mistral Medium (25.08): Balanced performance and cost, 128K context (Standard tier)
- Magistral Medium (25.09): Economy option, 40K context window (Economy tier)
Transcription:
- Voxtral Mini Transcribe: Speech-to-text at $0.003/min with optional speaker diarization
- Supports MP3, MP4, M4A, WAV, FLAC, OGG, WebM
- Automatic language detection or explicit language code
Setup:
- Go to Settings → AI Settings → Mistral AI
- Enter your Mistral API key
- Select summarization model (Large, Medium, or Magistral)
- For transcription: go to Settings → Transcription Settings and select “Mistral AI”
- Test the connection
Best for: Fast, high-quality summaries and affordable cloud transcription with speaker diarization
6. AWS Bedrock
Type: Cloud-based AI
Cost: Pay-per-use
Privacy: Data sent to AWS
Internet: Required
Best for: Enterprise features
7. Ollama
Type: Local LLM server
Cost: Free (requires your own server)
Privacy: 100% local
Internet: Not required for processing
Setup:
- Install Ollama server on your local machine or network
- Go to Settings → AI Settings → Ollama
- Set server URL (default: http://localhost)
- Set port (default: 11434)
- Select model (llama2:7b, qwen3:30b, etc.)
- Test the connection
Best for: Privacy, customizable models, offline use
Setup Instructions for Each Engine
OpenAI Integration
- Get API Key: Visit platform.openai.com
- Create Account: Sign up for an OpenAI account
- Generate API Key: Go to API Keys section and create a new key
- Configure in App:
- Go to Settings → AI Settings → OpenAI
- Enter your API key
- Select your preferred model
- Test the connection
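If the in-app connection test fails, it can help to verify the key itself from a terminal. A minimal sketch using OpenAI’s model-listing endpoint (assumes your key is exported as OPENAI_API_KEY):

```bash
# Lists the models your key can access; a 401 response means the key is invalid.
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```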
Available OpenAI Models
| Model | Type | Best For | Tier |
|---|---|---|---|
| GPT-4.1 | Summarization | Most robust and comprehensive analysis with advanced reasoning capabilities | Premium |
| GPT-4.1 Mini | Summarization | Balanced performance and cost, suitable for most summarization tasks (Default) | Standard |
| GPT-4.1 Nano | Summarization | Fastest and most economical for basic summarization needs | Economy |
| GPT-5 Mini | Summarization | Next-generation model with enhanced reasoning and efficiency | Premium |
OpenAI Transcription Models
| Model | Type | Best For |
|---|---|---|
| GPT-4o Transcribe | Transcription | Most robust transcription with GPT-4o model. Supports streaming. |
| GPT-4o Mini Transcribe | Transcription | Cheapest and fastest transcription with GPT-4o Mini model. Supports streaming. Recommended for most use cases. |
| Whisper-1 | Transcription | Legacy transcription with Whisper V2 model. Does not support streaming. |
Google AI Studio Integration
- Get API Key: Visit aistudio.google.com
- Create Account: Sign up for Google AI Studio
- Generate API Key: Create a new API key
- Configure in App:
- Go to Settings → AI Settings → Google AI Studio
- Enter your API key
- Select model:
- Gemini 2.5 Flash (Default): Fast and efficient for most tasks
- Gemini 2.5 Flash Lite: Lightweight variant for quick processing
- Gemini 3 Pro Preview: Advanced reasoning and analysis (Preview)
- Gemini 3 Flash Preview: Fast next-generation model (Preview)
- Test the connection
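To check a Google AI Studio key outside the app, you can query the Gemini API’s model-listing endpoint directly. A minimal sketch; the v1beta path reflects the current public API, and the exact model names returned may differ from the labels above:

```bash
# Lists the Gemini models available to this key (replace YOUR_API_KEY).
curl "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY"
```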
OpenAI API Compatible Integration
📌 Important Note
OpenAI API Compatible is a single engine option that works with multiple providers. You don’t select Nebius, Groq, LiteLLM, or vLLM as separate engines – instead, you configure the “OpenAI API Compatible” option with your chosen provider’s settings.
- Choose Your Provider: Select one of these OpenAI-compatible services:
- LiteLLM: Open-source proxy for multiple providers
- Base URL: Your LiteLLM server URL (e.g., http://localhost:4000/v1)
- Get API key from your LiteLLM configuration
- Documentation: github.com/BerriAI/litellm
- llama.cpp: High-performance local LLM inference server
- Base URL: Your llama.cpp server URL (default: http://localhost:8080)
- API key: Use “no-key” or leave empty for local servers
- Installation: Clone from github.com/ggerganov/llama.cpp, build with make, then run ./server --model <model_file>
- Documentation: See llama.cpp README for full setup instructions
- Alternative: Use llama-cpp-python with pip install 'llama-cpp-python[server]' and run python3 -m llama_cpp.server --model /path/to/model.gguf
- Nebius: Cloud provider with OpenAI-compatible API
- Base URL: Your Nebius endpoint URL
- Get API key from Nebius console
- Groq: Fast inference with OpenAI-compatible API
- Base URL: https://api.groq.com/v1
- Get API key from console.groq.com
- Custom Server: Your own OpenAI-compatible endpoint
- Base URL: Your server’s endpoint URL
- API key: As configured on your server
- Get API Key: Obtain an API key from your chosen provider (if required)
- Configure in App:
- Go to Settings → AI Settings → OpenAI API Compatible
- Enter your API key (from your chosen provider)
- Set base URL to your provider’s endpoint (see examples above)
- Select model (use your provider’s exact model name, e.g., llama-3.1-70b-versatile for Groq)
- Configure temperature and max tokens
- Test the connection
💡 OpenAI Compatible Tips
- Single Engine Option: All providers (Nebius, Groq, LiteLLM, llama.cpp) use the same “OpenAI API Compatible” engine option – just change the base URL and API key
- Base URL Examples:
- Groq: https://api.groq.com/v1
- Local llama.cpp: http://localhost:8080 (default port)
- Local LiteLLM: http://localhost:4000/v1
- Nebius: Your Nebius endpoint URL
- Model Names: Use the exact model name your provider supports (e.g., Groq uses names like llama-3.1-70b-versatile, not gpt-4o)
- Local Servers: For local servers (llama.cpp, LiteLLM), ensure your device can reach the server IP address; a quick reachability check is shown after these tips
- llama.cpp Setup:
- Native server: Clone llama.cpp, build with make, run ./server --model <model.gguf>
- Python server: Install with pip install 'llama-cpp-python[server]', run python3 -m llama_cpp.server --model /path/to/model.gguf
- Default port is 8080; API key can be “no-key” or empty
- Transcription: OpenAI API Compatible is also available for transcription in Settings → Transcription Settings
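A common stumbling block with local servers is that they listen only on localhost, so your phone can never reach them. A minimal sketch assuming the llama.cpp server from the steps above and a hypothetical LAN address of 192.168.1.50; adjust the binary name, model path, IP, and port for your setup:

```bash
# Bind the llama.cpp server to all interfaces so other devices on the
# network can reach it (8080 is the default port).
./server --host 0.0.0.0 --port 8080 --model ./models/your-model.gguf

# From another machine, confirm the OpenAI-compatible endpoint is reachable.
# 192.168.1.50 is a placeholder for your server's LAN IP address.
curl http://192.168.1.50:8080/v1/models
```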
Mistral AI Integration
- Get API Key: Visit console.mistral.ai
- Create Account: Sign up for a Mistral AI account
- Generate API Key: Create a new API key in your account settings
- Configure Summarization:
- Go to Settings → AI Settings → Mistral AI
- Enter your API key
- Select summarization model:
- Mistral Large (25.12): Most capable, 128K context (Premium)
- Mistral Medium (25.08): Balanced performance, 128K context (Standard)
- Magistral Medium (25.09): Economy option, 40K context
- Configure temperature and other parameters
- Test the connection
- Configure Transcription (Optional):
- Go to Settings → Transcription Settings
- Select “Mistral AI” as your transcription engine
- Tap “Configure” to open Mistral transcription settings
- The same API key is shared between summarization and transcription
- Enable speaker diarization if you want speaker labels
- Optionally set a language code for better accuracy
💡 Mistral Transcription Tips
- Cost: Voxtral Mini transcription costs $0.003 per minute of audio
- Diarization: When enabled, the transcription identifies and labels different speakers
- Language: Leave empty for auto-detection, or specify a code (e.g., “en”, “fr”, “es”) for better accuracy
- Supported Formats: MP3, MP4, M4A, WAV, FLAC, OGG, WebM
- Large Files: Files over 24MB are automatically chunked for processing
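If transcription or summarization requests fail, a quick way to rule out the API key is to hit Mistral’s model-listing endpoint from a terminal. A minimal sketch assuming your key is exported as MISTRAL_API_KEY; see docs.mistral.ai for the transcription-specific endpoints and diarization parameters:

```bash
# Lists the Mistral models available to your key; a 401 means the key is invalid.
curl https://api.mistral.ai/v1/models \
  -H "Authorization: Bearer $MISTRAL_API_KEY"
```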
Ollama Integration
- Install Ollama: Visit ollama.com and install Ollama on your local machine or server
- Download Recommended Models:
```bash
ollama pull qwen3:30b
ollama pull gpt-oss:20b
ollama pull mistral-small3.2
```
- Configure in App:
- Go to Settings → AI Settings → Ollama
- Set server URL (default: http://localhost for local, or your server’s IP address)
- Set port (default: 11434)
- Run a Model Scan: Tap the refresh button to scan your Ollama server for available models
- Select your preferred model:
- Recommended: qwen3:30b, gpt-oss:20b, mistral-small3.2
- Available models will be fetched from your Ollama server
- Configure temperature and max tokens
- Test the connection
💡 Ollama Tips
- Local Server: Use http://localhost:11434 if Ollama is running on the same device
- Network Server: Use your server’s IP address (e.g., http://192.168.1.100:11434)
- Model Scan: Always run a model scan after connecting to fetch the list of models installed on your Ollama server (a quick check from a terminal is shown after these tips)
- Performance: Larger models provide better results but require more RAM and processing time
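If the in-app model scan comes back empty, check what the Ollama server itself reports. A minimal sketch using Ollama’s local API; swap in your server’s IP address if it is not running on the same machine:

```bash
# Lists the models installed on the Ollama server; adjust host/port if it runs elsewhere.
curl http://localhost:11434/api/tags
```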
AWS Bedrock Integration
- AWS Account: Create an AWS account
- Enable Bedrock: Enable AWS Bedrock service
- Create IAM User: Create user with Bedrock permissions
- Get Credentials: Generate access keys
- Configure in App:
- Go to Settings → AI Settings → AWS Bedrock
- Enter AWS credentials
- Select region
- Choose foundation model:
- Claude 4.5 Haiku (Default): Fast and efficient model optimized for quick responses (Standard tier)
- Claude Sonnet 4.5: Latest Claude Sonnet with advanced reasoning, coding, and analysis capabilities (Premium tier)
- Llama 4 Maverick 17B Instruct: Meta’s latest Llama 4 model with enhanced reasoning and performance (Economy tier)
- Test the connection
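If the connection test fails, checking the same credentials with the AWS CLI can narrow down whether the problem is the keys, the region, or Bedrock model access. A sketch assuming the AWS CLI is installed and configured with the same access keys:

```bash
# Lists the foundation models visible to your credentials in the chosen region.
# An AccessDenied error points to IAM permissions or model access settings.
aws bedrock list-foundation-models --region us-east-1
```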
Whisper Integration (Local Server)
- Install Whisper Server:
```bash
# Using Docker (recommended)
docker run -d -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest
```
- Configure in App:
- Go to Settings → Transcription Settings → Whisper (Local Server)
- Set server URL (e.g., http://localhost or http://192.168.1.100)
- Set port (default: 9000)
- Select protocol (REST API or Wyoming)
- Select model size (tiny, base, small, medium, large-v3)
- Test the connection
💡 Whisper Protocol Options
- REST API: Traditional HTTP REST API with file uploads
- Wyoming: Modern streaming protocol with WebSocket connection
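Once the container is running, you can confirm the REST endpoint works before pointing the app at it. A minimal sketch assuming the onerahmet/openai-whisper-asr-webservice image from the setup step; sample.m4a is a placeholder file, and the field and query parameter names follow that project’s README:

```bash
# Upload a short audio file and request a JSON transcript from the local server.
curl -F "audio_file=@sample.m4a" \
  "http://localhost:9000/asr?task=transcribe&output=json"
```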
AWS Transcribe Integration
- AWS Account: Create an AWS account
- Enable Transcribe: Enable AWS Transcribe service in your AWS console
- Create IAM User: Create user with Transcribe permissions
- Get Credentials: Generate access keys
- Configure in App:
- Go to Settings → Transcription Settings → AWS Transcribe
- Enter AWS access key ID
- Enter AWS secret access key
- Select region (e.g., us-east-1, eu-west-1)
- Choose language
- Test the connection
💡 AWS Transcribe Tips
- IAM Permissions: Ensure your IAM user has transcribe:StartTranscriptionJob and transcribe:GetTranscriptionJob permissions (a CLI sketch for granting these follows below)
- Regions: Choose a region close to you for better performance
- Cost: AWS Transcribe charges per minute of audio transcribed
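If you manage IAM from the command line, the permissions above can be granted with an inline policy. A sketch assuming the AWS CLI is configured with sufficient rights; the user and policy names are placeholders, and depending on how audio is uploaded your setup may also need S3 permissions (check the AWS Transcribe documentation):

```bash
# Attach a minimal inline policy with the two Transcribe permissions listed above.
aws iam put-user-policy \
  --user-name bisonnotes-transcribe \
  --policy-name TranscribeAccess \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["transcribe:StartTranscriptionJob", "transcribe:GetTranscriptionJob"],
      "Resource": "*"
    }]
  }'
```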
On-Device AI Setup
- Select On-Device AI: Choose “On-Device AI” in the simple settings page
- Download Transcription Model:
- You’ll be prompted to configure On Device transcription
- Go to Settings → Transcription Settings → On Device
- Download either “Higher Quality” (~520MB) or “Faster Processing” (~150MB) model
- See the On Device Transcription Setup section below for a detailed model selection guide
- Download AI Summary Models: After saving, you’ll be taken to the On-Device AI settings page:
- Select one or more AI summary models to download
- Available models:
- Recommended Models (by device RAM):
- 8GB+ RAM: Granite 4.0 H Tiny (4.3 GB) – Recommended for best quality
- 6GB+ RAM: Granite 4.0 Micro (2.1 GB) – Recommended for fast processing
- 6GB+ RAM: Gemma 3n E2B (3.0 GB) – Good quality, smaller size
- 8GB+ RAM: Gemma 3n E4B (4.5 GB) – Best overall quality
- 6GB+ RAM: Ministral 3B (2.1 GB) – Best for tasks and reminders
- Experimental Models (enable in settings):
- 4GB+ RAM: LFM 2.5 1.2B (731 MB) – Fast, minimal summaries (summary only)
- 4GB+ RAM: Qwen3 1.7B (1.1 GB) – Latest Qwen3 model (summary only)
- 8GB+ RAM: Qwen3 4B (2.7 GB) – Excellent detail extraction
- Recommended Models (by device RAM):
- Download progress is shown with speed and time remaining
- Model Selection: Choose which model to use for summaries
- Configuration: Adjust generation settings (temperature, max tokens, etc.)
- Transcription: Uses On Device transcription (requires model download as described above)
💾 Storage Requirements
On-Device AI requires 2-3GB of storage per AI summary model, plus 150-520MB for the On Device transcription model. Make sure you have sufficient free space before downloading models.
📱 Device Requirements
- Transcription: iOS 17.0+, 4GB+ RAM (most modern iPhones and iPads)
- AI Summary: iPhone 15 Pro, iPhone 16 or newer, iOS 18.1+ (requires more processing power)
📝 Transcription Configuration
🔒 On Device Transcription
For complete privacy, use On Device transcription. Your audio is processed entirely on your device and never leaves your iPhone or iPad. See the On Device Transcription Setup section below for complete setup instructions.
Engine Selection
- Go to Settings → Transcription Settings
- Select your preferred transcription engine
- Configure the selected engine (if required)
- Test the connection
Available Transcription Engines
- On Device: High-quality on-device transcription. Your audio never leaves your device, ensuring complete privacy. Requires model download (~150-520MB). Best for privacy-conscious users.
- OpenAI: Cloud-based transcription using OpenAI’s GPT-4o models and Whisper API. Supports multiple models:
- GPT-4o Transcribe: Most robust transcription with GPT-4o model. Supports streaming for real-time transcription.
- GPT-4o Mini Transcribe: Cheapest and fastest transcription with GPT-4o Mini model. Supports streaming. Recommended for most use cases.
- Whisper-1: Legacy transcription with Whisper V2 model. Does not support streaming.
- Mistral AI: Cloud-based transcription using Mistral’s Voxtral Mini model ($0.003/min). Features:
- Speaker Diarization: Optional — identifies and labels different speakers in the audio
- Language Support: Automatic detection or explicit language code for better accuracy
- Supported Formats: MP3, MP4, M4A, WAV, FLAC, OGG, WebM
- Large Files: Automatic chunking for files over 24MB
- Whisper (Local Server): High-quality transcription using OpenAI’s Whisper model on your local server (REST API or Wyoming protocol)
- AWS Transcribe: Cloud-based transcription service with support for long audio files
On Device Transcription Setup
🔒 Privacy-First Transcription
On Device transcription processes your audio entirely on your device. Your audio files never leave your iPhone or iPad, ensuring complete privacy and security.
Initial Setup
- Enable On Device:
- Go to Settings → Transcription Settings
- Select “On Device” as your transcription engine
- Tap “Configure” to open On Device settings
- Download a Model:
- In On Device settings, you’ll see two model options:
- Higher Quality (~520MB): Best accuracy and quality. Takes longer to process but produces more accurate transcriptions.
- Faster Processing (~150MB): Faster transcription with good quality. Ideal for quick transcriptions with slightly lower accuracy.
- Tap the download button next to your preferred model
- Wait for download to complete (progress is shown with speed and time remaining)
- Model download requires internet connection, but transcription works offline after download
- In On Device settings, you’ll see two model options:
- Device Requirements:
- RAM: Requires 4GB+ RAM (most modern iPhones and iPads)
- iOS: iOS 17.0 or later
- Storage: 150-520MB free space depending on model selected
- Test the Model:
- Once downloaded, tap “Test Model Loading” to verify it works
- First load may take 30-60 seconds as the model compiles
- If you see errors, try deleting and re-downloading the model
Model Selection Guide
📋 Which Model Should I Use?
| Use Case | Recommended Model | Why? |
|---|---|---|
| Voice Notes / Journaling | Faster Processing | You are close to the mic; audio is clear. Speed is better. |
| Meeting / Interview | Higher Quality | Handling multiple voices and distance from the mic requires the extra accuracy. |
| Noisy Environment | Higher Quality | Faster Processing may struggle to separate speech from background noise. |
| Long Battery Life Needed | Faster Processing | Higher Quality uses significantly more power per second of audio. |
Using On Device Transcription
- Record Audio: Create a recording as usual
- Generate Transcript:
- Open your recording
- Tap “Generate Transcript”
- On Device transcription will process your audio locally
- Processing Time:
- Faster Processing: Approximately 3x faster than Higher Quality
- Higher Quality: Takes longer but produces more accurate results
- Processing time depends on audio length and device performance
- View Results: Once complete, your transcript appears with full text and timestamps
✅ Benefits of On Device Transcription
- Complete Privacy: Your audio never leaves your device
- Works Offline: No internet required after model download
- No API Costs: Free to use after initial model download
- Fast Processing: Especially with “Faster Processing” model
- High Quality: Excellent accuracy with “Higher Quality” model
⚠️ Important Notes
- Model Download: Requires internet connection and sufficient storage space
- First Load: First transcription after download may take 30-60 seconds as the model compiles
- Device Compatibility: Some older devices may not support On Device transcription
- Battery Usage: Higher Quality model uses more battery than Faster Processing
- Storage: Models take 150-520MB of storage space
Large File Processing
- Automatic Chunking: Files over 5 minutes are automatically split
- Progress Tracking: Real-time progress updates
- Background Processing: Continues when app is minimized
- Timeout Settings: Configurable processing time limits
📊 Working with Summaries
Viewing Summaries
- Tap the “Summaries” tab
- Browse your recordings with AI-generated summaries
- Tap any summary to view details
Summary Features
- Expandable Sections: Tap to expand/collapse sections
- Task Extraction: AI-identified actionable items
- Reminder Detection: Time-sensitive reminders
- Priority Indicators: Color-coded task priorities
- Location Maps: Interactive maps showing recording location
Search and Filtering
BisonNotes AI includes powerful search and filtering capabilities to help you find your recordings, transcripts, and summaries quickly.
Search Functionality
Search is available in three main views:
- Summaries View: Search across summary content, tasks, reminders, titles, and recording names
- Transcripts View: Search through transcript text and recording names
- Recordings View: Search by recording name
How to use:
- Tap the search bar at the top of any view
- Type your search terms
- Results filter in real-time as you type
- Search is case-insensitive and matches partial text
Date Filters
Date range filtering helps you find content from specific time periods:
- Available in: Summaries, Transcripts, and Recordings views
- How to use:
- Tap the filter icon (three horizontal lines with circle) in the navigation bar
- Select a start date and end date
- Tap “Apply” to filter results
- The active filter is shown with a banner at the top of the list
- Tap the X on the banner to clear the filter
💡 Filter Behavior
- Filters can be combined with search for precise results
- Date range includes the full day (00:00:00 to 23:59:59) for both start and end dates
- Filters persist until manually cleared
Editing Recording Metadata
Changing Recording Title
- Open a summary
- Scroll to “Titles” section
- Tap “Edit” next to any title
- Enter new title or select from AI-generated alternatives
- Tap “Use This Title”
Setting Custom Date & Time
- Open a summary
- Scroll to “Recording Date & Time” section
- Tap “Set Custom Date & Time”
- Use date and time pickers
- Tap “Save”
Adding/Editing Location
- Open a summary
- In the location section, tap “Add Location” or “Edit Location”
- Choose from:
- Current Location: Use device GPS
- Map Selection: Pick location on map
- Manual Entry: Enter coordinates manually
- Tap “Save”
🎵 Audio Playback
Basic Playback
- Go to “Recordings” tab
- Tap any recording to play
- Use playback controls:
- Play/Pause: Center button
- Skip 15s: Side buttons
- Scrub: Drag progress bar
Advanced Playback
- Seek Control: Drag the scrubber for precise positioning
- Background Playback: Audio continues when app is minimized
- Audio Session Management: Handles interruptions gracefully
⚙️ Settings & Configuration
Simple Settings vs Advanced Settings
🎯 Understanding the Two Settings Interfaces
BisonNotes AI has two settings interfaces designed for different use cases:
Simple Settings Page
The simple settings page appears on first launch and provides quick setup for the most common configurations:
- Automatic Detection: The page automatically detects your current configuration:
- If both AI engine and transcription are set to “OpenAI” → Shows “OpenAI” option
- If AI engine is “On-Device AI” and transcription is “On Device” → Shows “On-Device AI” option
- Any other configuration → Automatically shows “Advanced & Other Options”
- Preserves Settings: If you manually switch to “Advanced & Other Options”, your current settings are preserved (not reset to blank)
- Quick Access: Tap “Advanced Options” button to access full settings at any time
- Immediate Action:
- Selecting “OpenAI” → Enter API key and save
- Selecting “On-Device AI” → Saves and immediately opens On-Device AI settings to download models
- Selecting “Advanced & Other Options” → Saves and immediately opens advanced settings page
Advanced Settings Page
The advanced settings page provides full control over all configuration options:
- AI Processing: Configure AI summarization engines
- On-Device AI
- OpenAI
- Google AI Studio
- AWS Bedrock
- Mistral AI
- OpenAI API Compatible
- Ollama
- Transcription Engine: Configure transcription engines
- On Device
- OpenAI
- Whisper (Local Server)
- AWS Transcribe
- OpenAI API Compatible
- Microphone Selection: Choose from available microphones (appears below AI and Transcription sections)
- Preferences: Display preferences, time format, etc.
- Advanced Settings: Location services, iCloud sync, background processing
🔄 Switching Between Simple and Advanced
- From Simple to Advanced: Tap “Advanced Options” button in the top right
- From Advanced to Simple: The simple settings page automatically detects your configuration when you return to it
- Automatic Updates: If you change settings in advanced options, the simple settings page will automatically switch to “Advanced & Other Options” when you return
Audio Settings
- Quality: Whisper Optimized (22 kHz, 64 kbps AAC) – Optimized for voice transcription
- Microphone Selection:
- Choose from available microphones (built-in, Bluetooth, USB devices)
- Your selection is saved and used for all recordings
- If selected microphone becomes unavailable, automatically falls back to iOS default
- During recording, if microphone disconnects, recording continues with default microphone
- Mixed Audio: Record without interrupting system audio
- Background Recording: Continue recording when app is minimized
AI Settings
- Engine Selection: Choose your preferred AI engine for summaries:
- On-Device AI: Privacy-focused local AI processing
- OpenAI: Cloud-based AI with GPT models
- Google AI Studio: Gemini AI processing
- AWS Bedrock: Enterprise-grade Claude AI
- Mistral AI: Advanced AI processing with Mistral models
- OpenAI API Compatible: Connect to OpenAI-compatible endpoints (LiteLLM, llama.cpp, Nebius, Groq, etc.)
- Ollama: Local LLM server for privacy-focused processing
- Model Configuration: Adjust settings for selected engine (temperature, max tokens, etc.)
- Connection Testing: Verify API connectivity
- Batch Regeneration: Update all summaries with new engine
Background Processing
- Job Management: View active and completed jobs
- Progress Tracking: Monitor long-running operations
- Error Recovery: Automatic retry and error handling
- Performance Monitoring: Real-time metrics
Data Management
- Migration Tools: Import legacy data
- Database Maintenance: Clear and repair data
- File Relationships: Manage audio, transcript, and summary files
- Debug Tools: Advanced troubleshooting options
🔧 Troubleshooting
Common Issues
Recording Problems
- No Audio: Check microphone permissions
- Poor Quality: Adjust audio quality settings
- Background Recording: Enable in settings
AI Engine Issues
- Connection Failed: Check internet and API keys
- Timeout Errors: Increase timeout settings
- Authentication Errors: Verify API credentials
Transcription Problems
- No Transcription: Check engine configuration
- Poor Quality: Try different engine or model
- Large File Issues: Enable chunking for files over 5 minutes
Data Issues
- Missing Recordings: Use Data Migration tools
- Corrupted Data: Clear and re-import data
- Sync Problems: Check iCloud settings
Performance Optimization
- Battery Life: Use local engines for offline processing
- Memory Usage: Close other apps during large file processing
- Storage: Regularly clean up old recordings
- Network: Use local engines to reduce data usage
📱 Advanced Features
Background Processing
- Job Queue: Multiple operations run in background
- Progress Tracking: Real-time updates for long operations
- Error Recovery: Automatic retry for failed operations
- Stale Job Cleanup: Automatic cleanup of abandoned jobs
File Management
- Import/Export: Support for various audio formats (M4A, MP3, WAV, CAF, AIFF, AIF)
- Share Extension: Import audio directly from Voice Memos, Files, and other apps via the iOS share sheet (see Share Extension section)
- Combining Recordings: Merge two recordings into one continuous file (see Combining Recordings section)
- PDF Export: Professional PDF reports with three-pane header (metadata, local map, regional map), page numbers, and dedicated tasks/reminders sections
- File Relationships: Maintains connections between audio, transcripts, and summaries
- Orphaned File Detection: Identifies and manages disconnected files
- Selective Deletion: Choose what to keep when deleting recordings
Location Intelligence
- GPS Integration: Automatic location capture
- Reverse Geocoding: Converts coordinates to addresses
- Smart Location Search: Advanced search with 3-tier fallback system
- University Database: Built-in mapping for major universities
- Search Retry Logic: Intelligent retry for failed searches
- Interactive Maps: View recording locations
- Manual Location: Add locations after recording
- Performance Optimized: Background processing prevents UI blocking
Data Migration
- Legacy Import: Migrate from old file-based storage
- Data Integrity: Validate and repair data relationships
- Batch Operations: Process multiple files at once
- Progress Tracking: Monitor migration progress
🎯 Best Practices
Recording
- Environment: Record in quiet environments for best quality
- Distance: Keep microphone 6-12 inches from mouth
- Duration: Break long recordings into segments
- Background: Minimize background noise
AI Configuration
- Privacy: Use local engines for sensitive content
- Cost: Start with free engines, upgrade as needed
- Quality: Experiment with different models for best results
- Reliability: Have backup engines configured
Data Management
- Regular Backups: Export important recordings
- Cleanup: Remove old recordings periodically
- Organization: Use descriptive titles for easy finding
- Metadata: Add location and custom dates for context
Performance
- Battery: Use local engines when battery is low
- Storage: Monitor available space
- Network: Use local engines when internet is slow
- Memory: Close other apps during processing
🔗 External Resources
AI Service Documentation
- OpenAI: platform.openai.com/docs
- Google AI: ai.google.dev
- AWS Bedrock: docs.aws.amazon.com/bedrock
- AWS Transcribe: docs.aws.amazon.com/transcribe
Additional Resources
- Mistral AI: docs.mistral.ai
- Ollama: ollama.com
- OpenAI API Compatible:
- LiteLLM: github.com/BerriAI/litellm
- llama.cpp: github.com/ggerganov/llama.cpp – Official repository with installation and server setup instructions
- llama-cpp-python: llama-cpp-python.readthedocs.io – Python bindings with server support
- Nebius: nebius.com
- Groq: groq.com
- Whisper: github.com/ahmetoner/whisper-asr-webservice
- AWS Transcribe: aws.amazon.com/transcribe/
Support
- GitHub Issues: Report bugs and request features
- Documentation: Check the README for technical details
- Community: Join discussions and share tips
🎯 Ready to Get Started?
BisonNotes AI – Transform your spoken words into actionable intelligence with advanced AI processing and comprehensive data management.
Download the app and start recording your first BisonNotes today!
This documentation is regularly updated. For the latest information, check the app’s built-in help or visit our support resources.
