🎙️ BisonNotes AI – Complete User Guide
Welcome to BisonNotes AI! This comprehensive guide will walk you through every aspect of using the app, from basic recording to advanced AI configuration.
🆕 Recent Updates — Version 1.5
- Mistral AI Transcription: New cloud transcription engine using Voxtral Mini with speaker diarization support ($0.003/min)
- Share Extension: Import audio files directly from Voice Memos, Files, and other apps via the iOS share sheet
- Combine Recordings: Merge two separate recordings into a single continuous audio file
- PDF Export Redesign: Professional three-pane header with metadata, local map, and regional map views; pagination with page numbers; dedicated tasks and reminders sections
- On-Device AI Default: On-Device AI is now the default for new installs on supported devices (6GB+ RAM)
- Updated AI Models: Gemini 3 Pro/Flash Preview, Mistral Large 25.12, Claude Sonnet 4.5 (auto-migrated from Sonnet 4)
- Share Extension Wake-Up: Darwin notification integration ensures the main app imports shared files immediately, even when backgrounded
📱 Getting Started
First Launch Setup
- Install the App: Download BisonNotes AI from the App Store
- Simple Settings Welcome Screen: Upon first launch, you’ll see a streamlined setup screen with three main options:
🎯 Initial Setup Options
- OpenAI (Cloud):
- Cloud-based transcription and AI summaries
- Requires OpenAI API key (enter during setup)
- Most powerful and capable option
- Pay-per-use pricing
- Best for: High-quality results, advanced features
- On-Device AI:
- Private, on-device AI processing
- No data leaves your device
- Best for recordings under 60 minutes
- Requires download of AI summary models (2-3GB each) and On Device transcription model (150-520MB)
- May be less accurate than cloud services
- After selecting, you’ll configure On Device transcription and download AI models
- Device requirements:
- Transcription: iOS 17.0+, 4GB+ RAM (most modern devices)
- AI Summary: iPhone 15 Pro or iPhone 16+, iOS 18.1+
- Advanced & Other Options:
- Skip initial setup and configure later
- Access to all available engines:
- OpenAI Compatible – Use LiteLLM, llama.cpp, or similar proxies
- Google AI Studio – Advanced Gemini AI processing
- AWS Bedrock – Enterprise-grade Claude AI
- Mistral AI – Advanced AI processing with Mistral models
- Selecting this option immediately opens the advanced settings page
💡 Tip: The simple settings page automatically detects your current configuration. If you’ve configured something in advanced settings that doesn’t match the simple options, it will automatically show “Advanced & Other Options”.
- Location Permission: The app will ask for location access if you enabled location tracking:
- “Allow While Using App”: Recommended – captures location during recording
- “Don’t Allow”: You can still manually add locations later
- Automatic Migration: On first launch, the app will automatically scan for any existing audio files and migrate them into the database
Your First Recording
- Start Recording: Tap the large microphone button on the main screen
- Microphone Permission: On your first recording, iOS will ask for microphone access:
- Tap “OK”: Required for the app to function
- If denied, you can re-enable it in Settings → Privacy & Security → Microphone
- Recording Status: You’ll see:
- Red recording indicator
- Live timer showing duration
- Location indicator (if location services enabled)
- Stop Recording: Tap the stop button to end recording
- Background Recording: The app continues recording even when minimized or phone is locked
Generate Your First Transcript
- Access Recording: After stopping your recording, you’ll see it in the main recordings list
- Start Transcription:
- Tap on your recording to open the detail view
- Tap the “Generate Transcript” button
- The app will process your audio using your selected AI engine
- Transcription Progress: You’ll see a progress indicator showing:
- Processing status
- Time remaining estimate
- You can continue using the app while it processes in the background
- View Results: Once complete, you’ll see the full transcript with:
- Editable text
- Time stamps (if supported by your AI engine)
- Confidence indicators
Generate Your First Summary
- Prerequisites: You must have a transcript before generating a summary
- Access Summary Options: In the recording detail view, tap “Generate Summary”
- AI Processing: The app will analyze your transcript and create:
- Enhanced Summary: Main content overview
- Action Items: Extracted tasks with priority levels
- Reminders: Time-sensitive items with urgency indicators
- Alternative Titles: AI-generated recording names
- Review Results: The summary view shows:
- Expandable sections for each content type
- Visual priority and confidence indicators
- Interactive maps (if location data available)
- Integration options for tasks and reminders
iCloud Sync Setup
🔄 When Does This Appear? After generating your first successful summary, the app will prompt you about iCloud syncing.
- iCloud Prompt: You’ll see a dialog asking about iCloud sync for summaries
- Choose Your Option:
- “Enable iCloud Sync”: Summaries sync across all your devices
- Requires iCloud to be enabled on your device
- Uses your iCloud storage quota
- Provides backup and cross-device access
- “Keep Local Only”: Summaries stay on this device only
- No cloud storage used
- Better for privacy-sensitive content
- Can be changed later in Settings
- Configuration: If you enable iCloud, the app will automatically configure CloudKit sync
Managing and Deleting Recordings
- Access Recording Options:
- Long press on any recording in the main list
- Or tap the recording and look for the “…” menu
- Deletion Options: When you tap “Delete”, you’ll see comprehensive options:
🗑️ What Gets Deleted?
- Audio File Only: Keeps transcript and summary, removes audio
- Best for: Saving storage while keeping the processed content
- Note: You can’t regenerate transcript or play audio after this
- Everything: Removes audio file, transcript, and summary
- Complete removal from device and iCloud (if syncing)
- Cannot be undone
- Summary Only: Keeps audio and transcript, removes AI-generated summary
- Useful if you want to regenerate summary with different AI engine
- Can regenerate summary anytime
⚠️ Important: Deletion is permanent. Make sure you have backups if needed.
- Confirmation: The app will ask you to confirm the deletion to prevent accidents
- Background Cleanup: After deletion, the app automatically:
- Removes files from device storage
- Updates iCloud sync (if enabled)
- Cleans up any orphaned data
- Updates the recordings list
🎙️ Recording Features
iPhone Action Button Integration
📱 Available On: iPhone 15 Pro, iPhone 15 Pro Max, iPhone 16 Pro, iPhone 16 Pro Max, and future iPhone Pro models with Action Button
BisonNotes AI supports the iPhone Action Button, allowing you to quickly start recording without opening the app first. This is perfect for capturing thoughts, meetings, or voice notes instantly.
How to Configure the Action Button
- Open Settings: On your iPhone, open the Settings app
- Navigate to Action Button: Scroll down and tap “Action Button”
- Select Shortcut: Choose “Shortcut” as the Action Button function
- Choose BisonNotes AI:
- Tap “Choose a Shortcut”
- Search for “Start Recording” or “BisonNotes AI”
- Select “Start Recording” from BisonNotes AI
- Done: Press the Action Button to test – it should launch BisonNotes AI and start recording automatically!
What Happens When You Press the Action Button
- App Opens: BisonNotes AI launches automatically (even if the app was closed)
- Switches to Recordings Tab: The app automatically navigates to the main recording screen
- Recording Starts Immediately: Recording begins automatically without needing to tap the microphone button
- Background Recording: Once started, recording continues even if you switch to another app or lock your phone
💡 Pro Tip: The Action Button works even when your phone is locked! Press the Action Button, and BisonNotes AI will launch and start recording. This makes it perfect for quick voice notes without unlocking your phone.
Location Tracking
- Automatic: GPS location is captured with each recording
- Manual: Add or edit location later in summary view
- Privacy: Location tracking can be disabled in settings
Import Existing Audio
- Tap “Import Audio Files” on the main screen
- Select audio files from your device
- Files are automatically added to your recordings library
Import via Share Extension
📎 Share from Other Apps: You can import audio files directly from Voice Memos, Files, and other apps using the iOS share sheet — no need to export and re-import manually.
- Open the Source App: Open Voice Memos, Files, or any app containing the audio file you want to import
- Tap Share: Use the standard iOS share button and select “BisonNotes AI” from the share sheet
- Automatic Import: The file is saved to a shared container and BisonNotes AI opens automatically to import it
- Background Import: If BisonNotes AI is already running in the background, it will detect the new file immediately via a Darwin notification and import it without you needing to switch apps
📋 Supported File Types
- Audio: M4A, MP3, WAV, CAF, AIFF, AIF
- Documents: TXT, MD, PDF, DOC, DOCX
Combining Recordings
🔗 When to Combine: Use this feature to merge two separate recordings into one continuous file. This is especially useful if your recording was interrupted (e.g., microphone disconnected) and you want to create a single combined recording.
- Access Recordings List:
- Go to the “Recordings” tab
- Tap “Select” button in the top right
- Select Two Recordings:
- Tap the checkbox next to the first recording you want to combine
- Tap the checkbox next to the second recording
- You can only select two recordings at a time
- Combine Recordings:
- Once two recordings are selected, a “Combine” button appears
- Tap “Combine” to open the combination interface
- Choose Recording Order:
📋 Order Selection
- The app automatically recommends which recording should be first based on recording dates
- You’ll see a “Recommended” badge on the suggested first recording
- Tap the “First” recording card to swap the order if needed
- The preview shows the total combined duration
- Important Requirements:
⚠️ Before Combining
Recordings with transcripts or summaries cannot be combined. You must delete any existing transcripts and/or summaries from both recordings before combining them.
- If either recording has a transcript, you’ll see a message explaining which recording has a transcript
- If either recording has a summary, you’ll see a message explaining which recording has a summary
- Delete the transcripts/summaries from both recordings, then try combining again
Why? Transcripts and summaries are tied to specific audio files. When combining recordings, you’ll need to regenerate transcripts and summaries for the new combined file.
- Complete the Combination:
- Review the combined duration preview
- Tap “Combine Recordings” to merge the files
- The app will create a new combined recording file
- The original recordings remain unchanged
- After Combining:
- The new combined recording appears in your recordings list
- You can generate a new transcript for the combined recording
- You can generate a new summary for the combined recording
- The original two recordings remain available if you need them
💡 Tips for Combining Recordings
- Best Use Case: Combining recordings that were split due to microphone disconnection or app interruption
- Order Matters: Make sure the recordings are in the correct chronological order
- Storage: The combined file will be the sum of both original file sizes
- Processing: After combining, you’ll need to generate new transcripts and summaries for the combined recording
🤖 AI Engine Configuration
Overview
BisonNotes AI supports multiple AI engines for transcription and summarization. Each has different capabilities and requirements.
1. On-Device AI
Type: On-device AI processing using local LLM models
Cost: Free
Privacy: 100% local
Internet: Required only for initial model download
Requirements:
- Transcription: iOS 17.0+, 4GB+ RAM (most modern iPhones and iPads)
- AI Summary: iPhone 15 Pro, iPhone 16 or newer, iOS 18.1+
- Storage: 2-3GB for AI models, 150-520MB for transcription model
Setup:
- Select “On-Device AI” in simple settings
- Download On Device transcription model (Higher Quality or Faster Processing)
- Download one or more AI summary models
- Uses On Device transcription (requires model download)
- Uses downloaded LLM models for AI summaries
Available Models:
- Recommended Models (by device RAM):
- 8GB+ RAM: Granite 4.0 H Tiny (4.3 GB) – Recommended for best quality
- 6GB+ RAM: Granite 4.0 Micro (2.1 GB) – Recommended for fast processing
- 6GB+ RAM: Gemma 3n E2B (3.0 GB) – Good quality, smaller size
- 8GB+ RAM: Gemma 3n E4B (4.5 GB) – Best overall quality
- 6GB+ RAM: Ministral 3B (2.1 GB) – Best for tasks and reminders
- Experimental Models (enable in settings):
- 4GB+ RAM: LFM 2.5 1.2B (731 MB) – Fast, minimal summaries (summary only)
- 4GB+ RAM: Qwen3 1.7B (1.1 GB) – Latest Qwen3 model (summary only)
- 8GB+ RAM: Qwen3 4B (2.7 GB) – Excellent detail extraction
Best for: Privacy-conscious users, offline use, recordings under 60 minutes
Limitations:
- Best for recordings under 60 minutes
- Requires 2-3GB storage per model
- May be less accurate than cloud services
2. OpenAI Integration
Type: Cloud-based AI
Cost: Pay-per-use (very affordable)
Privacy: Data sent to OpenAI
Internet: Required
Best for: High-quality results, advanced features
3. Google AI Studio
Type: Cloud-based AI
Cost: Free tier available, then pay-per-use
Privacy: Data sent to Google
Internet: Required
Available Models:
- Gemini 2.5 Flash (Default): Fast and efficient for most tasks
- Gemini 2.5 Flash Lite: Lightweight variant for quick processing
- Gemini 3 Pro Preview: Advanced reasoning and analysis (Preview)
- Gemini 3 Flash Preview: Fast next-generation model (Preview)
Best for: Balanced performance and cost, with free tier for getting started
4. OpenAI API Compatible
Type: OpenAI-compatible API endpoint
Cost: Varies by provider
Privacy: Depends on provider
Internet: Required (unless local server)
Supported Providers: This single engine option works with multiple providers:
- LiteLLM: Open-source proxy for multiple providers
- llama.cpp: High-performance local LLM inference server
- Nebius: Cloud provider with OpenAI-compatible API
- Groq: Fast inference with OpenAI-compatible API
- Custom Servers: Any OpenAI-compatible endpoint
Setup:
- Go to Settings → AI Settings → OpenAI API Compatible
- Enter your API key (from your chosen provider, or use “no-key” for local servers)
- Set base URL to your provider’s endpoint:
- Groq: https://api.groq.com/v1
- Nebius: Your Nebius endpoint URL
- LiteLLM: Your LiteLLM server URL
- llama.cpp: Your llama.cpp server URL (default: http://localhost:8080)
- Select model (use your provider’s model name)
- Test the connection
Best for: Using LiteLLM, llama.cpp, Nebius, Groq, or other OpenAI-compatible services
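Before entering an endpoint in the app, it can help to confirm from a terminal that it answers an OpenAI-style chat completion request. Here is a minimal sketch assuming a local llama.cpp server on port 8080; the base URL, API key, and model name are placeholders to swap for your provider’s values:

```bash
# Swap the base URL, API key, and model name for your provider's values.
# For a local llama.cpp server, "no-key" (or any placeholder) is fine.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
        "model": "your-model-name",
        "messages": [{"role": "user", "content": "Reply with OK if you can read this."}]
      }'
```

If this returns a JSON response containing a message, the same base URL and key should work in the app.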
5. Mistral AI
Type: Cloud-based AI processing and transcription
Cost: Pay-per-use (summarization varies by model; transcription $0.003/min)
Privacy: Data sent to Mistral AI
Internet: Required
Summarization Models:
- Mistral Large (25.12): Most capable model, 128K context window (Premium tier)
- Mistral Medium (25.08): Balanced performance and cost, 128K context (Standard tier)
- Magistral Medium (25.09): Economy option, 40K context window (Economy tier)
Transcription:
- Voxtral Mini Transcribe: Speech-to-text at $0.003/min with optional speaker diarization
- Supports MP3, MP4, M4A, WAV, FLAC, OGG, WebM
- Automatic language detection or explicit language code
Setup:
- Go to Settings → AI Settings → Mistral AI
- Enter your Mistral API key
- Select summarization model (Large, Medium, or Magistral)
- For transcription: go to Settings → Transcription Settings and select “Mistral AI”
- Test the connection
Best for: Fast, high-quality summaries and affordable cloud transcription with speaker diarization
6. AWS Bedrock
Type: Cloud-based AI
Cost: Pay-per-use
Privacy: Data sent to AWS
Internet: Required
Best for: Enterprise features
7. Ollama
Type: Local LLM server
Cost: Free (requires your own server)
Privacy: 100% local
Internet: Not required for processing
Setup:
- Install Ollama server on your local machine or network
- Go to Settings → AI Settings → Ollama
- Set server URL (default: http://localhost)
- Set port (default: 11434)
- Select model (llama2:7b, qwen3:30b, etc.)
- Test the connection
Best for: Privacy, customizable models, offline use
Setup Instructions for Each Engine
OpenAI Integration
- Get API Key: Visit platform.openai.com
- Create Account: Sign up for an OpenAI account
- Generate API Key: Go to API Keys section and create a new key
- Configure in App:
- Go to Settings → AI Settings → OpenAI
- Enter your API key
- Select your preferred model
- Test the connection
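If the in-app connection test fails, it can help to verify the key itself from a terminal. A minimal sketch using OpenAI’s model-listing endpoint (assumes your key is exported as OPENAI_API_KEY):

```bash
# Lists the models your key can access; a 401 response means the key is invalid.
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```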
Available OpenAI Models
| Model | Type | Best For | Tier |
|---|---|---|---|
| GPT-4.1 | Summarization | Most robust and comprehensive analysis with advanced reasoning capabilities | Premium |
| GPT-4.1 Mini | Summarization | Balanced performance and cost, suitable for most summarization tasks (Default) | Standard |
| GPT-4.1 Nano | Summarization | Fastest and most economical for basic summarization needs | Economy |
| GPT-5 Mini | Summarization | Next-generation model with enhanced reasoning and efficiency | Premium |
OpenAI Transcription Models
| Model | Type | Best For |
|---|---|---|
| GPT-4o Transcribe | Transcription | Most robust transcription with GPT-4o model. Supports streaming. |
| GPT-4o Mini Transcribe | Transcription | Cheapest and fastest transcription with GPT-4o Mini model. Supports streaming. Recommended for most use cases. |
| Whisper-1 | Transcription | Legacy transcription with Whisper V2 model. Does not support streaming. |
Google AI Studio Integration
- Get API Key: Visit aistudio.google.com
- Create Account: Sign up for Google AI Studio
- Generate API Key: Create a new API key
- Configure in App:
- Go to Settings → AI Settings → Google AI Studio
- Enter your API key
- Select model:
- Gemini 2.5 Flash (Default): Fast and efficient for most tasks
- Gemini 2.5 Flash Lite: Lightweight variant for quick processing
- Gemini 3 Pro Preview: Advanced reasoning and analysis (Preview)
- Gemini 3 Flash Preview: Fast next-generation model (Preview)
- Test the connection
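To check a Google AI Studio key outside the app, you can query the Gemini API’s model-listing endpoint directly. A minimal sketch; the v1beta path reflects the current public API, and the exact model names returned may differ from the labels above:

```bash
# Lists the Gemini models available to this key (replace YOUR_API_KEY).
curl "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY"
```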
OpenAI API Compatible Integration
📌 Important Note
OpenAI API Compatible is a single engine option that works with multiple providers. You don’t select Nebius, Groq, LiteLLM, or vLLM as separate engines – instead, you configure the “OpenAI API Compatible” option with your chosen provider’s settings.
- Choose Your Provider: Select one of these OpenAI-compatible services:
- LiteLLM: Open-source proxy for multiple providers
- Base URL: Your LiteLLM server URL (e.g., http://localhost:4000/v1)
- Get API key from your LiteLLM configuration
- Documentation: github.com/BerriAI/litellm
- llama.cpp: High-performance local LLM inference server
- Base URL: Your llama.cpp server URL (default: http://localhost:8080)
- API key: Use “no-key” or leave empty for local servers
- Installation: Clone from github.com/ggerganov/llama.cpp, build with make, then run ./server --model <model_file>
- Documentation: See llama.cpp README for full setup instructions
- Alternative: Use llama-cpp-python with pip install 'llama-cpp-python[server]' and run python3 -m llama_cpp.server --model /path/to/model.gguf
- Nebius: Cloud provider with OpenAI-compatible API
- Base URL: Your Nebius endpoint URL
- Get API key from Nebius console
- Groq: Fast inference with OpenAI-compatible API
- Base URL: https://api.groq.com/v1
- Get API key from console.groq.com
- Custom Server: Your own OpenAI-compatible endpoint
- Base URL: Your server’s endpoint URL
- API key: As configured on your server
- Get API Key: Obtain an API key from your chosen provider (if required)
- Configure in App:
- Go to Settings → AI Settings → OpenAI API Compatible
- Enter your API key (from your chosen provider)
- Set base URL to your provider’s endpoint (see examples above)
- Select model (use your provider’s exact model name, e.g., llama-3.1-70b-versatile for Groq)
- Configure temperature and max tokens
- Test the connection
💡 OpenAI Compatible Tips
- Single Engine Option: All providers (Nebius, Groq, LiteLLM, llama.cpp) use the same “OpenAI API Compatible” engine option – just change the base URL and API key
- Base URL Examples:
- Groq: https://api.groq.com/v1
- Local llama.cpp: http://localhost:8080 (default port)
- Local LiteLLM: http://localhost:4000/v1
- Nebius: Your Nebius endpoint URL
- Model Names: Use the exact model name your provider supports (e.g., Groq uses names like llama-3.1-70b-versatile, not gpt-4o)
- Local Servers: For local servers (llama.cpp, LiteLLM), ensure your device can reach the server IP address; a quick reachability check is shown after these tips
- llama.cpp Setup:
- Native server: Clone llama.cpp, build with make, run ./server --model <model.gguf>
- Python server: Install with pip install 'llama-cpp-python[server]', run python3 -m llama_cpp.server --model /path/to/model.gguf
- Default port is 8080; API key can be “no-key” or empty
- Transcription: OpenAI API Compatible is also available for transcription in Settings → Transcription Settings
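A common stumbling block with local servers is that they listen only on localhost, so your phone can never reach them. A minimal sketch assuming the llama.cpp server from the steps above and a hypothetical LAN address of 192.168.1.50; adjust the binary name, model path, IP, and port for your setup:

```bash
# Bind the llama.cpp server to all interfaces so other devices on the
# network can reach it (8080 is the default port).
./server --host 0.0.0.0 --port 8080 --model ./models/your-model.gguf

# From another machine, confirm the OpenAI-compatible endpoint is reachable.
# 192.168.1.50 is a placeholder for your server's LAN IP address.
curl http://192.168.1.50:8080/v1/models
```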
Mistral AI Integration
- Get API Key: Visit console.mistral.ai
- Create Account: Sign up for a Mistral AI account
- Generate API Key: Create a new API key in your account settings
- Configure Summarization:
- Go to Settings → AI Settings → Mistral AI
- Enter your API key
- Select summarization model:
- Mistral Large (25.12): Most capable, 128K context (Premium)
- Mistral Medium (25.08): Balanced performance, 128K context (Standard)
- Magistral Medium (25.09): Economy option, 40K context
- Configure temperature and other parameters
- Test the connection
- Configure Transcription (Optional):
- Go to Settings → Transcription Settings
- Select “Mistral AI” as your transcription engine
- Tap “Configure” to open Mistral transcription settings
- The same API key is shared between summarization and transcription
- Enable speaker diarization if you want speaker labels
- Optionally set a language code for better accuracy
💡 Mistral Transcription Tips
- Cost: Voxtral Mini transcription costs $0.003 per minute of audio
- Diarization: When enabled, the transcription identifies and labels different speakers
- Language: Leave empty for auto-detection, or specify a code (e.g., “en”, “fr”, “es”) for better accuracy
- Supported Formats: MP3, MP4, M4A, WAV, FLAC, OGG, WebM
- Large Files: Files over 24MB are automatically chunked for processing
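If transcription or summarization requests fail, a quick way to rule out the API key is to hit Mistral’s model-listing endpoint from a terminal. A minimal sketch assuming your key is exported as MISTRAL_API_KEY; see docs.mistral.ai for the transcription-specific endpoints and diarization parameters:

```bash
# Lists the Mistral models available to your key; a 401 means the key is invalid.
curl https://api.mistral.ai/v1/models \
  -H "Authorization: Bearer $MISTRAL_API_KEY"
```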
Ollama Integration
- Install Ollama: Visit ollama.com and install Ollama on your local machine or server
- Download Recommended Models:
```bash
ollama pull qwen3:30b
ollama pull gpt-oss:20b
ollama pull mistral-small3.2
```
- Configure in App:
- Go to Settings → AI Settings → Ollama
- Set server URL (default: http://localhost for local, or your server’s IP address)
- Set port (default: 11434)
- Run a Model Scan: Tap the refresh button to scan your Ollama server for available models
- Select your preferred model:
- Recommended: qwen3:30b, gpt-oss:20b, mistral-small3.2
- Available models will be fetched from your Ollama server
- Configure temperature and max tokens
- Test the connection
💡 Ollama Tips
- Local Server: Use http://localhost:11434 if Ollama is running on the same device
- Network Server: Use your server’s IP address (e.g., http://192.168.1.100:11434)
- Model Scan: Always run a model scan after connecting to fetch the list of models installed on your Ollama server (a quick check from a terminal is shown after these tips)
- Performance: Larger models provide better results but require more RAM and processing time
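If the in-app model scan comes back empty, check what the Ollama server itself reports. A minimal sketch using Ollama’s local API; swap in your server’s IP address if it is not running on the same machine:

```bash
# Lists the models installed on the Ollama server; adjust host/port if it runs elsewhere.
curl http://localhost:11434/api/tags
```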
AWS Bedrock Integration
- AWS Account: Create an AWS account
- Enable Bedrock: Enable AWS Bedrock service
- Create IAM User: Create user with Bedrock permissions
- Get Credentials: Generate access keys
- Configure in App:
- Go to Settings → AI Settings → AWS Bedrock
- Enter AWS credentials
- Select region
- Choose foundation model:
- Claude 4.5 Haiku (Default): Fast and efficient model optimized for quick responses (Standard tier)
- Claude Sonnet 4.5: Latest Claude Sonnet with advanced reasoning, coding, and analysis capabilities (Premium tier)
- Llama 4 Maverick 17B Instruct: Meta’s latest Llama 4 model with enhanced reasoning and performance (Economy tier)
- Test the connection
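If the connection test fails, checking the same credentials with the AWS CLI can narrow down whether the problem is the keys, the region, or Bedrock model access. A sketch assuming the AWS CLI is installed and configured with the same access keys:

```bash
# Lists the foundation models visible to your credentials in the chosen region.
# An AccessDenied error points to IAM permissions or model access settings.
aws bedrock list-foundation-models --region us-east-1
```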
Whisper Integration (Local Server)
- Install Whisper Server:
```bash
# Using Docker (recommended)
docker run -d -p 9000:9000 \
  -e ASR_MODEL=base \
  -e ASR_ENGINE=openai_whisper \
  onerahmet/openai-whisper-asr-webservice:latest
```
- Configure in App:
- Go to Settings → Transcription Settings → Whisper (Local Server)
- Set server URL (e.g., http://localhost or http://192.168.1.100)
- Set port (default: 9000)
- Select protocol (REST API or Wyoming)
- Select model size (tiny, base, small, medium, large-v3)
- Test the connection
💡 Whisper Protocol Options
- REST API: Traditional HTTP REST API with file uploads
- Wyoming: Modern streaming protocol with WebSocket connection
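Once the container is running, you can confirm the REST endpoint works before pointing the app at it. A minimal sketch assuming the onerahmet/openai-whisper-asr-webservice image from the setup step; sample.m4a is a placeholder file, and the field and query parameter names follow that project’s README:

```bash
# Upload a short audio file and request a JSON transcript from the local server.
curl -F "audio_file=@sample.m4a" \
  "http://localhost:9000/asr?task=transcribe&output=json"
```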
AWS Transcribe Integration
- AWS Account: Create an AWS account
- Enable Transcribe: Enable AWS Transcribe service in your AWS console
- Create IAM User: Create user with Transcribe permissions
- Get Credentials: Generate access keys
- Configure in App:
- Go to Settings → Transcription Settings → AWS Transcribe
- Enter AWS access key ID
- Enter AWS secret access key
- Select region (e.g., us-east-1, eu-west-1)
- Choose language
- Test the connection
💡 AWS Transcribe Tips
- IAM Permissions: Ensure your IAM user has transcribe:StartTranscriptionJob and transcribe:GetTranscriptionJob permissions (a CLI sketch for granting these follows below)
- Regions: Choose a region close to you for better performance
- Cost: AWS Transcribe charges per minute of audio transcribed
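If you manage IAM from the command line, the permissions above can be granted with an inline policy. A sketch assuming the AWS CLI is configured with sufficient rights; the user and policy names are placeholders, and depending on how audio is uploaded your setup may also need S3 permissions (check the AWS Transcribe documentation):

```bash
# Attach a minimal inline policy with the two Transcribe permissions listed above.
aws iam put-user-policy \
  --user-name bisonnotes-transcribe \
  --policy-name TranscribeAccess \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": ["transcribe:StartTranscriptionJob", "transcribe:GetTranscriptionJob"],
      "Resource": "*"
    }]
  }'
```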
On-Device AI Setup
- Select On-Device AI: Choose “On-Device AI” in the simple settings page
- Download Transcription Model:
- You’ll be prompted to configure On Device transcription
- Go to Settings → Transcription Settings → On Device
- Download either “Higher Quality” (~520MB) or “Faster Processing” (~150MB) model
- See the On Device Transcription Setup section below for a detailed model selection guide
- Download AI Summary Models: After saving, you’ll be taken to the On-Device AI settings page:
- Select one or more AI summary models to download
- Available models:
- Recommended Models (by device RAM):
- 8GB+ RAM: Granite 4.0 H Tiny (4.3 GB) – Recommended for best quality
- 6GB+ RAM: Granite 4.0 Micro (2.1 GB) – Recommended for fast processing
- 6GB+ RAM: Gemma 3n E2B (3.0 GB) – Good quality, smaller size
- 8GB+ RAM: Gemma 3n E4B (4.5 GB) – Best overall quality
- 6GB+ RAM: Ministral 3B (2.1 GB) – Best for tasks and reminders
- Experimental Models (enable in settings):
- 4GB+ RAM: LFM 2.5 1.2B (731 MB) – Fast, minimal summaries (summary only)
- 4GB+ RAM: Qwen3 1.7B (1.1 GB) – Latest Qwen3 model (summary only)
- 8GB+ RAM: Qwen3 4B (2.7 GB) – Excellent detail extraction
- Recommended Models (by device RAM):
- Download progress is shown with speed and time remaining
- Model Selection: Choose which model to use for summaries
- Configuration: Adjust generation settings (temperature, max tokens, etc.)
- Transcription: Uses On Device transcription (requires model download as described above)
💾 Storage Requirements
On-Device AI requires 2-3GB of storage per AI summary model, plus 150-520MB for the On Device transcription model. Make sure you have sufficient free space before downloading models.
📱 Device Requirements
- Transcription: iOS 17.0+, 4GB+ RAM (most modern iPhones and iPads)
- AI Summary: iPhone 15 Pro, iPhone 16 or newer, iOS 18.1+ (requires more processing power)
📝 Transcription Configuration
🔒 On Device Transcription
For complete privacy, use On Device transcription. Your audio is processed entirely on your device and never leaves your iPhone or iPad. See the On Device Transcription Setup section below for complete setup instructions.
Engine Selection
- Go to Settings → Transcription Settings
- Select your preferred transcription engine
- Configure the selected engine (if required)
- Test the connection
Available Transcription Engines
- On Device: High-quality on-device transcription. Your audio never leaves your device, ensuring complete privacy. Requires model download (~150-520MB). Best for privacy-conscious users.
- OpenAI: Cloud-based transcription using OpenAI’s GPT-4o models and Whisper API. Supports multiple models:
- GPT-4o Transcribe: Most robust transcription with GPT-4o model. Supports streaming for real-time transcription.
- GPT-4o Mini Transcribe: Cheapest and fastest transcription with GPT-4o Mini model. Supports streaming. Recommended for most use cases.
- Whisper-1: Legacy transcription with Whisper V2 model. Does not support streaming.
- Mistral AI: Cloud-based transcription using Mistral’s Voxtral Mini model ($0.003/min). Features:
- Speaker Diarization: Optional — identifies and labels different speakers in the audio
- Language Support: Automatic detection or explicit language code for better accuracy
- Supported Formats: MP3, MP4, M4A, WAV, FLAC, OGG, WebM
- Large Files: Automatic chunking for files over 24MB
- Whisper (Local Server): High-quality transcription using OpenAI’s Whisper model on your local server (REST API or Wyoming protocol)
- AWS Transcribe: Cloud-based transcription service with support for long audio files
On Device Transcription Setup
🔒 Privacy-First Transcription
On Device transcription processes your audio entirely on your device. Your audio files never leave your iPhone or iPad, ensuring complete privacy and security.
Initial Setup
- Enable On Device:
- Go to Settings → Transcription Settings
- Select “On Device” as your transcription engine
- Tap “Configure” to open On Device settings
- Download a Model:
- In On Device settings, you’ll see two model options:
- Higher Quality (~520MB): Best accuracy and quality. Takes longer to process but produces more accurate transcriptions.
- Faster Processing (~150MB): Faster transcription with good quality. Ideal for quick transcriptions with slightly lower accuracy.
- Tap the download button next to your preferred model
- Wait for download to complete (progress is shown with speed and time remaining)
- Model download requires internet connection, but transcription works offline after download
- In On Device settings, you’ll see two model options:
- Device Requirements:
- RAM: Requires 4GB+ RAM (most modern iPhones and iPads)
- iOS: iOS 17.0 or later
- Storage: 150-520MB free space depending on model selected
- Test the Model:
- Once downloaded, tap “Test Model Loading” to verify it works
- First load may take 30-60 seconds as the model compiles
- If you see errors, try deleting and re-downloading the model
Model Selection Guide
📋 Which Model Should I Use?
| Use Case | Recommended Model | Why? |
|---|---|---|
| Voice Notes / Journaling | Faster Processing | You are close to the mic; audio is clear. Speed is better. |
| Meeting / Interview | Higher Quality | Handling multiple voices and distance from the mic requires the extra accuracy. |
| Noisy Environment | Higher Quality | Faster Processing may struggle to separate speech from background noise. |
| Long Battery Life Needed | Faster Processing | Higher Quality uses significantly more power per second of audio. |
Using On Device Transcription
- Record Audio: Create a recording as usual
- Generate Transcript:
- Open your recording
- Tap “Generate Transcript”
- On Device transcription will process your audio locally
- Processing Time:
- Faster Processing: Approximately 3x faster than Higher Quality
- Higher Quality: Takes longer but produces more accurate results
- Processing time depends on audio length and device performance
- View Results: Once complete, your transcript appears with full text and timestamps
✅ Benefits of On Device Transcription
- Complete Privacy: Your audio never leaves your device
- Works Offline: No internet required after model download
- No API Costs: Free to use after initial model download
- Fast Processing: Especially with “Faster Processing” model
- High Quality: Excellent accuracy with “Higher Quality” model
⚠️ Important Notes
- Model Download: Requires internet connection and sufficient storage space
- First Load: First transcription after download may take 30-60 seconds as the model compiles
- Device Compatibility: Some older devices may not support On Device transcription
- Battery Usage: Higher Quality model uses more battery than Faster Processing
- Storage: Models take 150-520MB of storage space
Large File Processing
- Automatic Chunking: Files over 5 minutes are automatically split
- Progress Tracking: Real-time progress updates
- Background Processing: Continues when app is minimized
- Timeout Settings: Configurable processing time limits
📊 Working with Summaries
Viewing Summaries
- Tap the “Summaries” tab
- Browse your recordings with AI-generated summaries
- Tap any summary to view details
Summary Features
- Expandable Sections: Tap to expand/collapse sections
- Task Extraction: AI-identified actionable items
- Reminder Detection: Time-sensitive reminders
- Priority Indicators: Color-coded task priorities
- Location Maps: Interactive maps showing recording location
Search and Filtering
BisonNotes AI includes powerful search and filtering capabilities to help you find your recordings, transcripts, and summaries quickly.
Search Functionality
Search is available in three main views:
- Summaries View: Search across summary content, tasks, reminders, titles, and recording names
- Transcripts View: Search through transcript text and recording names
- Recordings View: Search by recording name
How to use:
- Tap the search bar at the top of any view
- Type your search terms
- Results filter in real-time as you type
- Search is case-insensitive and matches partial text
Date Filters
Date range filtering helps you find content from specific time periods:
- Available in: Summaries, Transcripts, and Recordings views
- How to use:
- Tap the filter icon (three horizontal lines with circle) in the navigation bar
- Select a start date and end date
- Tap “Apply” to filter results
- The active filter is shown with a banner at the top of the list
- Tap the X on the banner to clear the filter
💡 Filter Behavior
- Filters can be combined with search for precise results
- Date range includes the full day (00:00:00 to 23:59:59) for both start and end dates
- Filters persist until manually cleared
Editing Recording Metadata
Changing Recording Title
- Open a summary
- Scroll to “Titles” section
- Tap “Edit” next to any title
- Enter new title or select from AI-generated alternatives
- Tap “Use This Title”
Setting Custom Date & Time
- Open a summary
- Scroll to “Recording Date & Time” section
- Tap “Set Custom Date & Time”
- Use date and time pickers
- Tap “Save”
Adding/Editing Location
- Open a summary
- In the location section, tap “Add Location” or “Edit Location”
- Choose from:
- Current Location: Use device GPS
- Map Selection: Pick location on map
- Manual Entry: Enter coordinates manually
- Tap “Save”
🎵 Audio Playback
Basic Playback
- Go to “Recordings” tab
- Tap any recording to play
- Use playback controls:
- Play/Pause: Center button
- Skip 15s: Side buttons
- Scrub: Drag progress bar
Advanced Playback
- Seek Control: Drag the scrubber for precise positioning
- Background Playback: Audio continues when app is minimized
- Audio Session Management: Handles interruptions gracefully
⚙️ Settings & Configuration
Simple Settings vs Advanced Settings
🎯 Understanding the Two Settings Interfaces
BisonNotes AI has two settings interfaces designed for different use cases:
Simple Settings Page
The simple settings page appears on first launch and provides quick setup for the most common configurations:
- Automatic Detection: The page automatically detects your current configuration:
- If both AI engine and transcription are set to “OpenAI” → Shows “OpenAI” option
- If AI engine is “On-Device AI” and transcription is “On Device” → Shows “On-Device AI” option
- Any other configuration → Automatically shows “Advanced & Other Options”
- Preserves Settings: If you manually switch to “Advanced & Other Options”, your current settings are preserved (not reset to blank)
- Quick Access: Tap “Advanced Options” button to access full settings at any time
- Immediate Action:
- Selecting “OpenAI” → Enter API key and save
- Selecting “On-Device AI” → Saves and immediately opens On-Device AI settings to download models
- Selecting “Advanced & Other Options” → Saves and immediately opens advanced settings page
Advanced Settings Page
The advanced settings page provides full control over all configuration options:
- AI Processing: Configure AI summarization engines
- On-Device AI
- OpenAI
- Google AI Studio
- AWS Bedrock
- Mistral AI
- OpenAI API Compatible
- Ollama
- Transcription Engine: Configure transcription engines
- On Device
- OpenAI
- Whisper (Local Server)
- AWS Transcribe
- OpenAI API Compatible
- Microphone Selection: Choose from available microphones (appears below AI and Transcription sections)
- Preferences: Display preferences, time format, etc.
- Advanced Settings: Location services, iCloud sync, background processing
🔄 Switching Between Simple and Advanced
- From Simple to Advanced: Tap “Advanced Options” button in the top right
- From Advanced to Simple: The simple settings page automatically detects your configuration when you return to it
- Automatic Updates: If you change settings in advanced options, the simple settings page will automatically switch to “Advanced & Other Options” when you return
Audio Settings
- Quality: Whisper Optimized (22 kHz, 64 kbps AAC) – Optimized for voice transcription
- Microphone Selection:
- Choose from available microphones (built-in, Bluetooth, USB devices)
- Your selection is saved and used for all recordings
- If selected microphone becomes unavailable, automatically falls back to iOS default
- During recording, if microphone disconnects, recording continues with default microphone
- Mixed Audio: Record without interrupting system audio
- Background Recording: Continue recording when app is minimized
AI Settings
- Engine Selection: Choose your preferred AI engine for summaries:
- On-Device AI: Privacy-focused local AI processing
- OpenAI: Cloud-based AI with GPT models
- Google AI Studio: Gemini AI processing
- AWS Bedrock: Enterprise-grade Claude AI
- Mistral AI: Advanced AI processing with Mistral models
- OpenAI API Compatible: Connect to OpenAI-compatible endpoints (LiteLLM, llama.cpp, Nebius, Groq, etc.)
- Ollama: Local LLM server for privacy-focused processing
- Model Configuration: Adjust settings for selected engine (temperature, max tokens, etc.)
- Connection Testing: Verify API connectivity
- Batch Regeneration: Update all summaries with new engine
Background Processing
- Job Management: View active and completed jobs
- Progress Tracking: Monitor long-running operations
- Error Recovery: Automatic retry and error handling
- Performance Monitoring: Real-time metrics
Data Management
- Migration Tools: Import legacy data
- Database Maintenance: Clear and repair data
- File Relationships: Manage audio, transcript, and summary files
- Debug Tools: Advanced troubleshooting options
🔧 Troubleshooting
Common Issues
Recording Problems
- No Audio: Check microphone permissions
- Poor Quality: Adjust audio quality settings
- Background Recording: Enable in settings
AI Engine Issues
- Connection Failed: Check internet and API keys
- Timeout Errors: Increase timeout settings
- Authentication Errors: Verify API credentials
Transcription Problems
- No Transcription: Check engine configuration
- Poor Quality: Try different engine or model
- Large File Issues: Enable chunking for files over 5 minutes
Data Issues
- Missing Recordings: Use Data Migration tools
- Corrupted Data: Clear and re-import data
- Sync Problems: Check iCloud settings
Performance Optimization
- Battery Life: Use local engines for offline processing
- Memory Usage: Close other apps during large file processing
- Storage: Regularly clean up old recordings
- Network: Use local engines to reduce data usage
📱 Advanced Features
Background Processing
- Job Queue: Multiple operations run in background
- Progress Tracking: Real-time updates for long operations
- Error Recovery: Automatic retry for failed operations
- Stale Job Cleanup: Automatic cleanup of abandoned jobs
File Management
- Import/Export: Support for various audio formats (M4A, MP3, WAV, CAF, AIFF, AIF)
- Share Extension: Import audio directly from Voice Memos, Files, and other apps via the iOS share sheet (see Share Extension section)
- Combining Recordings: Merge two recordings into one continuous file (see Combining Recordings section)
- PDF Export: Professional PDF reports with three-pane header (metadata, local map, regional map), page numbers, and dedicated tasks/reminders sections
- File Relationships: Maintains connections between audio, transcripts, and summaries
- Orphaned File Detection: Identifies and manages disconnected files
- Selective Deletion: Choose what to keep when deleting recordings
Location Intelligence
- GPS Integration: Automatic location capture
- Reverse Geocoding: Converts coordinates to addresses
- Smart Location Search: Advanced search with 3-tier fallback system
- University Database: Built-in mapping for major universities
- Search Retry Logic: Intelligent retry for failed searches
- Interactive Maps: View recording locations
- Manual Location: Add locations after recording
- Performance Optimized: Background processing prevents UI blocking
Data Migration
- Legacy Import: Migrate from old file-based storage
- Data Integrity: Validate and repair data relationships
- Batch Operations: Process multiple files at once
- Progress Tracking: Monitor migration progress
🎯 Best Practices
Recording
- Environment: Record in quiet environments for best quality
- Distance: Keep microphone 6-12 inches from mouth
- Duration: Break long recordings into segments
- Background: Minimize background noise
AI Configuration
- Privacy: Use local engines for sensitive content
- Cost: Start with free engines, upgrade as needed
- Quality: Experiment with different models for best results
- Reliability: Have backup engines configured
Data Management
- Regular Backups: Export important recordings
- Cleanup: Remove old recordings periodically
- Organization: Use descriptive titles for easy finding
- Metadata: Add location and custom dates for context
Performance
- Battery: Use local engines when battery is low
- Storage: Monitor available space
- Network: Use local engines when internet is slow
- Memory: Close other apps during processing
🔗 External Resources
AI Service Documentation
- OpenAI: platform.openai.com/docs
- Google AI: ai.google.dev
- AWS Bedrock: docs.aws.amazon.com/bedrock
- AWS Transcribe: docs.aws.amazon.com/transcribe
Additional Resources
- Mistral AI: docs.mistral.ai
- Ollama: ollama.com
- OpenAI API Compatible:
- LiteLLM: github.com/BerriAI/litellm
- llama.cpp: github.com/ggerganov/llama.cpp – Official repository with installation and server setup instructions
- llama-cpp-python: llama-cpp-python.readthedocs.io – Python bindings with server support
- Nebius: nebius.com
- Groq: groq.com
- Whisper: github.com/ahmetoner/whisper-asr-webservice
- AWS Transcribe: aws.amazon.com/transcribe/
Support
- GitHub Issues: Report bugs and request features
- Documentation: Check the README for technical details
- Community: Join discussions and share tips
🎯 Ready to Get Started?
BisonNotes AI – Transform your spoken words into actionable intelligence with advanced AI processing and comprehensive data management.
Download the app and start recording your first BisonNotes today!
This documentation is regularly updated. For the latest information, check the app’s built-in help or visit our support resources.
