mirror of
https://github.com/Mintplex-Labs/anything-llm.git
synced 2026-03-02 22:57:05 -05:00
[FEAT]: Support custom or downloadable ASR models for Meeting Assistant (e.g., Whisper, VibeVoice-ASR) #3113
Originally created by @skyking363 on GitHub (Jan 23, 2026).
What would you like to see?
Description
I would like to suggest adding the ability for users to download or load custom ASR models for the Meeting Assistant.
Currently, the default parakeet-tdt-0.6b-v3 is extremely fast but only supports 25 European languages. To support more languages like Chinese (Traditional/Simplified), users need the flexibility to choose more capable models.
Use Case
Many users in Asia require high-accuracy transcription for CJK languages. By allowing us to point to a specific model, we can balance speed and language support according to our own hardware and needs.
Proposed Solution
Instead of a hardcoded model, please consider adding a "Model Settings" section in the Meeting Assistant:
This would make the Meeting Assistant truly "Anything" and accessible to a global audience.
Thank you for your hard work on this amazing tool!
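To make the request concrete, here is a minimal sketch of what a user-selectable model setting could look like. It is not AnythingLLM's actual configuration: the `AsrModelSettings` shape, its field names, and `resolve_model` are all hypothetical, and only the default model name comes from this issue.

```python
from dataclasses import dataclass
from typing import Optional

# Default model named in this issue.
DEFAULT_MODEL = "parakeet-tdt-0.6b-v3"


@dataclass
class AsrModelSettings:
    """Hypothetical settings shape, not an existing AnythingLLM structure."""

    # Model identifier chosen by the user; falls back to the default.
    model_id: str = DEFAULT_MODEL
    # Optional path to a model the user has already downloaded.
    local_path: Optional[str] = None


def resolve_model(settings: AsrModelSettings) -> str:
    """Return the local path if one is set, otherwise the model ID.

    A blank model ID falls back to the default model.
    """
    if settings.local_path:
        return settings.local_path
    return settings.model_id.strip() or DEFAULT_MODEL
```

With something like this, leaving the settings untouched keeps the current behavior, while setting `model_id` or `local_path` would let a user swap in a model that covers their language.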
@tylerrobb commented on GitHub (Jan 23, 2026):
You read my mind! I was browsing the Hugging Face Open ASR Leaderboard and wanted to test out nvidia/canary-qwen-2.5b or nvidia/canary-1b-v2 instead.
https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
@laweschan commented on GitHub (Jan 23, 2026):
new update:
Although it is not OpenAI ASR compatible, the following model seems to rank quite well in testing and has multilingual support, so please consider it as well:
https://github.com/QwenLM/Qwen3-ASR
Yes, and I would recommend considering SenseVoice as an option, as it is well reviewed for Chinese + Cantonese + English support in STT:
https://github.com/FunAudioLLM/SenseVoice
@LeoThomassey commented on GitHub (Feb 19, 2026):
Parakeet may be good in English, but it lacks accuracy in French, for example. Whisper large is far better for multilingual use, but I can't use any other model, as I use Vulkan and have no NVIDIA GPU.
Everything in AnythingLLM is customizable except this setting, which makes the whole Meeting Assistant useless for non-English content.