Limit transcription CPU threads #5595

Open
opened 2026-02-22 11:46:22 -05:00 by deekerman · 3 comments
Owner

Originally created by @JohnXLivingston on GitHub (Jul 5, 2024).

Describe the problem to be solved

v6.2.0-RC1 comes with an incredible feature: automatic subtitles generation.

But the models that are used can use a lot of CPU and RAM.
I did not see any option to limit their usage (as we can with video transcoding).

Describe the solution you would like

Would it be possible to add some options?


@Chocobozzz commented on GitHub (Jul 10, 2024):

I don't think we can limit RAM, but for CPU, yes: there is a `threads` option

  --threads THREADS     number of threads used for CPU inference (default: 0)
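A minimal sketch of how a runner could cap inference threads with that option. This assumes the `whisper-ctranslate2` binary is installed and on PATH; the helper name `transcribe_cmd` and the model/audio arguments are illustrative, not part of PeerTube:

```python
# Hedged sketch: build a whisper-ctranslate2 invocation that caps CPU
# inference threads. Only --threads is taken from the help text above;
# treat the other flags as assumptions to verify against your install.
import shlex

def transcribe_cmd(audio_path: str, model: str = "tiny", threads: int = 2) -> list[str]:
    """Build the command line; --threads 0 (the default) lets the runtime decide."""
    return ["whisper-ctranslate2", "--model", model,
            "--threads", str(threads), audio_path]

cmd = transcribe_cmd("talk.mp3", model="medium", threads=4)
print(shlex.join(cmd))  # whisper-ctranslate2 --model medium --threads 4 talk.mp3
# On a machine with the tool installed, pass it to subprocess.run(cmd, check=True).
```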

@lutangar commented on GitHub (Jul 18, 2024):

RAM usage largely depends on model size, since the model must be loaded into RAM (multiplied by the number of runners, since models aren't shared in memory).

For example, with whisper-ctranslate2 (which uses faster-whisper, which is CPU-friendly), the models tend to be larger than the ones provided for openai-whisper:

  • large-v3 ~3GB
  • medium ~1.5GB
  • tiny ~75MB

Of course, transcript quality will get worse the further you decrease the size.

https://huggingface.co/Systran
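The rule of thumb above (each runner loads its own copy of the model) can be sketched as a quick estimate. The sizes are the approximate figures quoted in this comment; the helper name is illustrative:

```python
# Hedged sketch: estimate peak model RAM from model size x runner count.
# Approximate faster-whisper model sizes quoted above, in GB.
MODEL_SIZE_GB = {"large-v3": 3.0, "medium": 1.5, "tiny": 0.075}

def estimated_model_ram_gb(model: str, runners: int) -> float:
    # Models are not shared in memory, so each runner adds a full copy.
    return MODEL_SIZE_GB[model] * runners

print(estimated_model_ram_gb("medium", 2))  # 3.0
```

Actual peak usage will be higher once inference buffers and the audio itself are counted; this only covers the model weights.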


@JohnXLivingston commented on GitHub (Jul 18, 2024):

Thanks for the information @lutangar! Good to know.
