Vision: Support multiple Ollama services and parallel jobs #2453

Open
opened 2026-02-20 01:11:46 -05:00 by deekerman · 1 comment

Originally created by @alexislefebvre on GitHub (Dec 9, 2025).

Confirmation

  • I checked this request against the roadmap and existing issues

What Problem Does This Solve and Why Is It Valuable?

Today we can set up Vision like this:

```yaml
Models:
- Type: caption
  Model: gemma3:latest
  Engine: ollama
  Run: auto
  Prompt: >
    Create a caption with exactly one sentence in the active voice that
    describes the main visual content. Begin with the main subject and
    clear action. Avoid text formatting, meta-language, and filler words.
  Service:
    Uri: http://ollama:11434/api/generate
```

Source: https://docs.photoprism.app/user-guide/ai/ollama-models/#gemma-3-caption

What Solution Would You Like?

I would like to be able to use several instances of Ollama, with something like this:

```yaml
Models:
- Type: caption
  Model: gemma3:latest
  Engine: ollama
  Run: auto
  Prompt: >
    Create a caption with exactly one sentence in the active voice that
    describes the main visual content. Begin with the main subject and
    clear action. Avoid text formatting, meta-language, and filler words.
  Service:
    Uri: http://ollama:11434/api/generate
- Type: caption
  Model: gemma3:latest
  Engine: ollama
  Run: auto
  Prompt: >
    Create a caption with exactly one sentence in the active voice that
    describes the main visual content. Begin with the main subject and
    clear action. Avoid text formatting, meta-language, and filler words.
  Service:
    Uri: http://another.computer.local:11434/api/generate
```

Then PhotoPrism could parallelize the Vision jobs across these two servers.

And if one server is down, it would call the other one instead.

It could also be used to configure a local instance and a SaaS instance.

It may also support a `Priority:` value:

  • if all services have the same priority, they are all called
  • if services have different priorities, the one with the highest priority is called first, and the second is called only if the first is not available

What Alternatives Have You Considered?

.

Additional Context

Well, GPUs are pretty expensive these days, so that won’t be a common situation.

And adding parallelization may be tricky.

Yet I think it would be interesting, even just as a fallback mechanism between several servers.


@lastzero commented on GitHub (Feb 11, 2026):

Interesting idea! We'll consider it when we resume work on Ollama/AI. 🤖
