CLI: Add "photoprism vision pull" command to download AI models #2400

Open
opened 2026-02-20 01:10:59 -05:00 by deekerman · 2 comments
Owner

Originally created by @lastzero on GitHub (Sep 14, 2025).

As an Admin/Developer who uses Ollama or other custom models in vision.yml, I want a single command that automatically downloads all referenced models, so I don't need to execute commands in containers or manually search for TensorFlow model archives.

Introduce a new CLI subcommand photoprism vision pull that automatically downloads all vision models referenced in vision.yml, so admins no longer need to exec into the ollama container and run ollama pull manually:

photoprism vision pull [flags]

For additional context and examples, see:

  • Technical Draft Proposal: photoprism-vision-pull-cli-draft.pdf (https://github.com/user-attachments/files/22320121/photoprism-vision-pull-cli-draft.pdf) and photoprism-vision-pull-cli-draft.md (https://github.com/user-attachments/files/22320146/photoprism-vision-pull-cli-draft.md)
  • Caption Generation, Step 2: Download Models: https://docs.photoprism.app/developer-guide/vision/caption-generation/
  • Computer Vision CLI Sub-Commands: https://docs.photoprism.app/developer-guide/vision/cli/

Default Behavior (no flags/args)

  1. Read the configured models from the vision.yml file, like the other commands do (e.g. vision ls).
  2. For each model:
    • Ollama: call the local Ollama API to pull the model tag (e.g., name:version). This replaces the manual docker compose exec ollama ollama pull … step. Re-triggering a pull is safe (idempotent).
    • TensorFlow: download the referenced archive (if a direct link is provided) or resolve it from a repository base URL, then extract to assets/models/<name>/ such that each model resides in a subdirectory with the same name.
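
The Ollama step above could be sketched as follows. This is a minimal illustration, assuming the documented Ollama pull endpoint (POST /api/pull with a JSON body naming the model tag); the helper names (pullRequestBody, pullOllamaModel) are hypothetical and not the actual PhotoPrism code:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// pullRequestBody builds the JSON body for Ollama's POST /api/pull endpoint,
// which accepts {"model": "<name:tag>"} and streams progress as JSON lines.
func pullRequestBody(tag string) ([]byte, error) {
	return json.Marshal(map[string]string{"model": tag})
}

// pullOllamaModel triggers a pull for one model tag. Re-running it when the
// model is already present returns quickly, so the operation is idempotent.
func pullOllamaModel(baseURI, tag string) error {
	body, err := pullRequestBody(tag)
	if err != nil {
		return err
	}
	resp, err := http.Post(baseURI+"/api/pull", "application/json", bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("ollama service not reachable at %s: %w", baseURI, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("pull %s failed: %s", tag, resp.Status)
	}
	return nil
}

func main() {
	body, _ := pullRequestBody("qwen3-vl:4b-instruct")
	fmt.Println(string(body)) // {"model":"qwen3-vl:4b-instruct"}
}
```

In a dry run, only the request body would be printed; the actual HTTP call is issued per configured tag otherwise.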

Provider-specific Details

  • Ollama
    • Precondition: the Ollama service is running and reachable (URI from model Service.Uri in vision.yml or a sensible default such as http://ollama:11434).
    • Implementation: use the Ollama pull endpoint to trigger/stream progress for each tag; repeat invocations should be quick when artifacts are present.
    • Error handling: if the API is unreachable or returns an error (e.g., model/tag not found), fail with a clear message and non-zero exit code.
  • TensorFlow
    • Location: extract each model to assets/models/<name>/ (or models path configured via existing CLI/env).
    • Sources:
      1. Direct archive URL per model in vision.yml (e.g., https://dl.photoprism.app/tensorflow/models/inception-v3-tensorflow2-classification-v2.tar.gz), or
      2. A repository base set via a flag or config (default could be https://dl.photoprism.app/tensorflow/models/), combined with a model-specific filename convention.
    • Validation: support optional checksum verification (e.g., SHA256) per model or via companion files; refuse to use corrupted downloads.
    • Idempotency: skip download if target folder exists with a completion marker and (optional) checksum matches; provide --force to re-download.
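
The validation and idempotency rules above could look like this sketch. The marker filename (.download-complete) and helper names are assumptions for illustration:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// sha256Hex returns the hex-encoded SHA256 digest of data, for comparison
// against a checksum configured per model or read from a companion file.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// needsDownload decides whether a TensorFlow model must be (re-)downloaded:
// skip only if a completion marker exists, the optional checksum matches,
// and --force was not given.
func needsDownload(modelDir string, checksumOK, force bool) bool {
	if force {
		return true
	}
	if _, err := os.Stat(filepath.Join(modelDir, ".download-complete")); err != nil {
		return true // no completion marker yet
	}
	return !checksumOK
}

func main() {
	fmt.Println(sha256Hex([]byte("hello")))
	fmt.Println(needsDownload("/nonexistent", true, false)) // true: no completion marker
}
```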

Proposed Flags (new)

  • --models name1,name2 Only pull the specified models; overrides the full list from vision.yml.
  • --dry-run Print planned actions without performing any changes; always exit 0.
  • --parallel N Limit concurrent downloads (default 2–4; tuneable).
  • --timeout 5m Per-download timeout.
  • --retries N Retry transient failures N times with backoff.
  • --ollama-uri URL Override Ollama base URI auto-detected from vision.yml.
  • --tf-repo URL Base URL for TensorFlow models when no direct archive URL is given (default can be https://dl.photoprism.app/tensorflow/models/).
  • --tf-strip-components N Strip leading path components during extraction (tar-like).
  • --checksum SHA256 Optional checksum override (also supported per-model in vision.yml).
  • --json Emit machine-readable output (one object per model).
  • --force Re-download and overwrite existing TensorFlow model folders.

UX and Output

  • Human-friendly progress with one line per model and compact summary.
  • When pulling via Ollama, stream or summarize progress if available.
  • For TensorFlow archives, show download size, rate, and extraction steps.
  • Log actionable errors (e.g., Ollama service not reachable at <uri>, checksum mismatch, insufficient disk space).

Config and Paths

  • Respect existing flags/env (e.g., --vision-yaml, --models-path) used by other vision subcommands so behavior is predictable.
  • Models path defaults remain unchanged; no mounts or ports are altered by this command.

Security and Networking

  • Do not open or expose ports. Use local or private service URIs from vision.yml or the defaults.
  • Ensure that no credentials are logged. Redact URLs or tokens if they are present in the configuration.
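
Credential redaction could be sketched as below; the query-parameter names checked (token, api_key, key) are illustrative assumptions, not a definitive list:

```go
package main

import (
	"fmt"
	"net/url"
)

// redactURL strips userinfo and common token query parameters from a URL
// before it is logged, so credentials from vision.yml never reach the logs.
func redactURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "(invalid url)"
	}
	if u.User != nil {
		u.User = url.User("redacted")
	}
	q := u.Query()
	for _, k := range []string{"token", "api_key", "key"} {
		if q.Has(k) {
			q.Set(k, "redacted")
		}
	}
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	fmt.Println(redactURL("https://user:secret@ollama.example:11434/api/pull?token=abc123"))
}
```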

Backward Compatibility

  • No changes to existing behavior of vision ls / save / run.
  • Users may continue to pull via docker compose exec ollama … if preferred.

Testing Plan (high level)

  • Add/update unit tests for parsing vision.yml, provider resolution, retry/backoff, checksum verification, and idempotency markers.
  • Perform integration tests with a small Ollama model to verify pull, idempotency, and error handling when the daemon stops or becomes unreachable.
  • Perform integration tests for TensorFlow: direct archive link, repo-based resolution, re-download with the --force flag, and handling of checksum mismatches.

Documentation Updates

  • Update the Caption Generation docs to recommend using the photoprism vision pull command instead of executing the "pull" command in the Ollama container.
  • Expand the Computer Vision Commands page to include usage, flags, examples, and troubleshooting instructions.

Acceptance Criteria

  • The CLI exposes a new subcommand photoprism vision pull that appears in photoprism vision --help.
  • Running photoprism vision pull with no args reads vision.yml and attempts to pull all listed models.
  • Ollama models:
    • Given a reachable Ollama service, the command triggers a pull for each configured tag via the Ollama API and returns exit code 0 on success.
    • Re-running the command when models are already present completes quickly without error (idempotent).
    • When Ollama is unreachable, the command exits non-zero and prints a clear error suggesting how to set or override the URI.
  • TensorFlow models:
    • If a model entry specifies a direct archive URL, the archive is downloaded and extracted to assets/models/<name>/; a completion marker is written on success.
    • If only a model name is given, the command resolves the URL using the repository base (default or --tf-repo) and then downloads/extracts as above.
    • If the target folder already exists with a completion marker and optional checksum match, the command skips download unless --force is provided.
    • If a checksum is provided and does not match, the command fails with a non-zero exit code and a clear error.
  • Flags:
    • --models limits the operation to the specified subset.
    • --dry-run performs no network or filesystem changes and exits 0 after printing the plan.
    • --parallel N caps concurrent downloads to N.
    • --ollama-uri overrides the Ollama base URI used for pulls.
    • --tf-repo overrides the base used to resolve TensorFlow archive URLs when no direct URL is specified.
    • --timeout and --retries affect both Ollama-triggered pulls (as applicable) and TensorFlow HTTP downloads.
    • --json emits structured output that includes model name, provider, action (skipped/pulled/updated), status, and error (if any).
    • --force re-downloads TensorFlow models even when a completion marker exists.
  • Config/paths:
    • The command honors existing flags/env such as --vision-yaml and --models-path used by other vision subcommands.
    • TensorFlow models end up in assets/models/<name>/ by default if no alternative models path is set.
  • Exit codes:
    • Exit 0 when all requested models are already present or successfully pulled.
    • Exit non-zero if any requested model fails to pull or validate.
  • Logging:
    • Progress and summary are printed in human-readable form.
    • Errors include actionable guidance (e.g., misconfigured URI, missing model, insufficient permissions, low disk space).
  • Tests and documentation:
    • Unit and integration tests cover the behaviors above, including idempotency and error paths.
    • Documentation is updated in both caption-generation and Vision CLI pages, with examples and troubleshooting notes.

@longnguyen-tech commented on GitHub (Dec 1, 2025):

Hi, Michael,
This is Long Mau Nguyen, I am super interested in this project.
This is my portfolio.
https://portfolio-long-lemon.vercel.app/

I am looking forward to hearing from you.
Best Regards,
Long


@alexislefebvre commented on GitHub (Dec 10, 2025):

Is it still necessary?

I saw this call in the Ollama logs:

2025-12-10T12:33:14.516322842Z [GIN] 2025/12/10 - 12:33:14 | 200 |          1m3s |       127.0.0.1 | POST     "/api/pull"

It looks like it would trigger the pull, so that docker compose exec ollama ollama pull qwen3-vl:4b-instruct or photoprism vision pull won’t be necessary?


Update: nevermind, I deleted the models and started docker compose exec photoprism photoprism vision run -m labels --count 1 --force, the same log didn’t appear. So I don’t know what happened.
