Metrics: Add OpenTelemetry (OTLP) integration #2455

Open
opened 2026-02-20 01:11:46 -05:00 by deekerman · 0 comments
Owner

Originally created by @lastzero on GitHub (Dec 11, 2025).

Originally assigned to: @lastzero on GitHub.

As a professional user, I want PhotoPrism Pro to support the export of traces, metrics and logs via OpenTelemetry (for example, to SigNoz or any other OTLP-compatible backend), so that I can monitor and troubleshoot team and organisational deployments within a unified observability stack.

1. Scope / Signals

  • Primary focus: Traces + logs (metrics remain available via the existing Prometheus endpoint).
  • Optional: Native OTLP metrics export for environments that prefer push-based metrics.
  • Backend: Any OTLP-compatible backend, with SigNoz as the main documented example.

2. Dependencies & Libraries

Add / promote the following Go dependencies (SDK + contrib):

  • OpenTelemetry Go API/SDK (tracing, metrics, logs)
  • OTLP exporters (HTTP and/or gRPC)
  • Go instrumentation helpers for:
    • HTTP server (PhotoPrism REST API)
    • database/sql / ORM
    • Logging bridge, if logs are sent via OTLP instead of file/stdout

(See links below for concrete packages and docs.)

3. Configuration

Introduce a small, Pro-only config surface for OpenTelemetry:

  • Enablement

    • Global enable/disable OpenTelemetry.
    • Per-signal toggles: traces, metrics, logs.
  • OTLP endpoint & protocol

    • OTLP base endpoint (usually an OpenTelemetry Collector or SigNoz endpoint).
    • Protocol selection: HTTP vs gRPC.
    • TLS options (insecure flag, custom CA) and optional headers (for SigNoz tokens, etc.).
  • Identity / resource attributes

    • Service name (photoprism), environment (prod, staging, etc.), version.
    • Instance/node identifiers for multi-node Pro deployments.
  • Sampling & limits

    • Trace sampler configuration (e.g. parent-based, always-on, or ratio-based/percentage).
    • Reasonable defaults for span/event/attribute limits.
  • Logs

    • Log level threshold for export (e.g. warn+ only).
    • Mode:
      • “Sidecar mode”: keep stdout/file logging; logs are collected by OTel Collector.
      • “Direct OTLP mode”: send logs via OTLP (optionally also keep stdout).
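
One way to keep the config surface above small is to translate it into the standard `OTEL_*` environment variables defined by the OpenTelemetry specification, rather than inventing a parallel option set. The sketch below is a minimal, stdlib-only illustration under assumptions: the `OTelConfig` struct, its field names, and the `x-example-token` header are hypothetical and not taken from PhotoPrism, and which variables a given Go SDK version honors automatically would still need to be verified.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// OTelConfig is a hypothetical Pro-only config struct (names are illustrative).
type OTelConfig struct {
	Enabled               bool
	Traces, Metrics, Logs bool              // per-signal toggles
	Endpoint              string            // OTLP base endpoint, e.g. a Collector
	Protocol              string            // "grpc" or "http/protobuf"
	Headers               map[string]string // optional auth headers
	ServiceName           string
	Environment           string
	Version               string
	Sampler               string  // e.g. "parentbased_traceidratio"
	SampleRatio           float64 // used by ratio-based samplers
}

// Validate rejects unusable settings before any provider is created.
func (c OTelConfig) Validate() error {
	if !c.Enabled {
		return nil
	}
	if c.Endpoint == "" {
		return errors.New("otel: endpoint is required when enabled")
	}
	if c.Protocol != "grpc" && c.Protocol != "http/protobuf" {
		return fmt.Errorf("otel: unsupported protocol %q", c.Protocol)
	}
	if c.SampleRatio < 0 || c.SampleRatio > 1 {
		return fmt.Errorf("otel: sample ratio %v outside [0,1]", c.SampleRatio)
	}
	return nil
}

// Env maps the config onto spec-defined OTEL_* environment variables.
func (c OTelConfig) Env() map[string]string {
	if !c.Enabled {
		return map[string]string{"OTEL_SDK_DISABLED": "true"}
	}
	exporter := func(on bool) string {
		if on {
			return "otlp"
		}
		return "none"
	}
	env := map[string]string{
		"OTEL_EXPORTER_OTLP_ENDPOINT": c.Endpoint,
		"OTEL_EXPORTER_OTLP_PROTOCOL": c.Protocol,
		"OTEL_SERVICE_NAME":           c.ServiceName,
		// Older semantic conventions use "deployment.environment" instead.
		"OTEL_RESOURCE_ATTRIBUTES": "deployment.environment.name=" + c.Environment +
			",service.version=" + c.Version,
		"OTEL_TRACES_SAMPLER":     c.Sampler,
		"OTEL_TRACES_SAMPLER_ARG": strconv.FormatFloat(c.SampleRatio, 'g', -1, 64),
		"OTEL_TRACES_EXPORTER":    exporter(c.Traces),
		"OTEL_METRICS_EXPORTER":   exporter(c.Metrics),
		"OTEL_LOGS_EXPORTER":      exporter(c.Logs),
	}
	if len(c.Headers) > 0 {
		// Values are joined as-is; real header values may need URL-encoding.
		pairs := make([]string, 0, len(c.Headers))
		for k, v := range c.Headers {
			pairs = append(pairs, k+"="+v)
		}
		sort.Strings(pairs) // deterministic output
		env["OTEL_EXPORTER_OTLP_HEADERS"] = strings.Join(pairs, ",")
	}
	return env
}

func main() {
	cfg := OTelConfig{
		Enabled: true, Traces: true, Logs: true,
		Endpoint: "http://localhost:4318", Protocol: "http/protobuf",
		Headers:     map[string]string{"x-example-token": "secret"},
		ServiceName: "photoprism", Environment: "staging", Version: "dev",
		Sampler: "parentbased_traceidratio", SampleRatio: 0.25,
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println(err)
		return
	}
	env := cfg.Env()
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, env[k])
	}
}
```

Reusing the spec-defined variable names would let operators apply existing Collector and SigNoz documentation directly, at the cost of tying the option semantics to the OpenTelemetry specification.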

The existing Prometheus metrics endpoint stays supported for backward compatibility.

4. Implementation Proposal

  1. Initialization

    • On startup, if OpenTelemetry is enabled:
      • Create TracerProvider, MeterProvider, and optionally LoggerProvider configured with:
        • OTLP exporter (HTTP/gRPC) to the configured endpoint.
        • Resource with service/env/version attributes.
      • Register the providers globally via otel.SetTracerProvider, otel.SetMeterProvider, etc.
  2. HTTP / API layer

    • Wrap HTTP handlers / router with OpenTelemetry middleware so that:
      • Each API request is a span.
      • Spans include route, status code, user/account (where appropriate), and error info.
  3. Indexing & import pipeline

    • Create explicit spans for:
      • Full indexing/import jobs.
      • Child spans for EXIF extraction, metadata parsing, face detection, video previews, etc.
    • Attach attributes such as file.count, duration, error.count, storage.backend.
  4. Database interactions

    • Instrument database/sql or ORM calls with spans:
      • Capture query duration and key attributes (e.g. table, operation).
      • Use sampling/limits to avoid excessive span volume and high-cardinality attributes.
  5. Background jobs

    • Wrap scheduled tasks (e.g., maintenance, cleanup, thumbnails) in spans with status and runtime.
  6. Logs correlation

    • If OTLP logs are enabled:
      • Ensure logs are enriched with trace_id/span_id so that SigNoz can correlate logs ↔ traces.
    • If using sidecar/Collector log ingestion:
      • Document recommended log format so users can configure parsers in the Collector.
  7. Docs & examples (Pro)

    • Add a Pro KB/Docs page showing:
      • How to enable OpenTelemetry in PhotoPrism Pro.
      • Example SigNoz/Collector configuration.
      • Recommended defaults for sampling and log levels.

Libraries, Documentation & Examples

OpenTelemetry Go SDK:
https://github.com/open-telemetry/opentelemetry-go
https://opentelemetry.io/docs/languages/go/

OTLP exporters and configuration:
https://opentelemetry.io/docs/languages/go/exporters/
https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/
https://github.com/open-telemetry/opentelemetry-proto
https://opentelemetry.io/docs/specs/otlp/

Go instrumentation helpers:
https://github.com/open-telemetry/opentelemetry-go-contrib
https://opentelemetry.io/docs/languages/go/instrumentation/

SigNoz + OpenTelemetry:
https://signoz.io/docs/instrumentation/
https://signoz.io/docs/instrumentation/opentelemetry-golang/

Background / examples:
https://emdneto.github.io/opentelemetry-by-example/go/
https://betterstack.com/community/guides/observability/otlp/


Acceptance Criteria

  • When OpenTelemetry is enabled in PhotoPrism Pro, it exports traces (and optionally logs/metrics) via OTLP to a configurable endpoint (e.g. SigNoz / OTel Collector).
  • When OpenTelemetry is disabled, PhotoPrism runs without initializing OTel exporters/providers and behaves exactly as before.
  • HTTP API requests are traced as spans (with route, method, status, error state) and are visible in SigNoz grouped by endpoint.
  • Indexing/import operations and key pipeline steps (e.g. EXIF/metadata, face detection, previews) emit spans so that slow or failing stages can be identified in SigNoz.
  • Database operations are instrumented so that slow queries appear as child spans of the corresponding API or job traces.
  • Background jobs (maintenance, cleanup, reindex, etc.) emit spans with status and duration, clearly distinguishable from API request traces.
  • If OTLP log export is enabled, logs include trace/span IDs so log entries can be correlated with traces in SigNoz; if disabled, existing stdout/file logging continues to work unchanged.
  • Documentation includes a short “How to enable OpenTelemetry (SigNoz)” section with example configuration and a minimal OTel Collector / SigNoz setup.