AI: Improve Facial Recognition #2226

Open
opened 2026-02-20 01:08:16 -05:00 by deekerman · 8 comments
Owner

Originally created by @graciousgrey on GitHub (Dec 16, 2024).

**As a user of PhotoPrism's face recognition feature, I would like a more capable model and a vector database to be used to improve the performance and accuracy of matches.**

Since the face detection and recognition functionality was first implemented in PhotoPrism, there has been a lot of progress in both models and database technology, for example:

- https://mariadb.com/kb/en/vector-functions/

Related Issues:

- https://github.com/photoprism/photoprism/issues/3124
- https://github.com/photoprism/photoprism/issues/4328
- https://github.com/photoprism/photoprism/issues/1587
- https://github.com/photoprism/photoprism/issues/5167

@theshadow27 commented on GitHub (Jan 2, 2025):

Questions for discussion/refinement:

  1. **Database Support:** [MariaDB](https://mariadb.com/kb/en/vector-functions/) only, or also [SQLite](https://github.com/asg017/sqlite-vec)?
  2. **Backwards Compatibility:** All-in (add columns/indexes to existing tables) or incremental (new tables with an FK to the old tables, e.g. `markers_v`)? Will this be a 'major' release that requires users to upgrade to the latest DB or install extensions in order to move to it?
  3. **Scope of enhancement:** A drop-in replacement of the current `match` with vector distance search that preserves all other logic, or a redesign that leverages HNSW indexes with more advanced methods (HDBSCAN, temporal clustering, iterative learning, remove minimization/optimization)? Should a time dimension or metric be incorporated to assist with children/aging?
  4. **Architecture Reevaluation:** Keep re-implementing algorithms in Go, or use gRPC to talk to a Python container dedicated to face recognition, clustering, and matching? This could render the first 3 considerations moot - all existing code remains, with a feature flag to bypass it when the service is present.
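
For question 3, the "drop-in" option roughly means swapping a linear scan over stored embeddings for an indexed nearest-neighbor lookup. The following is only a minimal sketch of the linear scan side, assuming Euclidean distance; the names `nearest`, `candidates`, and `max_dist` are hypothetical and this is not PhotoPrism's actual `match` code.

```python
import math

def nearest(query, candidates, max_dist):
    """Brute-force scan: return (label, distance) of the closest
    candidate within max_dist, or None if nothing is close enough."""
    best = None
    for label, vec in candidates.items():
        d = math.dist(query, vec)  # Euclidean distance
        if d <= max_dist and (best is None or d < best[1]):
            best = (label, d)
    return best

# Toy 3-D vectors; real face embeddings have many more dimensions.
candidates = {
    "alice": [0.0, 0.0, 1.0],
    "bob": [1.0, 0.0, 0.0],
}

hit = nearest([0.1, 0.0, 0.9], candidates, max_dist=0.5)
miss = nearest([5.0, 5.0, 5.0], candidates, max_dist=0.5)
```

Every query touches every stored vector, so the cost grows with the number of markers. An HNSW index answers the same kind of query approximately without visiting every vector, which is the trade-off behind the "drop-in vs. redesign" question.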

@lastzero commented on GitHub (Jan 13, 2025):

@theshadow27 Thanks for your questions!

  1. Since the vector features are only available with the latest/upcoming version of MariaDB, we will/should keep the current implementation for SQLite until it eventually offers the same or similar functionality (I believe there are patches in the works, but feel free to do your own research and report back).
  2. As for determining the best upgrade/migration strategy, that would be part of this issue, i.e. we need to first test for feasibility and then work out a plan before implementing the final solution. We could use a different/new database table to store the face embeddings if the same one cannot/should not be used, e.g. for performance reasons. However, adding a second column with the native vector type (or replacing the current one), if the database is a supported MariaDB version, seems easiest to implement.
  3. As I understand the MariaDB documentation, there's only one VECTOR type and INDEX to choose from at the moment. Also, we shouldn't make things too complex, but rather make small, iterative improvements where possible. You are welcome to make suggestions for additional/future improvements though.
  4. Ideally, we can avoid the need for a microservice architecture with a dedicated Python service for core functionality like face recognition - especially if our database or a native Go library can provide the same functionality without adding a ton of extra dependencies and requiring more expertise to set up and run our software.

@LiquidDegu commented on GitHub (Apr 29, 2025):

To be honest, I think the face detection algorithm is a bigger problem than the recognition one. Would it be possible to handle this a little bit like the other AI models, where we could have an API and then pass the snippets for recognition back to the TensorFlow algorithm?


@theshadow27 commented on GitHub (Jul 2, 2025):

> To be honest, I think the face detection algorithm is a bigger problem than the recognition one. Would it be possible to handle this a little bit like the other AI models, where we could have an API and then pass the snippets for recognition back to the TensorFlow algorithm?

While there is room for improvement in the face detection algorithm, the current clustering and brute-force distance calculation approach is limiting on larger libraries, causes other issues (#3292), and blocks others (#1595, #1587). The multistep process of creating clusters and then assigning faces to them was an elegant solution before vector databases and GPU acceleration, but it is really showing its age and needs to be replaced. For example, it's basically impossible with the current approach to use time, location, or even the IMG_XXXX sequence number to improve recognition accuracy, because the marker has to be translated to a face (cluster) before it can be matched to a subject. The centroid of the face also shifts constantly, which can be highly problematic in parent-child or similar-sibling situations.

Not saying that face detection doesn't need improvement, just that the concerns should be kept separate. But I would suggest implementing this (vector-search) update **before** improving the detection; if suddenly 25% more faces are detected with the current algorithm, it would be very problematic, at least for me - I have 93k markers, so the full scan now takes 12+ hours even on high-end hardware (dual Xeon, 48 cores, 256 GB). Just my $0.02
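
The centroid shift described above can be illustrated with a small sketch. It assumes a face cluster is summarized by the plain mean of its member embeddings and that matching is a simple distance threshold; both are simplifications for illustration, and the names and the 0.4 threshold are hypothetical rather than taken from PhotoPrism's code.

```python
import math

def centroid(embeddings):
    """Component-wise mean of equally sized embedding vectors."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]

THRESHOLD = 0.4  # hypothetical match distance, for illustration only

# Toy 2-D "embeddings" of a parent's face.
parent_faces = [[0.0, 0.0], [0.2, 0.0]]
before = centroid(parent_faces)

# One face of a similar-looking child is wrongly added to the cluster.
after = centroid(parent_faces + [[0.9, 0.0]])
shift = math.dist(before, after)

# Another face of the child is now tested against the cluster.
probe = [0.6, 0.0]
matched_before = math.dist(probe, before) <= THRESHOLD
matched_after = math.dist(probe, after) <= THRESHOLD
```

In this toy case the probe is rejected by the original centroid but accepted once the centroid has drifted, so a single wrong assignment makes further wrong assignments more likely.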

@dror3go commented on GitHub (Dec 4, 2025):

After upgrading to the latest and greatest, I've decided to reset the faces in my library and start from scratch, since I had some issues with the existing face tags.

The new version, using the default face configuration, is much better at face detection but less accurate at clustering in my tests.
For example: my sister, who appears in several hundred photos, is not shown in the list of detected faces, while a few clusters contain different people: two friends who don't look alike (they both have a beard) share a cluster, and my dad and grandfather also share a single cluster.

I'm OK with sharing some relevant photos in private.


@lastzero commented on GitHub (Dec 7, 2025):

@dror3go In this release, we increased the PHOTOPRISM_FACE_CLUSTER_RADIUS from 0.35 to 0.42. It can now be configured as shown below. Since we also decreased the value of PHOTOPRISM_FACE_MATCH_DIST from 0.46 to 0.4, the maximum allowed distance for matching has increased slightly, from 0.77 to 0.42 + 0.4 = 0.82. This means that cluster sizes are more variable, and directly compared faces are less likely to match.

👉 Since all variables are exposed via configuration, you can revert to the original settings if they work better for you.

We expect additional matching improvements once we replace the FaceNet embeddings with a different model. However, this will render the existing embeddings obsolete. Since our update already included a long list of major changes, we decided to save this change for next year. Additionally, this will require a database with vector support, meaning users will need to update their MariaDB, which is another breaking change.

### Previous Available Config Options & Defaults

| Environment | CLI Flag | Default | Description |
|:------------------------------|:---------------------|:--------|:------------------------------------------------------------------------|
| PHOTOPRISM_FACE_SIZE | --face-size | 50 | minimum size of faces in `PIXELS` (20-10000) |
| PHOTOPRISM_FACE_SCORE | --face-score | 9 | minimum face `QUALITY` score (1-100) |
| PHOTOPRISM_FACE_OVERLAP | --face-overlap | 42 | face area overlap threshold in `PERCENT` (1-100) |
| PHOTOPRISM_FACE_CLUSTER_SIZE | --face-cluster-size | 80 | minimum size of automatically clustered faces in `PIXELS` (20-10000) |
| PHOTOPRISM_FACE_CLUSTER_SCORE | --face-cluster-score | 15 | minimum `QUALITY` score of automatically clustered faces (1-100) |
| PHOTOPRISM_FACE_CLUSTER_CORE | --face-cluster-core | 4 | `NUMBER` of faces forming a cluster core (1-100) |
| PHOTOPRISM_FACE_CLUSTER_DIST | --face-cluster-dist | 0.64 | similarity `DISTANCE` of faces forming a cluster core (0.1-1.5) |
| PHOTOPRISM_FACE_MATCH_DIST | --face-match-dist | 0.46 | similarity `OFFSET` for matching faces with existing clusters (0.1-1.5) |

### Config Options & Defaults in the New Release

| Environment | CLI Flag | Default | Description |
|:---------------------------------|:------------------------|:-------------|:------------------------------------------------------------------------|
| PHOTOPRISM_FACE_ENGINE | --face-engine | auto | face detection engine `NAME` (auto, pigo, onnx) |
| PHOTOPRISM_FACE_ENGINE_THREADS | --face-engine-threads | 0 | face detection thread `COUNT` (0 uses half the available CPU cores) |
| PHOTOPRISM_FACE_SIZE | --face-size | 25 | minimum size of faces in `PIXELS` (20-10000) |
| PHOTOPRISM_FACE_SCORE | --face-score | 9 | minimum face `QUALITY` score (1-100) |
| PHOTOPRISM_FACE_ANGLE | --face-angle | -0.3, 0, 0.3 | face detection `ANGLE` in radians (repeatable) |
| PHOTOPRISM_FACE_OVERLAP | --face-overlap | 42 | face area overlap threshold in `PERCENT` (1-100) |
| PHOTOPRISM_FACE_CLUSTER_SIZE | --face-cluster-size | 60 | minimum size of automatically clustered faces in `PIXELS` (20-10000) |
| PHOTOPRISM_FACE_CLUSTER_SCORE | --face-cluster-score | 20 | minimum `QUALITY` score of automatically clustered faces (1-100) |
| PHOTOPRISM_FACE_CLUSTER_CORE | --face-cluster-core | 4 | `NUMBER` of faces forming a cluster core (1-100) |
| PHOTOPRISM_FACE_CLUSTER_DIST | --face-cluster-dist | 0.64 | similarity `DISTANCE` of faces forming a cluster core (0.1-1.5) |
| PHOTOPRISM_FACE_CLUSTER_RADIUS | --face-cluster-radius | 0.42 | maximum cluster `RADIUS` accepted for automatic matches (0.1-1.5) |
| PHOTOPRISM_FACE_COLLISION_DIST | --face-collision-dist | 0.05 | minimum collision discrimination `DISTANCE` (0.01-1) |
| PHOTOPRISM_FACE_EPSILON_DIST | --face-epsilon-dist | 0.01 | collision tolerance `DELTA` appended to max match distances (0.001-0.1) |
| PHOTOPRISM_FACE_MATCH_DIST | --face-match-dist | 0.4 | similarity `OFFSET` for matching faces with existing clusters (0.1-1.5) |
| PHOTOPRISM_FACE_SKIP_CHILDREN | --face-skip-children | | skips automatic matching of child face embeddings |
| PHOTOPRISM_FACE_ALLOW_BACKGROUND | --face-allow-background | | allows matching of probable background embeddings |
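
The threshold arithmetic in this comment can be written out as a short sketch. It assumes, as the comment states, that the maximum allowed matching distance is simply the cluster radius plus the match offset; the function names are hypothetical, and the other options in the table (for example the collision and epsilon distances) are deliberately ignored here.

```python
def max_match_distance(cluster_radius, match_dist):
    """Upper bound described in the comment: radius plus match offset.
    Rounded to two decimals to hide floating-point noise."""
    return round(cluster_radius + match_dist, 2)

def within_limit(distance, cluster_radius=0.42, match_dist=0.4):
    """True if a distance falls inside the combined limit.
    Defaults are the new-release values from the table above."""
    return distance <= max_match_distance(cluster_radius, match_dist)

new_limit = max_match_distance(0.42, 0.4)  # new-release defaults
accepted = within_limit(0.80)
rejected = within_limit(0.85)
```

Passing different values for `cluster_radius` and `match_dist` shows how changing `PHOTOPRISM_FACE_CLUSTER_RADIUS` or `PHOTOPRISM_FACE_MATCH_DIST` would move this combined limit under the same assumption.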

@ptr727 commented on GitHub (Dec 14, 2025):

Question: is it possible to use an external AI vendor for face recognition, similar to e.g. the vision service? I'd gladly pay Google, OpenAI, etc. to get excellent face recognition for the cases that are currently poorly supported, specifically children, Asian features, and pets. As it is, Immich does much better here, but it lags in other areas.


@lastzero commented on GitHub (Dec 14, 2025):

We expect matching to improve once we replace the FaceNet embeddings with a different model. However, this will render the existing embeddings obsolete. Since our update already included a long list of major changes, we decided to save this change for next year. Additionally, this will require a database with vector support, meaning users will need to update their MariaDB, which introduces another breaking change. For more details, please see my comment above.

Note that manual face tagging is also in development and will be available in an upcoming release. Any help with testing will be much appreciated once we have merged this PR:

- https://github.com/photoprism/photoprism/pull/5081