Labels: Allow a model to be trained for better results #371

Open
opened 2026-02-19 23:07:27 -05:00 by deekerman · 23 comments
Owner

Originally created by @derSoerrn95 on GitHub (Aug 7, 2020).

Hi, I’ve got a question: is it possible to train the current Tensorflow Model with my own images and tags? Is there a command to start training?

The current model is not really good. It always recognizes cats as dogs or pizza as plate. Would be really nice if the model could learn my tags.

Thanks for your help. :)

@lastzero commented on GitHub (Aug 7, 2020):

See Developer Guide. It's not possible to train TensorFlow models using Go and I guess a few examples would also not lead to good results. Distributed learning might be a good approach, but then you need to deal with different user languages and privacy concerns. Also the hardware of most users won't be powerful enough for training models.

@derSoerrn95 commented on GitHub (Aug 7, 2020):

Oh sorry, I read the documents page, but - I don't know why - didn't notice the metadata section. The problem is my library has over 50,000 photos and I'm not in the mood to tag all of them.

I am with you that there are many users who do not have powerful hardware. But what about a separate Docker-Container running TensorFlow Python that does the training and shares the result with the Go container?
In this way, the user can freely decide whether he/she would like to train.

@RAYs3T commented on GitHub (Aug 7, 2020):

@derSoerrn95 Wouldn't that just require an API endpoint which provides the currently associated labels for a specific picture (plus the confidence level)?
The training container could then just grab the current labels from PP and re-evaluate the pictures with the newly (manually) added labels.

The only problem I see here is how you would merge your personal model with the public one.

Also regarding hardware, while yes a lot of people may run this on their Raspberry PI, there are also a few that actually have access to somewhat more powerful hardware. And if they have the option I'm sure there will be some that use it and are happy with it :)

I'm fairly new to the Tensorflow stuff, so I have no idea if "model merging" is a thing.
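
As a rough illustration of the label-plus-confidence idea above, a training container might filter the labels it receives before using them as training targets. This is only a sketch: the `LabelRecord` shape, the `manual` flag, and the `0.8` threshold are assumptions made for the example, not the actual PhotoPrism API response format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LabelRecord:
    """Hypothetical label record; field names are assumptions for this sketch."""
    name: str
    confidence: float  # 0.0 (unsure) to 1.0 (certain)
    manual: bool       # True if a user added or confirmed the label


def select_training_labels(records: Iterable[LabelRecord],
                           min_confidence: float = 0.8) -> List[str]:
    """Keep every manual label, plus automatic ones at or above the threshold."""
    selected = {
        record.name.lower()
        for record in records
        if record.manual or record.confidence >= min_confidence
    }
    # Sort so the result is deterministic regardless of input order
    return sorted(selected)


records = [
    LabelRecord("Dog", 0.55, manual=False),  # low-confidence guess, dropped
    LabelRecord("cat", 0.0, manual=True),    # user correction, always kept
    LabelRecord("Sofa", 0.9, manual=False),  # confident automatic label, kept
]
print(select_training_labels(records))
```

Trusting manual labels unconditionally is a design choice here; it means a single user correction outweighs any automatic guess for that picture.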

@derSoerrn95 commented on GitHub (Aug 7, 2020):

@RAYs3T Yes, it can make sense to use an API endpoint. You would then have to load all tags for each image, decide which should be used and fill the input / output tensors with them.

It is also possible to load a model that has already been learned and then continue training. But this can lead to some side effects, because data from the original data set may contradict those that you added yourself. In this case it would be better to train a new model.

The new or updated model could then be stored on a volume shared between the containers and loaded from the Go container.

But I'm not a TensorFlow pro either. I'm currently playing a bit with time series forecasting with RNNs and LSTMs. I've never worked with pictures.
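
The step of loading all tags for each image and filling the output tensors with them could look roughly like the sketch below. It uses plain Python lists instead of real tensors so it runs without TensorFlow, and the function name and the photo IDs are made up for the example.

```python
from typing import Dict, List, Tuple


def build_label_vectors(
    photo_tags: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, List[float]]]:
    """Turn per-photo tag lists into fixed-length multi-hot target vectors."""
    # Normalize tags and sort them so the column order is stable between runs
    vocabulary = sorted(
        {tag.strip().lower()
         for tags in photo_tags.values()
         for tag in tags
         if tag.strip()}
    )
    index = {tag: position for position, tag in enumerate(vocabulary)}

    targets: Dict[str, List[float]] = {}
    for photo_id, tags in photo_tags.items():
        vector = [0.0] * len(vocabulary)
        for tag in tags:
            key = tag.strip().lower()
            if key in index:
                vector[index[key]] = 1.0  # this photo carries this tag
        targets[photo_id] = vector
    return vocabulary, targets


photo_tags = {"a": ["Cat", "sofa"], "b": ["dog"], "c": []}
vocabulary, targets = build_label_vectors(photo_tags)
print(vocabulary)
print(targets["a"])
```

Each vector has one slot per known tag, so a photo with several tags sets several slots to 1.0. A training framework would consume these vectors as the expected outputs.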

@lastzero commented on GitHub (Aug 7, 2020):

Sounds like it makes most sense to implement this as a separate app / server. We already expose an API that can be used and extended if necessary.

Might be worth looking for existing software for this use case so that only some glue code needs to be developed instead of reinventing the wheel.

@dekiesel commented on GitHub (Oct 19, 2020):

[This library](https://github.com/ageitgey/face_recognition) seems like a good fit, though it is written in Python.

Users could manually tag a "few" pictures and then use those existing tags to train a model (using another container running wrapper code for this library), which is then used on the pictures that haven't been tagged.

The benefit of that approach is that every time a user corrects a match, the training data improves.

The drawback is that it's a different language than Go, which increases maintenance work.
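
To make the "tag a few, predict the rest" idea concrete, here is a minimal nearest-centroid sketch. It assumes each picture has already been turned into a numeric feature vector by some other tool (that step is not shown), and it does not use the linked library's API; the function names and the two-dimensional vectors are purely illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def centroids(
    examples: Sequence[Tuple[Sequence[float], str]],
) -> Dict[str, List[float]]:
    """Average the feature vectors of all manually tagged pictures, per tag."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vector, tag in examples:
        if tag not in sums:
            sums[tag] = [0.0] * len(vector)
            counts[tag] = 0
        sums[tag] = [s + v for s, v in zip(sums[tag], vector)]
        counts[tag] += 1
    return {tag: [s / counts[tag] for s in total] for tag, total in sums.items()}


def predict(vector: Sequence[float], model: Dict[str, List[float]]) -> str:
    """Return the tag whose centroid is closest (Euclidean) to the vector."""
    return min(model, key=lambda tag: math.dist(vector, model[tag]))


# Three manually tagged pictures, represented by made-up feature vectors
examples = [([1.0, 0.0], "cat"), ([0.8, 0.2], "cat"), ([0.0, 1.0], "dog")]
model = centroids(examples)
print(predict([0.9, 0.1], model))
print(predict([0.1, 0.7], model))
```

Every user correction just adds another tagged example, so rebuilding the centroids is cheap compared with retraining a neural network.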

@danielo515 commented on GitHub (Nov 6, 2020):

@dekiesel that library looks very promising.
I would very happily use it ad hoc, by running it over my image library and using the results to feed the tags that exist in PhotoPrism (or create new ones). Would that be possible with the existing API, @lastzero? Or will I have more luck just modifying the DB directly (hope not, hahah)?

@lastzero commented on GitHub (Nov 8, 2020):

Model training should be done separately. It's beyond the scope of what we can maintain right now and might also require different programming languages like Python. The TensorFlow API for Go is not made for model training.

@danielo515 commented on GitHub (Nov 9, 2020):

I am playing with a little proof of concept to, at least, add face recognition from the outside. I'm having a lot of fun.
I'll report back if I come up with something usable.
It is already open source in any case, if someone wants to contribute or continue it should I reach the point where I can't.

@kalon33 commented on GitHub (Sep 21, 2021):

@danielo515 Hi, any news from your work on this? Thanks :)

@lastzero commented on GitHub (Sep 21, 2021):

Face detection & recognition has been added now. To easily use additional models, it would make sense to use a standardized API designed for this purpose. First step would be to do a bit of research, e.g. figure out if that already exists or somebody is working on it. No need to reinvent the wheel.

@danielo515 commented on GitHub (Sep 24, 2021):

> @danielo515 Hi, any news from your work on this? Thanks :)

I have an MVP in my personal GitHub projects. It is publicly available, I'll post a link later.
But if it has been officially implemented, maybe that project is not worth continuing. It depends on the limitations of the official implementation. Does it support tagging any people? If it does, then my project doesn't add anything to it.

@lastzero commented on GitHub (Sep 24, 2021):

Faces are only automatically detected for now as manually selecting faces was more work for us and our users. The backend could deal with it though, just need a nice UI. May be part of a custom image viewer.

@laurac8r commented on GitHub (Feb 8, 2022):

Any updates? I can work on retraining as I am an ML engineer by profession. Can someone link helpful docs for retraining? How is the Tensorflow model trained typically?

@lastzero commented on GitHub (Feb 8, 2022):

@yarocoder It's just too much work right now. Keep in mind that we also write all the documentation and provide support for 50,000+ users. I think the first step would be to study the options and provide a decision matrix for discussion.

Technical details of the implementation are documented in the Developer Guide:

- https://docs.photoprism.app/developer-guide/metadata/classification/

The public roadmap shows which features we are currently working on:

- https://github.com/photoprism/photoprism/projects/5

@mateuszdrab commented on GitHub (Jun 16, 2022):

I too am interested in the concept of training up the model or being able to use another model, hoping it would work better.

@freman commented on GitHub (Jun 22, 2022):

One of the things that keeps me with google is it can find "license plate" or "Shrek" (my cat), I'm not opposed to training my own model, I've tagged about 300 photos... out of several thousand.

@abviv commented on GitHub (Jan 6, 2023):

@laurac8r I like this issue since it's been open for so long and my area is ML for vision, so I think I can also contribute to this effectively with my expertise. As a starter, I would investigate one major thing: the research focused on answering the questions of computation requirements (typically on a CPU), data requirements (how much data do I need?), and accuracy (which goes without saying).

@scarolan commented on GitHub (Aug 17, 2023):

+1 for allowing users to guide the model with suggestions. Not sure if this is even feasible but it would be amazing if you could crowdsource the human labor. Users could volunteer to submit their data and corrections to a central database which could be used to improve the experience for everyone.

Also please meet my pet 'Snail' Sunny. 🤣

![snail](https://github.com/photoprism/photoprism/assets/403332/07299012-29ba-4b62-b92b-3d1a63e9517b)

@craiga commented on GitHub (Jul 15, 2024):

> Sounds like it makes most sense to implement this as a separate app / server. We already expose an API that can be used and extended if necessary.

I've been thinking a bit about this, and I put together a Python script which uses AWS Rekognition or Ollama running locally to label images via the PhotoPrism API as a proof of concept.

https://gist.github.com/craiga/ac09b21908f7ddbab0bb9e899c8b07b2

Writing labels to Photoprism (which is running on my NAS) is very slow and running this script tends to overwhelm the server. It's still indexing images though, so that might have something to do with it.

Regardless, it'd be great to have an API endpoint where we could write multiple labels.
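
Until something like that exists, a client could at least group labels per photo and pace its writes so the server is not flooded. The sketch below is not taken from the linked gist, and `send` is a hypothetical callable standing in for whatever HTTP calls actually write labels; nothing here assumes a particular PhotoPrism endpoint.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def group_labels_by_photo(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect (photo_id, label) pairs into one de-duplicated list per photo."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for photo_id, label in pairs:
        if label not in grouped[photo_id]:
            grouped[photo_id].append(label)
    return dict(grouped)


def write_throttled(
    grouped: Dict[str, List[str]],
    send: Callable[[str, List[str]], None],
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send once per photo, pausing between photos; return the call count."""
    calls = 0
    for photo_id, labels in grouped.items():
        if calls > 0:
            sleep(delay_seconds)  # give the server time to catch up
        send(photo_id, labels)    # hypothetical writer, supplied by the caller
        calls += 1
    return calls


pairs = [("p1", "cat"), ("p2", "dog"), ("p1", "sofa"), ("p1", "cat")]
grouped = group_labels_by_photo(pairs)
sent: List[Tuple[str, List[str]]] = []
# A no-op sleep keeps this demonstration instant
calls = write_throttled(grouped, lambda pid, labels: sent.append((pid, labels)),
                        sleep=lambda seconds: None)
print(calls, sent)
```

Passing `sleep` as a parameter keeps the pacing testable; in real use the default `time.sleep` applies and `delay_seconds` can be tuned to what the server tolerates.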

@craiga commented on GitHub (Jul 25, 2024):

One thing I've noticed with my script is that it causes anything else happening in PhotoPrism (i.e. indexing) to slow to a crawl. To me it looks like something is causing an issue with MariaDB.

![Screenshot 2024-07-25 at 14 17 53](https://github.com/user-attachments/assets/433905f4-cb6b-47a4-bb03-3088c56fdb63)

The circled bit here is where I started my labelling script. You can see that PhotoPrism's CPU usage drops from ~280% to 0–200%, and MariaDB jumps from ~5% to 100%.

Is labelling via the API causing some kind of DB lock?

Everything is running on volume3, which is an SSD, except for the original images. This is on a Synology DS920+, which has a four-core CPU and 20GB of RAM. `photoprism-photoprism` is set up with three index workers. `photoprism-faces` is an additional container I have running `photoprism faces update` in the background. I see the same behaviour when `photoprism-faces` isn't running.

@graciousgrey commented on GitHub (Jul 25, 2024):

@craiga Thank you for sharing your script!

We are currently working on a similar solution: https://floss.social/@photoprism/112792393790157931.

The idea is to have a separate Docker service that allows PhotoPrism to use additional models to generate labels and descriptions.

@graciousgrey commented on GitHub (Dec 2, 2025):

You can now use our Ollama integration to generate labels or captions based on custom prompts:

- https://docs.photoprism.app/user-guide/ai/using-ollama/
- https://docs.photoprism.app/user-guide/ai/