Thoughts on object identification labeling #297

Closed
opened 2026-02-19 23:05:52 -05:00 by deekerman · 5 comments
Owner

Originally created by @wiwie on GitHub (Jun 14, 2020).

Originally assigned to: @lastzero on GitHub.

Maybe it would make sense to split this issue up, but I wasn't sure whether some of these points had already been answered elsewhere and I simply missed it. So I wanted to discuss them first before creating more potential "noise" in the issue tracker.

### Object identification / labeling performs poorly for some labels (e.g. baby)

My picture library contains a fair number of baby, child, and family pictures. While I would estimate that at least 5k pictures contain babies, fewer than 100 of them are labeled as such. Some that contain a frontal shot of a baby's face are labeled "Portrait" instead, but many carry no label related to baby, child, or person at all.

### Too many moments

A lot of pictures are also labeled "Moment", more than I would expect, and quite a few of them I really wouldn't call a moment. It's a bit puzzling to me what the "Moment" label is supposed to express (in theory); I can't see a pattern in the pictures it is assigned to.

### Mislabelings for everyday scenes

Overall there are many mislabelings, and they seem to occur mostly on pictures of everyday situations rather than on clear, distinctive subjects like an ape or a zebra. Has the model perhaps not been trained on this kind of personal picture? Sorry if this has been described elsewhere and I missed it.

### No multi-labeling?

Also, so far I couldn't find any picture that has been assigned more than one object label. Is this prohibited by design? That would be quite a limitation. Some pictures in my collection are labeled with one thing, which is correct, but the more obvious label is missing.

deekerman 2026-02-19 23:05:52 -05:00
  • closed this issue
  • added the "question" label

@lastzero commented on GitHub (Jun 15, 2020):

First of all, thanks for being a sponsor :)

> Maybe it would make a lot of sense, to split this issue up. I just wasn't sure whether some of the aspects have been answered elsewhere, and I just missed it. Hence, I wanted to discuss some points first, before creating any more potential "noise" in the issue tracker.

These points have been discussed in various places, but I'll give you a summary. We're currently working on our user guide, which will (hopefully) answer many of these questions in the future.

> Object identification / labeling performs poorly for some labels (e.g. baby):

We sadly don't have many personal photos of babies, but we still did our best to find a probability threshold that works for "baby" (and for every other label). After getting a lot of negative feedback about false positives, we decided to raise the thresholds, even if that means some objects go undetected. It's always a tradeoff, and you can spend weeks, even years, finding the best settings and the best model for image classification.

Note that we're using a mobile network that has a lower input resolution than other models but is faster and much smaller (20 MB vs 600 MB). We could zoom in to identify smaller objects such as babies, but that reduces indexing performance, so we might get negative feedback for that too. Building consumer software for a wide range of use cases is hard.
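The per-label threshold tradeoff described above can be sketched roughly as follows. This is not PhotoPrism's actual code; the label names, threshold values, and the `labelsAbove` function are hypothetical, purely to illustrate how raising a label's threshold trades missed detections for fewer false positives:

```go
package main

import "fmt"

// Hypothetical per-label probability thresholds. Raising a threshold
// (as described for "baby" after false-positive feedback) reduces
// false positives at the cost of missing some true objects.
var thresholds = map[string]float32{
	"baby":     0.75, // raised: babies are easily confused with other subjects
	"portrait": 0.60,
	"zebra":    0.40, // distinctive objects can afford a lower bar
}

// labelsAbove keeps every predicted label whose confidence clears its
// threshold, so a single photo can receive multiple labels at once.
func labelsAbove(pred map[string]float32) []string {
	var out []string
	for label, p := range pred {
		if t, ok := thresholds[label]; ok && p >= t {
			out = append(out, label)
		}
	}
	return out
}

func main() {
	pred := map[string]float32{"baby": 0.55, "portrait": 0.80}
	fmt.Println(labelsAbove(pred)) // "baby" at 0.55 is dropped; prints [portrait]
}
```

With a lower "baby" threshold the same prediction would yield both labels, which is exactly the false-negative side of the tradeoff.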

> Too many moments

The label was originally "swing". The model did poorly on that, but we always saw people enjoying a "moment", so we went with that instead. If you don't like a label or find it useless, you can delete it.

> Mislabelings for everyday scenes

It's a general object classification model, not one specialized for scenes. That's on our todo list, see #175.

See https://github.com/photoprism/photoprism/wiki/Image-Classification for details regarding the technical implementation.

> No multi-labeling?

Of course you can have as many labels for photos and videos as you like. Our demo contains several examples:

![Screenshot 2020-06-15 at 08 37 42](https://user-images.githubusercontent.com/301686/84625443-8cb98880-aee3-11ea-92ed-2103e3cafcdb.png)

@sam2kb commented on GitHub (Jun 18, 2020):

Can the model be improved by contributing the existing data? I've indexed a pretty big photo archive. Can that be of any help?


@lastzero commented on GitHub (Jun 18, 2020):

Distributed learning is something we can experiment with once our first release is out the door... I don't think there are commercial solutions for this problem yet.


@lastzero commented on GitHub (Jun 23, 2020):

Closing this since there was no more feedback. Please add new issues for specific feature requests or bugs :)


@wiwie commented on GitHub (Jun 28, 2020):

Sorry for not getting back to this issue earlier. I needed to wait until I had time to wrap my head around your reply and the referenced issues. Everything you write makes sense; it's not trivial, I get it.

If we forget about the babies though and ask the other way around:

  • Which labels should be working? Is it the ones defined [here](https://github.com/photoprism/photoprism/blob/develop/internal/classify/rules.yml)?
  • Would it be useful for us users to provide feedback on false positives/negatives for these labels, possibly in #160?

Also, I think it would be good to expose the threshold settings in the UI. Advanced users who want to tweak the labels know best which labels are relevant to them, so it would be great if thresholds could be adjusted directly in the UI; normal users would simply ignore them. I know that's extra UI work for possibly a small fraction of users. Also, would updating the thresholds require recompiling the model? How long would that take?
