starred/photoprism

mirror of https://github.com/photoprism/photoprism.git synced 2026-03-02 22:57:18 -05:00

Proof-of-concept for scene category classification #144

Open

opened 2026-02-19 23:02:57 -05:00 by deekerman · 8 comments

deekerman commented

2026-02-19 23:02:57 -05:00

Owner

Originally created by @lastzero on GitHub (Dec 30, 2019).

As a PhotoPrism user, I want my photos classified by scene category so that I can filter search results by scene and get better image titles.

While we already label certain scenes based on the objects we find, we don't yet have a specialized TensorFlow model for this, e.g. AlexNet-places365, GoogLeNet-places365, or ResNet152-places365. See also http://places2.csail.mit.edu/download.html.

Ideally, we can reuse our existing Go TensorFlow code for this, but each model differs in how it must be used. Our NSFW detector, for example, needs different input values than our NASNet model for object classification, so we ended up using different code and different packages. For scene detection, it might be good to create a new scene package, unless merging it with NASNet gives us much better performance.
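To illustrate why each model may need its own input code, here is a minimal Go sketch using only the standard library. `InputSpec`, `Normalize`, `Prepare`, and `testImage` are hypothetical names, not part of PhotoPrism, and the mean, scale, and channel-order values are placeholders rather than the documented requirements of any model mentioned in this issue. The resampling is plain nearest-neighbour and ignores aspect ratio.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// InputSpec is a hypothetical description of the input one model expects.
type InputSpec struct {
	Size  int        // edge length of the square input, in pixels
	Mean  [3]float32 // per-channel mean in R, G, B order, subtracted first
	Scale float32    // factor applied after mean subtraction
	BGR   bool       // true if the model wants blue-green-red channel order
}

// Normalize converts one 8-bit RGB pixel into the three floats described by s.
func (s InputSpec) Normalize(r, g, b uint8) [3]float32 {
	out := [3]float32{
		(float32(r) - s.Mean[0]) * s.Scale,
		(float32(g) - s.Mean[1]) * s.Scale,
		(float32(b) - s.Mean[2]) * s.Scale,
	}
	if s.BGR {
		out[0], out[2] = out[2], out[0] // swap red and blue
	}
	return out
}

// Prepare resamples img to s.Size x s.Size with nearest-neighbour sampling
// (ignoring aspect ratio) and normalizes every pixel.
func (s InputSpec) Prepare(img image.Image) [][][3]float32 {
	b := img.Bounds()
	out := make([][][3]float32, s.Size)
	for y := range out {
		out[y] = make([][3]float32, s.Size)
		srcY := b.Min.Y + y*b.Dy()/s.Size
		for x := range out[y] {
			srcX := b.Min.X + x*b.Dx()/s.Size
			r, g, bl, _ := img.At(srcX, srcY).RGBA() // 16-bit channel values
			out[y][x] = s.Normalize(uint8(r>>8), uint8(g>>8), uint8(bl>>8))
		}
	}
	return out
}

// testImage returns a 2x2 image: red in the top-left pixel, blue elsewhere.
func testImage() image.Image {
	img := image.NewRGBA(image.Rect(0, 0, 2, 2))
	for y := 0; y < 2; y++ {
		for x := 0; x < 2; x++ {
			img.Set(x, y, color.RGBA{B: 255, A: 255})
		}
	}
	img.Set(0, 0, color.RGBA{R: 255, A: 255})
	return img
}

func main() {
	// Placeholder specs for illustration only; real models document their own values.
	scaled := InputSpec{Size: 4, Mean: [3]float32{128, 128, 128}, Scale: 1.0 / 128}
	caffeStyle := InputSpec{Size: 4, Mean: [3]float32{123, 117, 104}, Scale: 1, BGR: true}

	fmt.Println(scaled.Prepare(testImage())[0][0])
	fmt.Println(caffeStyle.Prepare(testImage())[0][0])
}
```

Keeping these values in a small per-model struct is one way to share the resize loop while still letting a separate scene package supply its own preprocessing.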

Acceptance Criteria:

Identify and document freely available TensorFlow models and compare them by capability, performance, and file size (you can use our developer wiki for this: https://github.com/photoprism/photoprism/wiki)
Identify and document working example implementations and helpful GitHub repos, e.g. https://github.com/CSAILVision/places365
If you can code in Go, provide a working proof-of-concept as a unit test (no UI required)

deekerman added the enhancement, help wanted, and in-progress labels

2026-02-19 23:02:57 -05:00

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@lastzero commented on GitHub (Jan 8, 2020):

Moved our image classification code to the new classify package: https://github.com/photoprism/photoprism/commit/e9874d6e0cfb81d2fce4e5be34455848254685d7

This should be easier to test: simply go to the directory and run go test -v.

We'll see if that's a good name... it was the best I could come up with today.

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@lastzero commented on GitHub (Jan 8, 2020):

Updated our docs: https://github.com/photoprism/photoprism/wiki/Image-Classification

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@lastzero commented on GitHub (Jan 16, 2020):

FYI: https://github.com/nic25 is working on this 🚀

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@lastzero commented on GitHub (Jan 16, 2020):

Didn't find related models on TensorFlow Hub (https://tfhub.dev/s?q=places365).

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@tam-wh commented on GitHub (May 23, 2020):

Seems like there's a way to convert the places365 caffemodel to TensorFlow?
https://ndres.me/post/convert-caffe-to-tensorflow/

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@tam-wh commented on GitHub (May 23, 2020):

I successfully converted vgg16_hybrid1365 into a pb file (it took me half a day to get the converter working), and it works well in TensorFlow.NET. I'm now working on the non-hybrid model, as it is more suitable for scene classification.

ResNet152-places365 (does not convert)
VGG16-hybrid1365 (converted successfully)
VGG16-places365 (working on it)

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@tam-wh commented on GitHub (May 23, 2020):

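The file is quite large, ~500 MB. It is now shared on my Nextcloud: VGG16-places365 (https://cloud.oqozi.com/index.php/s/CAkbYKfcL92EAQa). It's quite slow, as my VPS download speed is limited to 150KB. Label files are available in the places365 GitHub repo (https://github.com/CSAILVision/places365/blob/master/categories_places365.txt).

Some information on the model: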

Input operation name = "data";
Output operation name = "prob"

I took a couple of images from the demo (http://places2.csail.mit.edu/demo.html) and ran them through the converted pb file in TensorFlow.NET, with image width and height set to 224px.

48 /b/beach 48, 0.86291456

66 /b/bridge 66, 0.9400765

Seems like conversion works really well
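The lookup behind results such as "/b/beach 48" can be sketched in Go with the standard library alone. This assumes each line of the label file has the form `<label> <index>`, as the two results above suggest. `ParseLabels`, `Prediction`, and `TopK` are hypothetical names, and the probability vector in `main` is made up for illustration; it is not real output of the "prob" operation.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// ParseLabels reads lines such as "/b/beach 48" and returns a map from
// class index to label. Blank lines are skipped.
func ParseLabels(text string) (map[int]string, error) {
	labels := make(map[int]string)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		if len(fields) != 2 {
			return nil, fmt.Errorf("malformed label line %q", sc.Text())
		}
		idx, err := strconv.Atoi(fields[1])
		if err != nil {
			return nil, fmt.Errorf("bad index in %q: %w", sc.Text(), err)
		}
		labels[idx] = fields[0]
	}
	return labels, sc.Err()
}

// Prediction pairs a class index and its label with the model's score.
type Prediction struct {
	Index int
	Label string
	Score float32
}

// TopK returns the k highest-scoring classes in descending score order.
func TopK(probs []float32, labels map[int]string, k int) []Prediction {
	preds := make([]Prediction, 0, len(probs))
	for i, p := range probs {
		preds = append(preds, Prediction{Index: i, Label: labels[i], Score: p})
	}
	sort.SliceStable(preds, func(i, j int) bool { return preds[i].Score > preds[j].Score })
	if k > len(preds) {
		k = len(preds)
	}
	if k < 0 {
		k = 0
	}
	return preds[:k]
}

func main() {
	// Two label lines in the assumed "<label> <index>" format.
	labels, err := ParseLabels("/b/beach 48\n/b/bridge 66\n")
	if err != nil {
		panic(err)
	}

	// Made-up scores standing in for the model's output vector.
	probs := make([]float32, 67)
	probs[48] = 0.8
	probs[66] = 0.1

	for _, p := range TopK(probs, labels, 2) {
		fmt.Println(p.Index, p.Label, p.Score)
	}
}
```

In a real proof-of-concept, `probs` would come from running the converted graph, and the full label file would replace the two-line excerpt.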

deekerman commented

2026-02-19 23:02:58 -05:00

Author

Owner

@Extarys commented on GitHub (Sep 10, 2020):

Not sure if it's possible, but could the label "water" be added? Maybe "boat" for the second picture?

I don't really know how those things work though 😕

deekerman referenced this issue

2026-02-20 00:09:29 -05:00

People: Sort autocomplete list of people based on metadata (when assigning to face) #1273