mirror of
https://github.com/photoprism/photoprism.git
synced 2026-03-02 22:57:18 -05:00
AI: Integrate Florence 2 Vision AI for Auto Caption #2129
Originally created by @david-ng-hk on GitHub (Jun 26, 2024).
Microsoft recently released a new AI model under the MIT license.
It uses vision AI to automatically generate a brief or detailed caption for a photo or video.
I think this model could generate captions that PhotoPrism could then use for search.
Please integrate it. If not every GPU can be supported, start with NVIDIA cards.
Thank you.
Model download:
https://huggingface.co/microsoft/Florence-2-large-ft
Demo:
https://huggingface.co/spaces/SixOpen/Florence-2-large-ft
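For anyone who wants to try the model locally, here is a minimal sketch of brief versus detailed captioning. It follows the usage pattern published on the Florence-2 model card and assumes `torch`, `transformers`, and `Pillow` are installed. The helper names `caption_prompt` and `caption_image` are hypothetical, and nothing here is PhotoPrism code.

```python
# Sketch only: based on the usage shown on the Florence-2 model card.
# caption_prompt() and caption_image() are hypothetical helper names.

# Task prompts the model card lists for the three caption detail levels.
CAPTION_PROMPTS = {
    "brief": "<CAPTION>",
    "detailed": "<DETAILED_CAPTION>",
    "more_detailed": "<MORE_DETAILED_CAPTION>",
}


def caption_prompt(detail: str = "brief") -> str:
    """Map a detail level to the matching Florence-2 task prompt."""
    try:
        return CAPTION_PROMPTS[detail]
    except KeyError:
        raise ValueError(f"unknown detail level: {detail!r}") from None


def caption_image(path: str, detail: str = "brief",
                  model_id: str = "microsoft/Florence-2-large-ft") -> str:
    """Generate a caption for one image file (downloads the model on first use)."""
    # Heavy imports are kept local so the prompt helper works without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Use an NVIDIA GPU when available, otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True
    ).to(device)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    prompt = caption_prompt(detail)
    image = Image.open(path).convert("RGB")
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)

    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=256,
        num_beams=3,
    )
    raw_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

    # The processor returns a dict keyed by the task prompt.
    parsed = processor.post_process_generation(
        raw_text, task=prompt, image_size=(image.width, image.height)
    )
    return parsed[prompt]
```

Calling `caption_image("photo.jpg", detail="detailed")` (with a placeholder file name) should return a single caption string. Loading the model requires `trust_remote_code=True`, so review the repository code before running it.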
@GlassedSilver commented on GitHub (Jun 27, 2024):
Played around with the demo for a bit and what can I say, seems pretty darn good!
MIT license isn't too shabby either.
@graciousgrey commented on GitHub (Dec 2, 2025):
Although there is no support for Florence 2 Vision, you can now use our Ollama integration to generate labels or captions: