mirror of
https://github.com/photoprism/photoprism.git
synced 2026-03-02 22:57:18 -05:00
Config: Add support for PostgreSQL #45
Originally created by @sokoow on GitHub (Oct 24, 2018).
Originally assigned to: @keif888 on GitHub.
Nice idea lads, I totally support it. Have you ever considered switching to Postgres? For the deployment size I'm predicting this is going to have, MySQL might be a bit of a suboptimal choice :)
Details on possible implementation strategies can be found in this comment:
@lastzero commented on GitHub (Oct 24, 2018):
Not right now, but in general: anything to store a few tables will do... as simple and stable as possible... many developers are familiar with mysql, so that's my default when I start a new project. Tooling is also good.
sqlite is a very lean option, but obviously - if you run multiple processes or want to directly access / backup your data - it doesn't scale well or at all.
@lastzero commented on GitHub (Nov 16, 2018):
It became clear that we have to build a single binary for distribution to reach broad adoption. Differences between SQL dialects are too large to have them abstracted away by our current ORM library, for example when doing date range queries. They are already different between MySQL and sqlite.
For those reasons we will not implement Postgres support for our MVP / first release. If you have time & energy, you are welcome to help us. I will close this issue for now, we can revisit it later when there is time and enough people want this 👍
@sokoow commented on GitHub (Nov 17, 2018):
ok fair point - I was raising this because the cost of maintenance and troubleshooting at scale is much lower with postgres, and lots of successful projects have this support. so, from what you wrote about the differences, it seems that you don't have pluggable ORM-like generic read/write storage methods just yet, right?
@lastzero commented on GitHub (Nov 17, 2018):
@sokoow We do use GORM, but it doesn't help with search queries that use database specific SQL.
If you like to dive into the subject, `DATEDIFF` is a great example: MySQL and SQL Server use `DATEDIFF()`, Postgres seems to prefer `DATE_PART()`, whereas SQLite only has `julianday()`. It goes even deeper when you look into how tables are organized. You can't abstract and optimize at the same time. We want to provide the best performance to our users.
See
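As a sketch of the dialect problem described above, here is a minimal Go example (a hypothetical helper with illustrative dialect names, not PhotoPrism's actual code) that renders a day-difference expression per database:

```go
package main

import "fmt"

// dateDiffDays returns an SQL expression for the number of days between two
// column expressions, per database dialect. Purely illustrative: the helper
// name and dialect strings are assumptions, not PhotoPrism identifiers.
func dateDiffDays(dialect, a, b string) string {
	switch dialect {
	case "postgres":
		// Postgres subtracts timestamps into an interval first.
		return fmt.Sprintf("DATE_PART('day', %s - %s)", a, b)
	case "sqlite3":
		// SQLite has no DATEDIFF; julianday() returns fractional days.
		return fmt.Sprintf("CAST(julianday(%s) - julianday(%s) AS INTEGER)", a, b)
	default: // mysql and other DATEDIFF-style dialects
		return fmt.Sprintf("DATEDIFF(%s, %s)", a, b)
	}
}

func main() {
	for _, d := range []string{"mysql", "postgres", "sqlite3"} {
		fmt.Println(d+":", dateDiffDays(d, "taken_at", "created_at"))
	}
}
```

Every such query needs one branch per supported database, which is the maintenance cost being weighed in this thread.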
@sokoow commented on GitHub (Nov 17, 2018):
No that's a fair point, you're not the first project that has this challenge - something to think about on higher abstraction level.
@LeKovr commented on GitHub (Nov 17, 2018):
I guess it won't be so hard, so I would try
@lastzero commented on GitHub (Nov 17, 2018):
Getting it to work somehow at a single point in time is not hard, getting it to work with decent performance, finding developers who are comfortable with it and constantly maintaining the code is incredibly hard.
Keep in mind: You also need to maintain continuous integration infrastructure and effectively run all tests with every database.
@LeKovr commented on GitHub (Nov 20, 2018):
Of course, tests might be the same for every supported database, and this might be solved within #60.
Also, sqlite support will probably entail some architectural changes (like search using Bleve and db-driver-dependent SQL queries). It won't be hard to add PostgreSQL support after that. And maybe you'll find "developers who are comfortable with it" by that time.
@lastzero commented on GitHub (Nov 20, 2018):
@LeKovr Did you see how we renamed the label from "rejected" to "descoped"? 😉 Yes indeed, later it might be great to add support for additional databases, if users actually need it in practice. Maybe everyone will be happy with an embedded database if we do it well. It is hard to predict.
What I meant was that if you change some code that involves SQL you might feel uncomfortable because you only have experience with one database, so you end up doing nothing. And that can be very dangerous for a project.
@LeKovr commented on GitHub (Nov 20, 2018):
@lastzero, You are right. Maybe later. There are more important things to do for now.
@bobobo1618 commented on GitHub (Jul 9, 2020):
I had a quick look and it looks like the queries at least are trivial to add. The biggest problem is the models. The `varbinary` and `datetime` types are hard-coded into the models but don't exist in PostgreSQL, so the migration fails. I'm not sure what the solution is here. I'd guess that the solution is to use the types Gorm expects (e.g. `[]byte` instead of `string` when you want a column filled with bytes), but there's probably a good reason why it wasn't done that way to start with. I'll play with it some more and see. It'd be nice to put everything in my PostgreSQL DB instead of SQLite.
@LeKovr commented on GitHub (Jul 9, 2020):
maybe `create domain varbinary ...` may help
@bobobo1618 commented on GitHub (Jul 9, 2020):
All of the `varbinary` columns have different lengths and seem to have different purposes, so I don't think that'll help, unfortunately.
@lastzero commented on GitHub (Jul 9, 2020):
Yes, we use binary for plain ASCII, especially when strings need to be sorted, indexed or compared and should not be normalized in any way.
@bobobo1618 commented on GitHub (Jul 9, 2020):
Shouldn't that be the case by default for string fields? I know MySQL does some stupid stuff with character encodings but it shouldn't modify plain ASCII, right?
@lastzero commented on GitHub (Jul 9, 2020):
But it uses 4 BYTES per ASCII character, so the index becomes very big. Also, when you compare strings, it's somewhat more complex with Unicode than just comparing bytes. I'm aware you can PROBABLY do the same with VARCHAR with the right settings and enough time to test, but it was hard to see business value in such experiments.
@bobobo1618 commented on GitHub (Jul 9, 2020):
As far as I can tell looking at the SQLite docs, the MySQL docs, and the PostgreSQL docs, that isn't the case at all. A `varchar` uses a 1-4 byte prefix depending on the size of the field, but each byte of payload consumes one byte of storage. But we're not storing Unicode, we're storing ASCII in a field that could contain Unicode. I don't think any of those edge cases apply here.
Fair enough.
Also, queries aren't so straightforward after all. The queries extensively use `0` and `1` instead of `false` and `true`, which isn't supported by PostgreSQL (and, as a side note, makes the queries more difficult to read, since you don't know whether a boolean or an integer comparison is intended). I managed to do a little bit of cleanup of that and got something working at least.
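A sketch of the kind of cleanup this implies, in Go (a hypothetical helper, not from the PhotoPrism codebase): rendering boolean literals per dialect instead of hard-coding 0/1 in query strings.

```go
package main

import "fmt"

// boolSQL renders a boolean literal for the given dialect. MySQL/MariaDB and
// SQLite accept 1/0 for booleans, while PostgreSQL's boolean columns require
// TRUE/FALSE. The dialect names are illustrative assumptions.
func boolSQL(dialect string, v bool) string {
	switch dialect {
	case "postgres":
		if v {
			return "TRUE"
		}
		return "FALSE"
	default: // mysql, sqlite3
		if v {
			return "1"
		}
		return "0"
	}
}

func main() {
	fmt.Printf("WHERE photo_private = %s\n", boolSQL("postgres", false))
	fmt.Printf("WHERE photo_private = %s\n", boolSQL("mysql", false))
}
```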
@lastzero commented on GitHub (Jul 9, 2020):
Not in the index, check again. Maybe also not in memory when comparing.
@bobobo1618 commented on GitHub (Jul 9, 2020):
I couldn't find documentation so I just ran a quick test to see.
Which resulted in 79.3k of actual data:
I analyzed it with `sqlite3_analyzer`.
Table:
Index:
So for 79288 bytes of actual data sitting in the column, we have 109288 bytes total for the data itself (1.38 bytes per byte) and 129160 for the index (1.63 bytes per byte).
I repeated the test with
varbinary(32)instead ofvarchar(32)and got precisely the same result, down to the exact number of bytes.So I don't see any evidence that a
varcharconsumes more space in an index than avarbinary.@lastzero commented on GitHub (Jul 9, 2020):
You'll find some information on this page: https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-conversion.html
You might also want to read this and related RFCs: https://en.wikipedia.org/wiki/Comparison_of_Unicode_encodings
Note that Microsoft, as far as I know, still uses UCS-2 instead of UTF-8 in Windows, for all the reasons I mentioned. Maybe they switched to UTF-16. Their Linux database driver for SQL Server used null-terminated strings; guess how well that works with UCS-2. Not at all.
For MySQL, we use 4-byte UTF-8, which needs 4 bytes in indexes unless somebody completely refactored InnoDB in the meantime. Note that the MySQL manual was wrong on InnoDB for a long time, insisting that MySQL doesn't know or support index-organized tables, while InnoDB ONLY uses index-organized tables.
When you're done with this, enjoy learning about the four Unicode normalization forms: https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization
Did you know there's a difference between Linux and OS X? Apple uses the decomposed form, so you need to convert all strings when copying files. Their bundled command line tools were not compiled with iconv support, so you had to compile them yourself. Some of this is still not fixed today.
@lastzero commented on GitHub (Jul 9, 2020):
Note that SQLite ignores VARBINARY and probably also VARCHAR to some degree. It uses dynamic typing. That's why all string keys are prefixed with at least one non-numeric character. It would convert the value to INT otherwise, and comparisons with binary data or strings would fail:
See https://www.sqlite.org/datatype3.html
@bobobo1618 commented on GitHub (Jul 9, 2020):
I'm aware of Unicode encodings and some of the important differences between them. I still don't see anything in the docs indicating that using a `varchar` containing ASCII will consume 4 bytes in an index, but I'll take your word for it. To be clear, in case there's some miscommunication going on, my assumption is that even if the column is switched to `varchar`, plain ASCII (i.e. the first 128 Unicode code points, which are all encoded with 8 bits) will still be stored in it. That being the case, 1 character = 1 byte and comparisons are bog-standard string comparisons. In other news, here's a PoC of PostgreSQL mostly working. It's intended as an overview of the work that needs to be done, not as a serious proposal.
@bobobo1618 commented on GitHub (Jul 9, 2020):
Actually, on `string` vs. `[]byte`: it occurred to me, if you only want to store ASCII here and don't want to treat this as something that's semantically a string, is it a bad thing to use a `[]byte` type? Is it the hassle of converting to/from strings when dealing with other APIs that's off-putting? With `[]byte`, gorm will choose an appropriate type for each DB by default.
@bobobo1618 commented on GitHub (Jul 9, 2020):
Ah, looks like the `string` vs. `[]byte` question is mostly solved by gorm v2 anyhow. You'll just be able to put `type:bytes` in the tag and it'll handle it for you.
@lastzero commented on GitHub (Jul 9, 2020):
See https://mathiasbynens.be/notes/mysql-utf8mb4
Maybe we can switch to []byte in Go. Let's revisit this later; there are a ton of items on our to-do list with higher priority, and that's by far not the only change we need to support other databases.
Edit: As you can see in the code, I already implemented basic support for multiple dialects when we added Sqlite. For Postgres there's more to consider, especially data types. Sqlite pretty much doesn't care. Bool columns and date functions might also need attention. I'm fully aware Postgres is very popular in the GIS community, so it will be worth adding when we have the resources needed to implement and maintain it (see next comment).
@lastzero commented on GitHub (Jul 9, 2020):
We also need to consider the impact on testing and continuous integration when adding support for additional storage engines and APIs. That's often underestimated and causes permanent overhead. From a contributor's perspective, it might just be a one time pull request. Anyhow, we value your efforts and feedback! Just so that you see why we're cautious.
@skorokithakis commented on GitHub (Dec 31, 2020):
Would you consider leaving this issue open so people can 👍 it? I almost didn't find it.
@lastzero commented on GitHub (Dec 31, 2020):
Note that only Gorm 2 (aka 1.2) supports compatible general types. I already tried upgrading, but it turned out to be extremely time-consuming and tedious due to many changes. We decided to release earlier and without bugs rather than go for Postgres support in our first release.
If you look at our public roadmap, you'll notice that there are a ton of important feature requests that deliver value to regular users. So don't expect us to completely refactor our storage layer in the next few months 👍
If we get it done earlier, that's good... but no commitment at this point. As this is the core of our app, we can also not "just" merge a pull request. There are too many edge cases to keep in mind.
@skorokithakis commented on GitHub (Dec 31, 2020):
Fair enough. Most of my friends use Postgres (as do I), and I don't think I know anyone who prefers MariaDB, but SQLite is a good second choice so I just went with that.
@lastzero commented on GitHub (Dec 31, 2020):
We initially started with a built-in TiDB server. When that caused issues, we simply stuck with MySQL-compatible databases. SQLite is not that much different, and ignores most data types anyway. MariaDB works well for what we do. Migrating from a more to a less powerful DBMS is much more difficult when your app depends on all the features. Note that SQLite is much slower than MariaDB in every regard, especially on powerful servers with many logical cores, due to file-based locking. We don't want to get any bad performance reviews! 😉
@skorokithakis commented on GitHub (Dec 31, 2020):
Haha, don't worry, my server is plenty slow already so that's what I blame :P It's just too bad that using an ORM compatible with all three RDBMSes is hard to do at this point, but I agree, features are more important.
@G2G2G2G commented on GitHub (Jun 6, 2021):
@bobobo1618 datetime has existed in postgresql for ~10 years and varbinary = bit datatype
...sqlite has no locking for reads, only writes. Scales very well for multi-user, many core, read heavy systems. Just not writing. Also all datatypes in sqlite are treated the same. How it is inserted is how the specific "cell" treats the data, this is all over the docs.
@bobobo1618 commented on GitHub (Jun 6, 2021):
I think you misunderstood. Of course PostgreSQL has equivalent types (`timestamp` for `datetime`, `bit varying(n)` for `varbinary`); the problem is that PhotoPrism hard-codes the specific names `datetime` and `varbinary`, which are not understood by PostgreSQL.
@LeKovr commented on GitHub (Jun 6, 2021):
?
@myxor commented on GitHub (Jan 19, 2022):
Is there any news on PostgreSQL support?
@graciousgrey commented on GitHub (Jan 19, 2022):
No, we are currently working on multi-user support, which is really an epic.
You can find a list of upcoming features on our roadmap: https://github.com/photoprism/photoprism/projects/5
@davralin commented on GitHub (Jan 19, 2022):
Not sure if there's a place to mention it - or if it's really a new issue - but how to migrate between databases would also be nice in addition to "just" supporting postgresql.
@lastzero commented on GitHub (Jan 19, 2022):
@francisco1844 commented on GitHub (Mar 24, 2022):
Is there a place where people can put money towards a particular feature? I think that would help to see how much existing/future users value a particular feature. Also, for many people it may be more appealing to contribute towards a specific feature than to just make a donation and hope that the feature they need will eventually make it.
@francisco1844 commented on GitHub (Mar 24, 2022):
I don't see PostgreSQL in the roadmap - or is it under a generic name for other DB support?
@graciousgrey commented on GitHub (Mar 24, 2022):
Here you find an overview of our current funding options. Sponsors in higher tiers can give golden sponsor labels to features.
Not anymore. While we like IssueHunt and are grateful for the donations we've received so far, it hasn't proven to be a sustainable funding option for us as we spend much of our time maintaining existing features and providing support.
If we don't have enough resources to provide support and bugfixes, we can't start working on new features.
@lastzero commented on GitHub (Mar 26, 2022):
That's because we plan to support PostgreSQL anyway, ideally when there is less pressure to release new features than right now. We can't perform a major backend/database refactoring while pushing huge new features like multi-user support.
@dradux commented on GitHub (Jun 3, 2022):
I would love to see postgres support! I'll contribute time, talent, and/or treasure.
@vyruss commented on GitHub (Mar 23, 2023):
I can also contribute Postgres knowledge & time.
@pashagolub commented on GitHub (May 3, 2023):
I can help you with PostgreSQL support.
@lastzero commented on GitHub (May 30, 2023):
@pashagolub My apologies for not getting back to you sooner! We had to focus all our resources on the release and then needed a break. Any help with adding PostgreSQL is, of course, much appreciated. There are two basic strategies:
Should you decide to tackle this, I'm happy to help and give advice to the best of my ability. Also, if you have any personal questions, feel free to contact me directly via email so as to avoid notifying all issue subscribers on GitHub about a new comment.
@Tragen commented on GitHub (May 30, 2023):
The strategy should be doing 1 and then 2. ;)
But after 1 there is often no reason for 2.
@lastzero commented on GitHub (May 30, 2023):
I often wish we had a more compatible, intuitive database abstraction layer. But compared to important core features that are still missing, like batch editing, this is not a big pain point at the moment and therefore not a top priority.
@pashagolub commented on GitHub (May 30, 2023):
Would you please name it? :) It's hard to find the name in the .mod file without actually knowing it :-)
Speaking about `go.mod`... I was surprised to see the `lib/pq` dependency. :-D
@lastzero commented on GitHub (May 30, 2023):
We currently use GORM v1, I was assuming this is mentioned/discussed in the comments above: https://v1.gorm.io/docs/
@pashagolub commented on GitHub (May 30, 2023):
Sorry. Missed that
@rustygreen commented on GitHub (Sep 8, 2023):
Any update on when we can expect PostgreSQL support?
@lastzero commented on GitHub (Sep 9, 2023):
We had several contributors who wanted to work on this. However, there is no pull request for it yet and so I can't tell you anything about the progress.
@fl0wm0ti0n commented on GitHub (Sep 30, 2023):
Any news on Postgres support?
@ezra-varady commented on GitHub (Nov 2, 2023):
Are there any contributors working on this atm? My team is interested in this feature, and I might be able to contribute some time
@lastzero commented on GitHub (Nov 4, 2023):
@ezra-varady We appreciate any help we can get! To reiterate what I wrote above, there are two possible strategies:
the `internal/config` package, though it should be tested as a proof-of-concept before you invest a lot of time.
Due to the higher chances of success (and because it doesn't block us from upgrading later), I would personally recommend going for (1), i.e. adding (a) manual migrations (for the initial setup of the database schema in the first step) and (b) hand-written SQL for custom queries for which the ORM is not used, for example:
github.com/photoprism/photoprism@539e18d984/internal/query/covers.go (L27-L49)
Should you decide to tackle this, we will be happy to help and provide advice to the best of our ability. You are also welcome to contact us via email or chat if you have general questions that don't need to be documented as a public issue comment on GitHub.
@vnnv commented on GitHub (Nov 24, 2023):
@lastzero did you consider the option of removing GORM entirely and replacing it with something else? Perhaps a lightweight lib for DB access, something similar to github.com/jmoiron/sqlx?
@stavros-k commented on GitHub (Nov 24, 2023):
Maybe https://entgo.io/ or https://bun.uptrace.dev/
@pashagolub commented on GitHub (Nov 24, 2023):
I think `pgx` is enough for most of the functionality. But again, if we want to be able to talk to different databases, we should come up with some kind of database engine. And an ORM is not the best choice, because the problem is not in the relation-object mapping but in the logic behind it.
@lastzero commented on GitHub (Nov 24, 2023):
@vnnv @stavros-k @pashagolub Yes, of course we have also considered switching to a completely different library... There are many more choices now than when we started the project.
That said, some kind of abstraction seems necessary if we want to support multiple dialects with the resources we have. Also, I think it's a good idea to cover simple standard use cases instead of creating every single SQL query manually.
Either way, the amount of work required to switch to a different library would be even greater than what I described in my comment above as 2.: https://github.com/photoprism/photoprism/issues/47#issuecomment-1793392875
Even for 1. and 2. it seems extremely difficult to find contributors with the time and experience required, and my personal time is very limited due to the amount of support and feature requests we receive.
@stavros-k commented on GitHub (Nov 24, 2023):
What kind of abstraction are you looking for? I saw that it's regarding column types.
Do you have columns whose type you don't know beforehand?
If you do know it beforehand but it's "changing" frequently, Ent might be a better option, as you can extend the generated code with some Go templates. As for migrations, I would look into Atlas.
That being said, I've just been subscribed to this issue for a long time and thought I'd share what I found in my recent search for a DB lib, as I was looking to start a mini side project.
I wish I had the experience to help with it.
@ai2ys commented on GitHub (Jun 18, 2024):
PostgreSQL support would be great, as it offers support for JSON.
@satonotdead commented on GitHub (Jun 19, 2024):
Is that already implemented? There are postgres docker-compose.yml example files on the main repo.
@lastzero commented on GitHub (Jun 24, 2024):
Adding PostgreSQL support unfortunately depends on the help of contributors, as many users are waiting for OpenID Connect support and improved facial recognition, which I need to take care of. So there is no way I can spend several weeks working on other issues or even half the time to e.g. get a proof-of-concept ready for release:
Keep in mind that while we always do our best to give general advice and feedback (which is not easy considering our workload), refactoring the "associations" and "preload" functionality for the migration from Gorm v1 to v2 seems to require quite a bit of time to experiment and come up with a good solution. So some advice from us and a few tests will probably not be enough to get it done.
For alternative solutions that may not require an upgrade to Gorm v2, see the following comment:
If you have questions, please feel free to ask in our Contributors and/or Community Chat!
@alan-cugler commented on GitHub (Jan 27, 2025):
Good afternoon! I see PR #4560 is still marked as a draft, with lots of progress made back in October. Did the PhotoPrism team feel significant progress was made by that outside help on PR #4560 to migrate to GORM v2? I was pretty excited by the effort but wasn't sure if it should be taken seriously, since it was still marked as "help wanted" on the project roadmap.
Just looking for an update, even if the update is nothing new. Cheers!
@keif888 commented on GitHub (Feb 24, 2025):
PR #4560 has now made it out of draft, and is hopefully bug-free.
To support PostgreSQL will require a docker configuration for PostgreSQL, configuration files for PostgreSQL, and some SQL changes within PhotoPrism where SQL incompatibilities between vendors occur.
I have never configured a docker container, or used PostgreSQL, so it will be a learning curve to get out of the way before I can work on integrating PhotoPrism with PostgreSQL.
@pashagolub commented on GitHub (Feb 25, 2025):
Thanks for your hard work @keif888. Would it be enough if I tuned this compose.postgres.yaml?
@keif888 commented on GitHub (Feb 25, 2025):
I have got one up and going in my branch for PostgreSQL.
https://github.com/keif888/photoprism/tree/PostgreSQL
Now to get Keycloak to use the correct schema on startup instead of `public`.
@pashagolub commented on GitHub (Feb 26, 2025):
@keif888 I can help you with that. What is Keycloak, and what schema and role should it operate under in the database?
@lastzero commented on GitHub (Feb 26, 2025):
Keycloak is a third-party service for testing authentication with OpenID Connect, so it should not need to be migrated to PostgreSQL (it is used through a web API).
Note that we have already started working on a Docker Compose configuration for testing and development with PostgreSQL as index database:
@pashagolub commented on GitHub (Feb 26, 2025):
Yes, that's why I was surprised we're talking about a Keycloak schema if we're supposed to run it as a separate service and just provide the proper database name for it to operate on.
@keif888 commented on GitHub (Feb 26, 2025):
I am working on the assumption that everything PhotoPrism's compose.yaml does with MariaDB has to be done in PostgreSQL. So I updated the compose.postgres.yaml to be the "same" as the compose.yaml, i.e. it has all the same services.
I have Keycloak working in PostgreSQL now, same as it does in MariaDB.
My init script for PostgreSQL was incorrect, which I have corrected.
Current status is:
PhotoPrism starts, you can upload a photo, and it is uploaded. But after that you can't do much as a number of the SQL statements are failing as MariaDB syntax doesn't work well with PostgreSQL.
Failing unit tests are documented here:
https://github.com/keif888/photoprism/blob/PostgreSQL/teststatus.md
@lastzero commented on GitHub (Feb 26, 2025):
If it's custom SQL from us, then you should be able to find most of the related code (that needs to be extended) by searching for `switch DbDialect()`, for example:
github.com/photoprism/photoprism@4a4e45eb59/internal/entity/query/covers.go (L27-L49)
@keif888 commented on GitHub (Feb 26, 2025):
I've just fixed that one (just committed it) by rewriting the SQL so that it is ANSI standard and works against all three DBMSs.
@lastzero commented on GitHub (Feb 26, 2025):
@keif888 I appreciate your work, but please be careful with this: If there are different queries, these MAY do different things depending on the database capabilities, e.g. select the first image vs. select the image with the highest resolution or quality. Also, it's possible that the "compatible" query for SQLite may not perform well on MariaDB. So even if it works, there could be unwanted side effects.
@keif888 commented on GitHub (Feb 26, 2025):
@lastzero I have reverted to individual SQL statements per DBMS to avoid the unintended consequences, and will continue to maintain that where it already exists.
Fortunately I had only completed 4 with consolidated SQL statements. The 5th one required separate statements as PostgreSQL can NOT do a MAX on a bytea column.
@lastzero commented on GitHub (Feb 26, 2025):
@keif888 If SQLite works with the same standard query as PostgreSQL, feel free to combine those to avoid duplicate code:
@keif888 commented on GitHub (Feb 26, 2025):
I have a blocking issue, and my knowledge of PostgreSQL consists of what I have read in the documentation in the last couple of days.
MariaDB for PhotoPrism is using specific character sets and collations.
They are not deterministic, and are case insensitive.
The default collations for PostgreSQL are all deterministic.
This is causing some queries to fail.
MariaDB startup specifies the following two settings in the startup command for dealing with character strings:
Can someone please let me know how to setup the equivalent in PostgreSQL?
@pashagolub commented on GitHub (Feb 26, 2025):
@keif888 do you need these params for a new database (or databases) or for the whole instance (all databases)?
We can specify default params for the whole instance, so every database created will inherit them. Or we can control those settings per database.
Where can I find the startup command for MariaDB, so I can guess the best choice?
@pashagolub commented on GitHub (Feb 26, 2025):
Postgres can do MAX, but either we need to type-cast explicitly or we need to create a custom aggregate.
But I feel something is terribly wrong if we're trying to get the max of a bytea.
@keif888 commented on GitHub (Feb 26, 2025):
@pashagolub the MariaDB Docker startup is here:
The max of a bytea I solved as per this: `MAX(convert_from(m.thumb, 'UTF8'))`
My understanding is that MAX of a bytea is required because PhotoPrism was developed needing both case insensitive and case sensitive matching. And m.thumb (above) is an example of a case sensitive match. I used convert_from to return the value back to a string before doing the MAX so as to match what SQLite and MariaDB are doing. (I do need to retest this to make sure that it really is achieving the same result).
Where case sensitive is needed the developers used VARBINARY(), and where case insensitive is needed they used VARCHAR().
See where MarkerName is VARCHAR, and Thumb is VARBINARY.
Gorm V1:
`gorm:"type:VARCHAR(160);"` and `gorm:"type:VARBINARY(128);index;default:'';"`
Gorm V2:
`gorm:"size:160;"` and `gorm:"type:bytes;size:128;index;default:'';"`
@lastzero commented on GitHub (Feb 27, 2025):
@keif888 Is this the query you are having trouble with?
github.com/photoprism/photoprism@4a4e45eb59/internal/entity/query/covers.go (L269-L283)
If so, don't worry about the `MAX()`, because from what I can see/remember, it's just a way to make sure that the `markers.thumb` value assigned to `subject.thumb` is deterministic and not empty (so that the thumb doesn't break or change all the time).
Given enough time to think about it, there are probably other (better) ways to solve this problem: as long as the query is deterministic and no empty value is set as `thumb`, it doesn't seem to matter too much how the `thumb` is selected.
It is planned (and strongly requested by our users) that the cover images can be set manually from the UI (so the queries to set them automatically will become less important), see https://github.com/photoprism/photoprism/issues/383.
You may find similar patterns elsewhere and are welcome to suggest improvements or solve the same problem in a different way for PostgreSQL/SQLite! In this case, please add a code comment so we can find the relevant queries later to check them and also use them for MariaDB if possible.
@keif888 commented on GitHub (Feb 27, 2025):
@lastzero Yes that was the one I was looking at. That one was easy to make work as it does in the other DBMS'.
The harder one was searchPhotos, as the way that MariaDB and SQLite handle GROUP BY is not the same as in PostgreSQL. That is working now, hopefully in a way that doesn't add maintenance nightmares.
I am working to get PostgreSQL working the same way that MariaDB does, before trying to refactor the way that existing SQL statements work.
There are quite a few differences between the SQL dialects of the three DBMS', which make the complex queries difficult.
BTW: I had to change the PostgreSQL version that I had chosen from 17-alpine to 16-alpine, as the PhotoPrism container's Ubuntu base ships postgresql-client 16.6 (package version 16.6-0ubuntu0.24.10.1), and that was preventing backup and restore from working.
@keif888 commented on GitHub (Feb 28, 2025):
And now another nasty issue.
After many hours I haven't found a way around this one, yet...
Gorm is returning timestamps with time.Local instead of time.UTC.
I have the server running as UTC, the connection string with TimeZone=UTC.
The times are added to the server correctly.
Just when Go gets them back, they have the wrong timezone attached.
There is a fix for this in the pgx driver, which Gorm is using, but Gorm is unable to utilise that fix from what I can discover.
The fix is to add code similar to this. See also comment here.
BUT, that has to be done if you are using pgx directly, which Gorm doesn't.
It uses it via database/sql or via pgx's stdlib, and neither of those allow access to TypeMap().
There is a possibility that I can do something similar to this Gorm test, but it's using database/sql, and I need to use pgx directly.
I have tried changing the server's timezone, the database's timezone, and the connection string's timezone. None of these change the returned value (always Local). PostgreSQL is working as designed.
There is a similar issue on Gorm here. It's open, but I think it has two issues confused. The 1st is an incorrect timezone in the connection string, so PostgreSQL was changing the timestamp on the way in; the 2nd is the one we have, with pgx marking it as Local.
As an FYI: The test works if I add a .UTC() as shown below (and as per a comment in the issue), but there is no way that is an acceptable solution.
@keif888 commented on GitHub (Feb 28, 2025):
I have raised an issue against gorm for the timestamp location <> UTC.
I replicated it in the go-gorm playground.
https://github.com/go-gorm/gorm/issues/7377
@keif888 commented on GitHub (Mar 1, 2025):
Good News
I worked out how to get a pgxpool into Gorm, so I have the timestamptz working as UTC now.
I've included the workaround in the issue noted above, and added it to my branch. That makes the internal/entity tests all pass now.
147 fixed tests done.
Only have the collation issue to solve now, as I'm assuming that will fix the 26 api tests that failed (crossed fingers)
123 failed tests to go.
Bad News
Collation in PostgreSQL can be set at a server or database level, but... only if it is deterministic. And we need DETERMINISTIC=false, which can only be done by creating a collation within a database and applying that to table columns/indexes.
These don't work
BUT, it works in deterministic fashion, so Aaa <> AAA.
Ditto for:
This is what the collation string means:
- und-u <-- Unicode
- ks <-- case sensitivity strength
- level2 <-- case insensitive, accent sensitive
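Under those constraints, the DDL would look roughly like this (a sketch; the collation, table, and column names are illustrative, not PhotoPrism's actual schema):

```sql
-- A nondeterministic, case-insensitive ICU collation (PostgreSQL 12+).
CREATE COLLATION case_insensitive (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

-- Applied per column, since it cannot be set database-wide:
ALTER TABLE subjects
    ALTER COLUMN subj_name TYPE VARCHAR(160) COLLATE case_insensitive;

-- Note: nondeterministic collations restrict some operations
-- (e.g. LIKE is unsupported with them in older PostgreSQL versions).
```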
This works, but...
The only way, and I've read the documentation now, is to create a database, create a collation, and then add the COLLATION to every varchar column in the database.
The problem is that Gorm doesn't have a way to do that only for PostgreSQL. Unless I modify gorm.io/driver/postgres to have an option to add the collations when migrating. And that's a lot of reflection code that does my head in when I try and work on it.
And what I am doing about it
The other option is to change every varchar-based equality/like check to be lower cased.
eg. Any query that does a like or = against structs like this, which are varchar in the databases:
I am working down that path now, as I can't see modifying the driver being a simple change.
But, it is making the code base messy for want of a better term.
eg.
github.com/keif888/photoprism@7867b5f1ac

And I'm concerned that I will miss some if they are buried in First and Find etc.
We will see how the unit testing ends up.
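The lower-casing approach, reduced to plain Go (matchName is a hypothetical helper; the database-side version wraps both operands in LOWER()):

```go
package main

import (
	"fmt"
	"strings"
)

// matchName compares two strings case-insensitively, the way a query
// rewritten as LOWER(col) = LOWER(?) would on the database side.
func matchName(stored, query string) bool {
	return strings.ToLower(stored) == strings.ToLower(query)
}

func main() {
	fmt.Println(matchName("Flower", "flower")) // true without a CI collation
}
```

In Gorm terms this corresponds to something like db.Where("LOWER(subj_name) = LOWER(?)", name), where subj_name is an assumed column name.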
@lastzero commented on GitHub (Mar 1, 2025):
@keif888 Leaving aside case-sensitivity, Unicode is inherently "non-deterministic" when it comes to comparisons, since there are (often many) different encodings for the same characters:
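For instance, precomposed and decomposed forms of the same character differ bytewise, which is exactly what a Unicode-aware collation has to reconcile (a minimal stdlib sketch, added for illustration):

```go
package main

import "fmt"

func main() {
	nfc := "\u00e9"  // "é" as one precomposed code point (NFC)
	nfd := "e\u0301" // "e" plus a combining acute accent (NFD)

	fmt.Println(nfc == nfd)         // false: different byte sequences
	fmt.Println(len(nfc), len(nfd)) // 2 3: even the byte lengths differ
}
```

A byte comparison treats these as different strings, while an ICU collation compares them as equal.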
The utf8mb4_unicode_ci collation goes a step further and also supports expansions, i.e. when a character is compared as equal to combinations of other characters. But that's nice to have, and if it's slow or complicated with PostgreSQL, it's certainly something our users can live with.

@keif888 commented on GitHub (Mar 2, 2025):
Status Report
I have created a draft pull request as all unit tests now pass, and PhotoPrism starts and seems to work without errors and missing photos.
PhotoPrism starts, and allows adding photos. Rescanning finds and adds photos. Search works (flower and Flower both find the same set of flowers, for example).
Database encoding is UTF8.
Collation is the out-of-the-box default, which in my case was en_US.utf8, from memory.
To Do:
- Find out how to get psql into docker without having to run the command manually every time I restart the docker container
- Investigate inconsistencies
- Continue investigation of SQL errors (unique key, foreign key violations) generated from unit tests to ensure they are all from FirstOrCreate functionality, or deliberate error condition testing
- Create performance data creation functions for PostgreSQL
- Run performance tests
- Rerun all unit tests against all three DBMS' just in case

@pashagolub commented on GitHub (Mar 2, 2025):
What do you mean by this? psql is the part of Postgres, so it's the part of Docker container. What command do you need to run manually? Why do you need to run that command after container restart?
@keif888 commented on GitHub (Mar 2, 2025):
FYI: I have fixed the issue around postgresql-client being missing. There is a need to ensure that it's included in the base photoprism:develop image though.
@pashagolub
I used the wrong terms above regarding psql and containers.
To clear up any confusion.
PhotoPrism development has a number of services that are initiated from a compose.yaml file.
The photoprism app within the photoprism service has to be able to communicate with the postgres service via command line tools for backup and restore.
Some make commands (within make terminal) which run within the photoprism service need the psql command.
Specific commands needed are:
I had a much larger post following the above, with links to everything I had done, and realised that there was an option to call specific make file targets via the PHOTOPRISM_INIT environment setting for photoprism.
So I have updated the compose.postgres.yaml adding postgresql in the list of items to init, and after rebuilding the photoprism service, it is all working.
For those interested:
I had to run the following:
The docker compose build is needed to ensure that the updated makefile and scripts are included in the service, and on restart I saw the following:
@pashagolub commented on GitHub (Mar 3, 2025):
Sorry. I'm trying to catch up but you're too fast for me. :) Would you please elaborate on this? Because I don't see Ubuntu 16 here
@pashagolub commented on GitHub (Mar 3, 2025):
Another thing I want to emphasize: it's better to install packages from the Postgres repositories rather than the system-shipped ones, e.g.
@keif888 commented on GitHub (Mar 3, 2025):
Hi,
I have the output of the os version and psql version from my photoprism service below.
OS Version
psql version
Ubuntu includes postgresql-client in its package list (I'm probably mangling the reality here, I'm not a Linux expert), but it's the 16.6 version, and that cannot work with the 17-alpine postgres service for backup and restore, as it's a lower version.
So I chose to use 16-alpine instead. It's still a supported version.
When the Ubuntu version used for the photoprism service is updated, then (assuming that Ubuntu has updated their postgresql version) the compose.postgres.yaml file can be updated to 17-alpine.
@pashagolub commented on GitHub (Mar 3, 2025):
I believe we don't need to rely on any packages shipped with the OS. We are in charge of what gets used and how. That said, it's simple to specify the exact version of the client we want to have.
@keif888 commented on GitHub (Mar 3, 2025):
Yes, and that part I would leave to people that know what they are doing. I know that you have to deal with keys to allow apt to reference other repositories, but then it all starts getting a bit fuzzy. I am in no way an expert on linux or postgresql.
@lastzero commented on GitHub (Mar 3, 2025):
There should already be an install script in /scripts/dist for MariaDB binaries, so you could also add one for PostgreSQL with a fallback to the default system packages.
@keif888 commented on GitHub (Mar 3, 2025):
Preliminary benchmarks of sqlite vs postgres, mariadb vs postgres, and sqlite vs mariadb.
I created a database of 100k randomly generated photos for each of the DBMS', and then executed some benchmarks against it.
All databases are managed by gorm, so they have the same table, foreign key and index structures.
There is no tuning of the postgres service (I haven't read up on how to do that yet).
Overall postgresql is faster than sqlite, and slower than mariadb.
@keif888 commented on GitHub (Mar 3, 2025):
I cloned that to create one for PostgreSQL.
I have updated it to get the latest version of postgresql-client, and updated the yaml file to 17-alpine.
https://github.com/keif888/photoprism/blob/PostgreSQL/scripts/dist/install-postgresql.sh
https://github.com/keif888/photoprism/blob/PostgreSQL/compose.postgres.yaml
Performance comparison:
@keif888 commented on GitHub (Mar 7, 2025):
Status Report
All unit tests pass.
Latest fixes from gorm2 branch merged in.
No unexpected SQL/gorm errors are being reported.
Inconsistency with MariaDB is an issue within gorm. It does execute the update, but doesn't report the number of records affected correctly.
Pull is ready for review.
@haozhou commented on GitHub (May 8, 2025):
Do we have any guide to migrate data from mysql/mariadb to postgresql?
@keif888 commented on GitHub (May 9, 2025):
Not at this time.
I can start having a look at it so that something similar to https://docs.photoprism.app/getting-started/advanced/migrations/sqlite-to-mariadb/ can be produced.
My initial take on it would be as follows, but I will have to determine appropriate commands, and ensure that it's all achievable:
@keif888 commented on GitHub (May 15, 2025):
Coming soon, a new option for the migrations command line in PhotoPrism:
This new command will create the tables in a new database or migrate the tables to the latest version and truncate them (force option), and migrate all the data across from the currently configured database into the target.
The target can be any dbms that PhotoPrism supports (SQLite, MariaDB and PostgreSQL).
It uses a set of cli flags to provide the target dbms information.
@haozhou commented on GitHub (May 15, 2025):
This is great. Previously my concern with a manual backup -> restore to a new (different) RDBMS was schema compatibility (what if some column was renamed during version evolution and doesn't have a corresponding match in the new target RDBMS). I can't wait for the release of this command.
@halkeye commented on GitHub (May 20, 2025):
I was thinking about setting up photoprism again and stumbled upon this. I see at least https://github.com/photoprism/photoprism/pull/4831 isn't merged yet. Is there anything us randos can do to help test or code or otherwise help out? I know personally I have better infra/scripts for setting up postgres dbs instead of mysql so i'm pretty excited.
@apavelm commented on GitHub (Jun 25, 2025):
Any ETA when this PR will be completed and feature released?
Thanks
@graciousgrey commented on GitHub (Jul 30, 2025):
Here's a quick update on this issue: Our team is currently finishing work on the batch edit feature and other PRs. Once that's done, we'll review the latest changes and build a new PostgreSQL Docker image, so you can help with testing!
@keif888 commented on GitHub (Aug 27, 2025):
Just letting you all know that lastzero has published a docker build with the Gorm2 and PostgreSQL capability.
This is up to date with the develop branch as at August 24th 2025 commit
e80c1e1.

The documentation on the migration command, which allows transferring your index database between DBMS', is here, as it hasn't been merged into the live documentation yet.
The documentation includes sample PostgreSQL yaml configuration to add to your compose.yaml file.
It would be great if you could help us test this please.
FYI:
I perform my testing by snapshotting all my MariaDB based PhotoPrism "drives", and mounting those snapshots as new "drives" and using those snapshots for testing via updated volumes.
See https://docs.photoprism.app/getting-started/advanced/docker-volumes/
@iconoclasthero commented on GitHub (Oct 8, 2025):
FWIW, my psql is bare metal and it took me a while to get it to work. As I just set photoprism up, I abandoned the old DB rather than try to import it into psql so i cannot comment on that.
Some hints for setting up on a bare-metal psql server:
Changes to the docker-compose.yaml:
Needed to create the proper user/password/database/permissions in psql.
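That setup step might look roughly like this in psql (the role, password, and database names here are placeholders, not values from the thread):

```sql
-- Hypothetical bootstrap; use a real password.
CREATE ROLE photoprism LOGIN PASSWORD 'insecure';
CREATE DATABASE photoprism OWNER photoprism ENCODING 'UTF8';
GRANT ALL PRIVILEGES ON DATABASE photoprism TO photoprism;
```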
Made sure that the docker bridge was allowed in /etc/postgresql/17/main/pg_hba.conf

I also had to open up the UFW/iptables firewall rules in order to get it to connect.
Maybe some other things as well, but those're the main things I remember.
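For reference, the pg_hba.conf entry would be along these lines (the subnet, database, user, and auth method are assumptions; match them to your docker bridge and setup):

```
# TYPE  DATABASE     USER         ADDRESS          METHOD
host    photoprism   photoprism   172.17.0.0/16    scram-sha-256
```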
@celebilercem commented on GitHub (Nov 14, 2025):
@keif888 Thanks! I see only amd64 images are being released under the postgres tag. Any plans for arm64?

@keif888 commented on GitHub (Nov 15, 2025):
@lastzero would have to build that and push it to docker.
Worst case, arm64 will be available when this is merged into the develop branch.

@graciousgrey commented on GitHub (Nov 25, 2025):
@celebilercem We are currently busy finalizing a new stable release. Once that's done, @lastzero can build a PostgreSQL test image for ARM64.
@davidandreoletti commented on GitHub (Jan 20, 2026):
@lastzero do you want to share any progress since @graciousgrey's comment above?
@lastzero commented on GitHub (Jan 21, 2026):
@davidandreoletti Thanks for the reminder! I've just merged the latest changes to our
feature/postgres branch, adjusted the Makefile target, and started a new multi-arch Docker build for both linux/amd64 and linux/arm64:

👉 https://hub.docker.com/r/photoprism/photoprism/tags?name=postgres
Please let us know if the new image works for you! 👌
@keif888 While running the unit tests, I noticed the following errors:
This may not actually be a problem, as the errors could be related to my pre-existing MariaDB database or user accounts? 🤔
@lastzero commented on GitHub (Jan 29, 2026):
I merged the latest changes to our feature/postgres branch and started a new multi-arch Docker build for further testing:
👉 https://hub.docker.com/r/photoprism/photoprism/tags?name=postgres
Any feedback would be much appreciated!
@lastzero commented on GitHub (Feb 13, 2026):
Updated PostgreSQL preview builds are available for testing on Docker Hub: