Config: Add support for PostgreSQL #45

Open
opened 2026-02-19 23:01:25 -05:00 by deekerman · 115 comments

Originally created by @sokoow on GitHub (Oct 24, 2018).

Originally assigned to: @keif888 on GitHub.

Nice idea lads, I totally support it. Have you ever considered switching to Postgres? For the deployment size I'm predicting this is going to have, MySQL might be a bit of a suboptimal choice :)


Details on possible implementation strategies can be found in this comment:

- https://github.com/photoprism/photoprism/issues/47#issuecomment-1793392875

@lastzero commented on GitHub (Oct 24, 2018):

Not right now, but in general: anything to store a few tables will do... as simple and stable as possible... many developers are familiar with mysql, so that's my default when I start a new project. Tooling is also good.

SQLite is a very lean option, but obviously, if you run multiple processes or want to directly access or back up your data, it doesn't scale well, or at all.


@lastzero commented on GitHub (Nov 16, 2018):

It became clear that we have to build a single binary for distribution to reach broad adoption. Differences between SQL dialects are too large to have them abstracted away by our current ORM library, for example when doing date range queries. They are already different between MySQL and sqlite.

For those reasons we will not implement Postgres support for our MVP / first release. If you have time & energy, you are welcome to help us. I will close this issue for now, we can revisit it later when there is time and enough people want this 👍


@sokoow commented on GitHub (Nov 17, 2018):

OK, fair point. I was raising this because the cost of maintenance and troubleshooting at scale is much lower with Postgres, and lots of successful projects have this support. So, from what you wrote about the differences, it seems that you don't have pluggable ORM-like generic read/write storage methods just yet, right?


@lastzero commented on GitHub (Nov 17, 2018):

@sokoow We do use GORM, but it doesn't help with search queries that use database specific SQL.

If you like to dive into the subject, `DATEDIFF` is a great example: MySQL and SQL Server use `DATEDIFF()`, Postgres seems to prefer `DATE_PART()`, whereas SQLite only has `julianday()`.

It goes even deeper when you look into how tables are organized. You can't abstract and optimize at the same time. We want to provide the best performance to our users.

See

- http://www.sqlines.com/postgresql/how-to/datediff
- https://www.sqlite.org/lang_datefunc.html
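The dialect differences @lastzero describes can be sketched with a small helper; `date_diff_days` below is a hypothetical illustration of the problem, not PhotoPrism code:

```python
# Hypothetical helper illustrating why an ORM can't fully abstract date
# arithmetic: each dialect needs a different SQL expression for "days between".
def date_diff_days(dialect: str, a: str, b: str) -> str:
    """Return a dialect-specific SQL expression for the day difference of columns a and b."""
    if dialect == "mysql":
        return f"DATEDIFF({a}, {b})"
    if dialect == "postgres":
        return f"DATE_PART('day', {a} - {b})"
    if dialect == "sqlite":
        # SQLite has neither function; julianday() is the usual workaround.
        return f"CAST(julianday({a}) - julianday({b}) AS INTEGER)"
    raise ValueError(f"unsupported dialect: {dialect}")

for d in ("mysql", "postgres", "sqlite"):
    print(d, "->", date_diff_days(d, "taken_at", "created_at"))
```

Every such expression either has to be generated per dialect like this, or the feature has to be restricted to what all supported databases share.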

@sokoow commented on GitHub (Nov 17, 2018):

No, that's a fair point; you're not the first project to face this challenge. Something to think about at a higher abstraction level.


@LeKovr commented on GitHub (Nov 17, 2018):

> If you have time & energy, you are welcome to help us.

I guess it won't be so hard, so I would try.


@lastzero commented on GitHub (Nov 17, 2018):

Getting it to work somehow at a single point in time is not hard, getting it to work with decent performance, finding developers who are comfortable with it and constantly maintaining the code is incredibly hard.

Keep in mind: You also need to maintain continuous integration infrastructure and effectively run all tests with every database.


@LeKovr commented on GitHub (Nov 20, 2018):

Of course, tests might be the same for every supported database, and this might be solved within #60.
Also, SQLite support will probably entail some architectural changes (like search using Bleve and DB-driver-dependent SQL queries). It won't be hard to add PostgreSQL support after that. And maybe you'll find "developers who are comfortable with it" by then.


@lastzero commented on GitHub (Nov 20, 2018):

@LeKovr Did you see how we renamed the label from "rejected" to "descoped"? 😉 Yes indeed, later it might be great to add support for additional databases, if users actually need it in practice. Maybe everyone will be happy with an embedded database if we do it well. It is hard to predict.

What I meant was that if you change some code that involves SQL you might feel uncomfortable because you only have experience with one database, so you end up doing nothing. And that can be very dangerous for a project.


@LeKovr commented on GitHub (Nov 20, 2018):

@lastzero, you are right. Maybe later; there are more important things to do for now.


@bobobo1618 commented on GitHub (Jul 9, 2020):

I had a quick look and it looks like the queries at least are trivial to add. The biggest problem is the models. The `varbinary` and `datetime` types are hard-coded into the models but don't exist in PostgreSQL, so the migration fails.

I'm not sure what the solution is here. I'd guess that the solution is to use the types Gorm expects (e.g. `[]byte` instead of `string` when you want a column filled with bytes), but there's probably a good reason why it wasn't done that way to start with.

I'll play with it some more and see. It'd be nice to put everything in my PostgreSQL DB instead of SQLite.
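To illustrate the scale of the problem, a Postgres port would need some kind of type translation along these lines; `translate_type` and the mapping below are assumptions for illustration, not the actual PhotoPrism schema:

```python
# Rough sketch of translating MySQL-flavoured column types (as hard-coded in
# the models) to PostgreSQL equivalents. The mapping is illustrative only.
import re

TYPE_MAP = {
    "datetime": "timestamp",
    "varbinary": "bytea",    # Postgres has no varbinary; bytea is the closest fit
    "tinyint(1)": "boolean",
}

def translate_type(mysql_type: str) -> str:
    """Map a MySQL column type such as 'VARBINARY(42)' to a Postgres type."""
    t = mysql_type.strip().lower()
    base = re.sub(r"\(.*\)$", "", t)  # drop any length suffix
    if t in TYPE_MAP:
        return TYPE_MAP[t]
    return TYPE_MAP.get(base, t)  # pass through types Postgres already understands

print(translate_type("VARBINARY(42)"))  # -> bytea
print(translate_type("DATETIME"))       # -> timestamp
```

The catch is that every model tag would need to go through such a layer (or be rewritten in terms Gorm can translate itself), which is exactly the refactoring discussed below.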


@LeKovr commented on GitHub (Jul 9, 2020):

> The varbinary and datetime types are hard-coded into the models

Maybe `create domain varbinary ...` would help?


@bobobo1618 commented on GitHub (Jul 9, 2020):

All of the `varbinary` columns have different lengths and seem to have different purposes, so I don't think that'll help, unfortunately.


@lastzero commented on GitHub (Jul 9, 2020):

Yes, we use binary for plain ASCII, especially when strings need to be sorted, indexed or compared and should not be normalized in any way.


@bobobo1618 commented on GitHub (Jul 9, 2020):

Shouldn't that be the case by default for string fields? I know MySQL does some stupid stuff with character encodings but it shouldn't modify plain ASCII, right?


@lastzero commented on GitHub (Jul 9, 2020):

But it uses 4 BYTES per ASCII character, so the index becomes very big. Also when you compare strings, it's somewhat more complex with unicode than just to compare bytes. I'm aware you can PROBABLY do the same with VARCHAR with the right settings and enough time to test, but it was hard to see business value in such experiments.


@bobobo1618 commented on GitHub (Jul 9, 2020):

> But it uses 4 BYTES per ASCII character

As far as I can tell looking at the [SQLite docs](https://sqlite.org/src4/doc/trunk/www/data_encoding.wiki), the [MySQL docs](https://dev.mysql.com/doc/refman/8.0/en/char.html) and the [PostgreSQL docs](https://www.postgresql.org/docs/9.2/datatype-character.html), that isn't the case at all. A `varchar` uses a 1-4 byte prefix depending on the size of the field, but each byte of payload consumes one byte of storage.

> Also when you compare strings, it's somewhat more complex with unicode than just to compare bytes.

But we're not storing unicode, we're storing ASCII in a field that *could* contain unicode. I don't think any of those edge cases apply here.

> I'm aware you can PROBABLY do the same with VARCHAR with the right settings and enough time to test, but it was hard to see business value in such experiments.

Fair enough.

Also, queries aren't so straightforward after all. The queries extensively use `0` and `1` instead of `false` and `true`, which isn't supported by PostgreSQL (and, as a side note, makes the queries more difficult to read, since you don't know whether a comparison is meant to be boolean or integer).

I managed to do a little bit of cleanup and got *something* working at least.
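As a rough idea of the cleanup involved, a rewrite pass over such literals might look like this; `BOOL_COLUMNS` and `pg_booleans` are made up for illustration, not taken from the PoC:

```python
# Sketch of the kind of query rewriting a Postgres port needs: replace integer
# literals compared against known boolean columns with TRUE/FALSE.
# BOOL_COLUMNS is a hypothetical list of boolean columns.
import re

BOOL_COLUMNS = ("photo_private", "photo_favorite", "deleted")

def pg_booleans(sql: str) -> str:
    """Rewrite 'col = 1' / 'col = 0' to 'col = TRUE' / 'col = FALSE' for boolean columns."""
    for col in BOOL_COLUMNS:
        sql = re.sub(rf"\b{col}\s*=\s*1\b", f"{col} = TRUE", sql)
        sql = re.sub(rf"\b{col}\s*=\s*0\b", f"{col} = FALSE", sql)
    return sql

print(pg_booleans("SELECT * FROM photos WHERE photo_private = 0 AND deleted = 0"))
# -> SELECT * FROM photos WHERE photo_private = FALSE AND deleted = FALSE
```

In practice it is safer to fix the queries at the source than to rewrite them in flight, but the pattern shows why "just add the driver" isn't enough.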


@lastzero commented on GitHub (Jul 9, 2020):

Not in the index, check again. Maybe also not in memory when comparing.


@bobobo1618 commented on GitHub (Jul 9, 2020):

I couldn't find documentation so I just ran a quick test to see.

```python
import sqlite3

c = sqlite3.connect('test.db')
c.execute('CREATE TABLE things (integer PRIMARY KEY, testcolumn varchar(32))')
c.execute('CREATE INDEX test_idx ON things(testcolumn)')
for x in range(0, 10000):
    c.execute('INSERT INTO things(testcolumn) VALUES (?)', (hex(x * 472882049 % 15485867),))
c.commit()
```

Which resulted in 79.3k of actual data:

```sql
SELECT SUM(length(testcolumn)) FROM things;
79288
```

I analyzed it with `sqlite3_analyzer`.

Table:

```
Bytes of storage consumed......................... 167936
Bytes of payload.................................. 109288      65.1%
Bytes of metadata................................. 50517       30.1%
```

Index:

```
Bytes of storage consumed......................... 163840
Bytes of payload.................................. 129160      78.8%
Bytes of metadata................................. 30476       18.6%
```

So for 79288 bytes of actual data sitting in the column, we have 109288 bytes total for the data itself (1.38 bytes per byte) and 129160 for the index (1.63 bytes per byte).

I repeated the test with `varbinary(32)` instead of `varchar(32)` and got precisely the same result, down to the exact number of bytes.

So I don't see any evidence that a `varchar` consumes more space in an index than a `varbinary`.


@lastzero commented on GitHub (Jul 9, 2020):

You'll find some information on this page: https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-conversion.html

You might also want to read this and related RFCs: https://en.wikipedia.org/wiki/Comparison_of_Unicode_encodings

Note that Microsoft, as far as I know, still uses UCS-2 instead of UTF-8 in Windows, for all the reasons I mentioned. Maybe they switched to UTF-16. Their Linux database driver for SQL Server used null-terminated strings; guess how well that works with UCS-2. Not at all.

For MySQL, we use 4-byte UTF-8, which needs 4 bytes in indexes unless somebody completely refactored InnoDB in the meantime. Note that the MySQL manual was wrong on InnoDB for a long time, insisting that MySQL doesn't know or support index-organized tables while InnoDB ONLY uses index-organized tables.

When you're done with this, enjoy learning about the four Unicode normalization forms: https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization

Did you know there's a difference between Linux and OS X? Apple uses the decomposed form, so you need to convert all strings when copying files. Their bundled command line tools were not compiled with iconv support, so you had to compile them yourself. Some of this is still not fixed today.


@lastzero commented on GitHub (Jul 9, 2020):

Note that SQLite ignores VARBINARY, and probably also VARCHAR to some degree. It uses dynamic typing. That's why all string keys are prefixed with at least one non-numeric character: it would convert the value to INT otherwise, and comparisons with binary data or strings would fail:

> SQLite uses a more general dynamic type system. In SQLite, the datatype of a value is associated with the value itself, not with its container. The dynamic type system of SQLite is backwards compatible with the more common static type systems of other database engines in the sense that SQL statements that work on statically typed databases should work the same way in SQLite. However, the dynamic typing in SQLite allows it to do things which are not possible in traditional rigidly typed databases.

See https://www.sqlite.org/datatype3.html
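The quoted behaviour is easy to demonstrate: under SQLite's affinity rules a `VARBINARY` column gets NUMERIC affinity, so a purely numeric string is silently stored as an integer. A minimal sketch using Python's built-in `sqlite3` module:

```python
import sqlite3

# SQLite maps the unknown type name VARBINARY to NUMERIC affinity, so a
# numeric-looking string is converted to an INTEGER unless it is prefixed
# with a non-numeric character.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k VARBINARY(32))")
conn.execute("INSERT INTO t VALUES ('123')")   # stored as INTEGER 123
conn.execute("INSERT INTO t VALUES ('x123')")  # non-numeric prefix keeps it TEXT
print(conn.execute("SELECT k, typeof(k) FROM t").fetchall())
# -> [(123, 'integer'), ('x123', 'text')]
```

This is exactly why comparing such a key against a binary value can fail unless every key carries a non-numeric prefix.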


@bobobo1618 commented on GitHub (Jul 9, 2020):

I'm aware of Unicode encodings and some of the important differences between them. I still don't see anything in the docs indicating that a `varchar` containing ASCII will consume 4 bytes per character in an index, but I'll take your word for it.

To be clear, in case there's some miscommunication going on, my assumption is that even if the column is switched to `varchar`, plain ASCII (i.e. the first 128 Unicode code points, which are all encoded with 8 bits) will still be stored in it. That being the case, 1 character = 1 byte and comparisons are bog-standard string comparisons.

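The assumption that plain ASCII stays at one byte per character holds for any UTF-8 encoder and can be checked directly, for example in Python:

```python
# ASCII occupies one byte per character in UTF-8; only code points above
# U+007F need multi-byte sequences.
ascii_key = "2020/07/photo-0042.jpg"  # a made-up ASCII key for illustration
assert len(ascii_key.encode("utf-8")) == len(ascii_key)  # 1 byte per character
print(len("café"), len("café".encode("utf-8")))  # 4 characters, 5 bytes
```

Whether an index reserves the worst-case 4 bytes per character regardless of content is a separate, engine-specific question.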
In other news, [here's a PoC](https://github.com/photoprism/photoprism/commit/f1c0fcab21b21dbc08ce01d48f0b0ffc04e952b9) of PostgreSQL mostly working. It's intended as an overview of the work that needs to be done, not as a serious proposal.


@bobobo1618 commented on GitHub (Jul 9, 2020):

Actually, on `string` vs. `[]byte`, it occurred to me: if you only want to store ASCII here and don't want to treat this as something that's semantically a string, is it a bad thing to use a `[]byte` type? Is it the hassle of converting to/from strings when dealing with other APIs that's off-putting?

With `[]byte`, Gorm will choose an appropriate type for each DB by default.


@bobobo1618 commented on GitHub (Jul 9, 2020):

Ah, looks like the `string` vs. `[]byte` question [is mostly solved by Gorm v2 anyhow](https://github.com/go-gorm/gorm/blob/90a40361ed38314b8ea45e703a14f0ed58925892/schema/field.go#L242-L249). You'll just be able to put `type:bytes` in the tag and it'll handle it for you.


@lastzero commented on GitHub (Jul 9, 2020):

See https://mathiasbynens.be/notes/mysql-utf8mb4

> The InnoDB storage engine has a maximum index length of 767 bytes, so for utf8 or utf8mb4 columns, you can index a maximum of 255 or 191 characters, respectively. If you currently have utf8 columns with indexes longer than 191 characters, you will need to index a smaller number of characters when using utf8mb4. (Because of this, I had to change some indexed VARCHAR(255) columns to VARCHAR(191).)

Maybe we can switch to `[]byte` in Go. Let's revisit this later; there are a ton of items on our to-do list with higher priority, and that's by far not the only change we need to support other databases.

Edit: As you can see in the code, I already implemented basic support for multiple dialects when we added SQLite. For Postgres there's more to consider, especially data types; SQLite pretty much doesn't care. Bool columns and date functions might also need attention. I'm fully aware Postgres is very popular in the GIS community, so it will be worth adding when we have the resources needed to implement and maintain it (see next comment).
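For reference, the quoted character limits fall straight out of InnoDB's classic 767-byte index prefix limit divided by the worst-case bytes per character of each encoding:

```python
# InnoDB's classic index prefix limit (older COMPACT/REDUNDANT row formats)
# divided by the maximum bytes per character yields the quoted column limits.
INDEX_LIMIT = 767  # bytes
print(INDEX_LIMIT // 3)  # utf8 uses up to 3 bytes/char  -> 255 characters
print(INDEX_LIMIT // 4)  # utf8mb4 uses up to 4 bytes/char -> 191 characters
```

Because the engine must reserve the worst case per character, ASCII-only content doesn't help: the limit is computed from the declared character set, not the stored bytes.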


@lastzero commented on GitHub (Jul 9, 2020):

We also need to consider the impact on testing and continuous integration when adding support for additional storage engines and APIs. That's often underestimated and causes permanent overhead. From a contributor's perspective, it might just be a one time pull request. Anyhow, we value your efforts and feedback! Just so that you see why we're cautious.


@skorokithakis commented on GitHub (Dec 31, 2020):

Would you consider leaving this issue open so people can 👍 it? I almost didn't find it.


@lastzero commented on GitHub (Dec 31, 2020):

Note that only Gorm 2 (aka 1.2) supports [compatible general types](https://gorm.io/docs/models.html). I already tried upgrading, but it turned out to be extremely time-consuming and tedious due to many changes. We decided to rather release earlier and without bugs than to go for Postgres support in our first release.

If you look at our [public roadmap](https://github.com/photoprism/photoprism/projects/5), you'll notice that there are a ton of important feature requests that deliver value to regular users. So don't expect us to completely refactor our storage layer in the next few months 👍

If we get it done earlier, that's good... but no commitment at this point. As this is the core of our app, we also can't "just" merge a pull request. There are too many edge cases to keep in mind.


@skorokithakis commented on GitHub (Dec 31, 2020):

Fair enough. Most of my friends use Postgres (as do I), and I don't think I know anyone who prefers MariaDB, but SQLite is a good second choice so I just went with that.


@lastzero commented on GitHub (Dec 31, 2020):

We initially started with a built-in TiDB server. When that caused issues, we simply stuck with MySQL-compatible databases. SQLite is not that much different, and ignores most data types anyway. MariaDB works well for what we do. Migrating from a more to a less powerful DBMS is much more difficult when your app depends on all the features. Note that SQLite is much slower than MariaDB in every regard, especially on powerful servers with many logical cores, due to file-based locking. We don't want to get any bad performance reviews! 😉


@skorokithakis commented on GitHub (Dec 31, 2020):

Haha, don't worry, my server is plenty slow already so that's what I blame :P It's just too bad that using an ORM compatible with all three RDBMSes is hard to do at this point, but I agree, features are more important.


@G2G2G2G commented on GitHub (Jun 6, 2021):

@bobobo1618 datetime has existed in postgresql for ~10 years and varbinary = bit datatype

...sqlite has no locking for reads, only writes. It scales very well for multi-user, many-core, read-heavy systems, just not for writing. Also, all datatypes in SQLite are treated the same: how a value is inserted is how that specific "cell" treats the data; this is all over the docs.


@bobobo1618 commented on GitHub (Jun 6, 2021):

> @bobobo1618 datetime has existed in postgresql for ~10 years and varbinary = bit datatype

I think you misunderstood. Of course PostgreSQL has *equivalent* types ([`timestamp`](https://www.postgresql.org/docs/13/datatype-datetime.html) for `datetime`, [`bit varying(n)`](https://www.postgresql.org/docs/13/datatype-bit.html) for `varbinary`); the problem is that PhotoPrism hard-codes the specific names `datetime` and `varbinary`, which are not understood by PostgreSQL.

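The mapping bobobo1618 describes can be sketched as a small per-dialect translation table. A minimal Go illustration (the `columnType` helper and its type names are hypothetical, not PhotoPrism code; `bytea` is assumed here as the usual PostgreSQL stand-in for `varbinary`):

```go
package main

import "fmt"

// columnType maps an abstract column type name to the concrete name a
// given SQL dialect understands. Hypothetical helper for illustration.
func columnType(dialect, abstract string) string {
	if dialect == "postgres" {
		switch abstract {
		case "datetime":
			return "timestamp" // PostgreSQL has no "datetime" type name
		case "varbinary":
			return "bytea" // common choice for raw binary data
		}
	}
	// MySQL/MariaDB understand these names natively, and SQLite accepts
	// almost any type name due to its type-affinity rules.
	return abstract
}

func main() {
	fmt.Println(columnType("postgres", "datetime"))  // timestamp
	fmt.Println(columnType("postgres", "varbinary")) // bytea
	fmt.Println(columnType("mysql", "datetime"))     // datetime
}
```

LeKovr's `create domain` suggestion below attacks the same problem from the database side instead: it teaches PostgreSQL the MySQL name rather than teaching the app the PostgreSQL name.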

@LeKovr commented on GitHub (Jun 6, 2021):

> which are not understood by PostgreSQL.

```
create domain datetime as timestamp;
```

?


@myxor commented on GitHub (Jan 19, 2022):

Is there any news on PostgreSQL support?


@graciousgrey commented on GitHub (Jan 19, 2022):

No, we are currently working on multi-user support, which is really an epic.
You can find a list of upcoming features on our roadmap: https://github.com/photoprism/photoprism/projects/5


@davralin commented on GitHub (Jan 19, 2022):

Not sure if this is the place to mention it, or if it really belongs in a new issue, but how to migrate between databases would also be nice, in addition to "just" supporting PostgreSQL.


@lastzero commented on GitHub (Jan 19, 2022):

- Wish we already had PostgreSQL support, as it is the [technologically better choice](https://docs.photoprism.app/release-notes/#november-28-2021)
- There is [(still) too little funding and too much work](https://docs.photoprism.app/funding/) for [our team](https://photoprism.app/team)
- As soon as funding improves (which we really hope for), PostgreSQL support will be one of our highest priorities
- There should already be tools for migrating from MariaDB to PostgreSQL:
  - You are invited to do research on this! ✨

@francisco1844 commented on GitHub (Mar 24, 2022):

Is there a place where people can put money towards a particular feature? I think that would help to see how much existing and future users value a particular feature. Also, for many people it may be more appealing to fund a specific feature than to just make a donation and hope that the feature they need will eventually make it.


@francisco1844 commented on GitHub (Mar 24, 2022):

I don't see PostgreSQL on the [Roadmap](https://github.com/photoprism/photoprism/projects/5), or is it under a generic name for other DB support?


@graciousgrey commented on GitHub (Mar 24, 2022):

[Here](https://docs.photoprism.app/funding/) you'll find an overview of our current funding options. Sponsors in higher tiers can give golden sponsor labels to features.

> Is there a place where people can put money towards a particular feature?

Not anymore. While we like IssueHunt and are grateful for the donations we've received so far, it hasn't proven to be a sustainable funding option for us, as we spend much of our time maintaining existing features and providing support.
If we don't have enough resources to provide support and bugfixes, we can't start working on new features.


@lastzero commented on GitHub (Mar 26, 2022):

> Don't see Postgresql in the [Roadmap](https://github.com/photoprism/photoprism/projects/5)

That's because we plan to support PostgreSQL anyway, ideally when there is less pressure to release new features than right now. We can't perform a major backend/database refactoring while pushing huge new features like multi-user support.


@dradux commented on GitHub (Jun 3, 2022):

I would love to see postgres support! I'll contribute time, talent, and/or treasure.


@vyruss commented on GitHub (Mar 23, 2023):

I can also contribute Postgres knowledge & time.


@pashagolub commented on GitHub (May 3, 2023):

I can help you with PostgreSQL support.


@lastzero commented on GitHub (May 30, 2023):

@pashagolub My apologies for not getting back to you sooner! We had to focus all our resources on the release and then needed a break. Any help with adding PostgreSQL is, of course, much appreciated. There are two basic strategies:

  1. Keep the current ORM (which doesn't support dynamic columns for the auto-migrations) and work around this by using only manual migrations for PostgreSQL. This seems doable to me with a few code changes, but needs to be tested before you invest a lot of time.
  2. Upgrade the ORM, which requires rewriting large chunks of code and re-testing every single detail. This approach seems cleaner, but it could also result in much more work and prevent us from releasing new features for some time, which might not be popular with some users (except those who are just waiting for PostgreSQL support, of course).

Should you decide to tackle this, I'm happy to help and give advice to the best of my ability. Also, if you have any personal questions, feel free to contact me directly via email so as to avoid notifying all issue subscribers on GitHub about a new comment.


@Tragen commented on GitHub (May 30, 2023):

The strategy should be doing 1 and then 2. ;)
But after 1 there is often no reason for 2.


@lastzero commented on GitHub (May 30, 2023):

I often wish we had a more compatible, intuitive database abstraction layer. But compared to important core features that are still missing, like batch editing, this is not a big pain point at the moment and therefore not a top priority.


@pashagolub commented on GitHub (May 30, 2023):

> 1. Keep the current ORM

Would you please name it? :) It's hard to find the name in the `.mod` file without actually knowing it :-)

Speaking about `go.mod`... I was surprised to see the `lib/pq` dependency. :-D


@lastzero commented on GitHub (May 30, 2023):

We currently use GORM v1; I assumed this was mentioned/discussed in the comments above: https://v1.gorm.io/docs/


@pashagolub commented on GitHub (May 30, 2023):

Sorry, missed that.


@rustygreen commented on GitHub (Sep 8, 2023):

Any update on when we can expect PostgreSQL support?


@lastzero commented on GitHub (Sep 9, 2023):

We had several contributors who wanted to work on this. However, there is no pull request for it yet, so I can't tell you anything about the progress.


@fl0wm0ti0n commented on GitHub (Sep 30, 2023):

any news on postgres support?


@ezra-varady commented on GitHub (Nov 2, 2023):

Are there any contributors working on this atm? My team is interested in this feature, and I might be able to contribute some time


@lastzero commented on GitHub (Nov 4, 2023):

@ezra-varady We appreciate any help we can get! To reiterate what I wrote above, there are two possible strategies:

  1. **Keep the [current GORM version](https://v1.gorm.io/)** (which does not support abstract/dynamic column types if you use auto-migrations to create/update the database schema) and [work around this by using our manual migration package](https://github.com/photoprism/photoprism/tree/develop/internal/migrate) to maintain the PostgreSQL schema. This seems doable to me [with a few changes in the `internal/config` package](https://github.com/photoprism/photoprism/blob/develop/internal/config/config_db.go), though it should be tested as a proof-of-concept before you invest a lot of time.
  2. **Upgrade GORM from v1 to [v2](https://gorm.io/docs/)**, which requires rewriting large parts of the code and retesting every single detail. This approach may be beneficial in the long run, although it will probably also cause a lot more work and might prevent us from releasing new features for some time, which our users would not be happy about. For this reason, the entire work could of course also be done in a long-lived feature branch until everything is ready. However, you must then be prepared to resolve merge conflicts with our main branch (*develop*) from time to time until it can finally be merged.

Due to the higher chances of success (and because it doesn't block us from upgrading later), I would personally recommend going for (1), i.e. adding (a) manual migrations (for the initial setup of the database schema in the first step) and (b) hand-written SQL for custom queries for which the ORM is not used, for example:

https://github.com/photoprism/photoprism/blob/539e18d984b66b41e5d7b106087068ee5c1c2037/internal/query/covers.go#L27-L49

Should you decide to tackle this, we will be happy to help and provide advice to the best of our ability. You are also welcome to contact us via email or chat if you have general questions that don't need to be documented as a public issue comment on GitHub.

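The hand-written-SQL part of strategy (1b) boils down to a switch on the active dialect, as in the `covers.go` example linked above. A minimal Go sketch of that pattern (function, table, and column names are invented for illustration, not PhotoPrism code; note that even the bind-parameter placeholder differs between dialects):

```go
package main

import "fmt"

// coverQuery returns the same logical query as dialect-specific SQL.
// Hypothetical illustration of the "switch on dialect" pattern:
// MySQL/MariaDB and SQLite use "?" placeholders, while PostgreSQL's
// native protocol uses numbered "$1" placeholders.
func coverQuery(dialect string) string {
	switch dialect {
	case "postgres":
		return "SELECT file_hash FROM files WHERE album_uid = $1 ORDER BY file_primary DESC LIMIT 1"
	default: // mysql, sqlite3
		return "SELECT file_hash FROM files WHERE album_uid = ? ORDER BY file_primary DESC LIMIT 1"
	}
}

func main() {
	fmt.Println(coverQuery("postgres"))
	fmt.Println(coverQuery("mysql"))
}
```

Keeping each dialect's text separate, as lastzero cautions further down, also avoids accidentally changing query semantics while chasing syntax compatibility.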

@vnnv commented on GitHub (Nov 24, 2023):

@lastzero did you consider the option of removing GORM entirely and replacing it with something else? Perhaps a lightweight lib for DB access, something similar to github.com/jmoiron/sqlx?


@stavros-k commented on GitHub (Nov 24, 2023):

Maybe https://entgo.io/ or https://bun.uptrace.dev/


@pashagolub commented on GitHub (Nov 24, 2023):

I think `pgx` is enough for most of the functionality. But again, if we want to be able to talk to different databases, we should come up with some kind of database engine. An ORM is not the best choice, because the problem is not in the relational-object mapping but in the logic behind it.


@lastzero commented on GitHub (Nov 24, 2023):

@vnnv @stavros-k @pashagolub Yes, of course we have also considered switching to a completely different library... There are many more choices now than when we started the project.

That said, some kind of abstraction seems necessary if we want to support multiple dialects with the resources we have. Also, I think it's a good idea to cover simple standard use cases instead of creating every single SQL query manually.

Either way, the amount of work required to switch to a different library would be even greater than what I described in my comment above as 2.: https://github.com/photoprism/photoprism/issues/47#issuecomment-1793392875

Even for 1. and 2. it seems extremely difficult to find contributors with the time and experience required, and my personal time is very limited due to the amount of support and feature requests we receive.


@stavros-k commented on GitHub (Nov 24, 2023):

> That said, some kind of abstraction seems necessary if we want to support multiple dialects with the resources we have. Also, I think it's a good idea to cover simple standard use cases instead of creating every single SQL query manually.

What kind of abstraction are you looking for? I saw that it is regarding column types.
Do you have columns whose type you don't know beforehand?

If you do know it beforehand but it's "changing" frequently, Ent might be a better option, as you can extend the generated code with some Go templates. As for migrations, I would look into Atlas.

That being said, I've just been subscribed to this issue for a long time and thought I'd share what I found during my recent search for a DB lib, as I was looking to start a mini side project.

I wish I had the experience to help with it.


@ai2ys commented on GitHub (Jun 18, 2024):

PostgreSQL support would be great, as it offers support for JSON.


@satonotdead commented on GitHub (Jun 19, 2024):

Is that already implemented? There are Postgres docker-compose.yml example files in the main repo.


@lastzero commented on GitHub (Jun 24, 2024):

Adding PostgreSQL support unfortunately depends on the help of contributors, as many users are waiting for OpenID Connect support and improved facial recognition, which I need to take care of. So there is no way I can spend several weeks working on other issues, or even half the time, to e.g. get a proof-of-concept ready for release:

- https://github.com/photoprism/photoprism/pull/4203

Keep in mind that while we always do our best to give general advice and feedback (which is not easy considering our workload), refactoring the "associations" and "preload" functionality for the migration from Gorm v1 to v2 seems to require quite a bit of time to experiment and come up with a good solution. So some advice from us and a few tests will probably not be enough to get it done.

For alternative solutions that may not require an upgrade to Gorm v2, see the following comment:

- https://github.com/photoprism/photoprism/issues/47#issuecomment-1793392875

If you have questions, please feel free to ask in our Contributors and/or [Community Chat](https://link.photoprism.app/chat)!


@alan-cugler commented on GitHub (Jan 27, 2025):

Good afternoon! I see PR #4560 is still marked as a draft, with lots of progress made back in October. Did the PhotoPrism team feel significant progress was made by that outside help on PR #4560 to migrate to GORM v2? I was pretty excited by the effort but wasn't sure if it should be taken seriously, since it was still marked as "help wanted" on the project roadmap.

Just looking for an update, even if the update is nothing new. Cheers!


@keif888 commented on GitHub (Feb 24, 2025):

PR #4560 has now made it out of draft, and is hopefully bug-free.
Supporting PostgreSQL will require a Docker configuration for PostgreSQL, configuration files for PostgreSQL, and some SQL changes within PhotoPrism where SQL incompatibilities between vendors occur.
I have never configured a Docker container or used PostgreSQL, so there will be a learning curve to get out of the way before I can work on integrating PhotoPrism with PostgreSQL.


@pashagolub commented on GitHub (Feb 25, 2025):

Thanks for your hard work @keif888. Would it be enough if I tune this one: [compose.postgres.yaml](https://github.com/photoprism/photoprism/blob/develop/compose.postgres.yaml)?


@keif888 commented on GitHub (Feb 25, 2025):

> Thanks for your hard work @keif888. Would it be enough if I tune this one: [compose.postgres.yaml](https://github.com/photoprism/photoprism/blob/develop/compose.postgres.yaml)?

I have got one up and going in my branch for PostgreSQL.
https://github.com/keif888/photoprism/tree/PostgreSQL
Now to get keycloak to use the correct schema on startup instead of Public.


@pashagolub commented on GitHub (Feb 26, 2025):

@keif888 I can help you with that. What is Keycloak, and what schema and role should it operate under in the database?


@lastzero commented on GitHub (Feb 26, 2025):

Keycloak is a third-party service for testing authentication with OpenID Connect, so it should not need to be migrated to PostgreSQL (it is used through a web API).

Note that we have already started working on a Docker Compose configuration for testing and development with PostgreSQL as the index database:

- https://github.com/photoprism/photoprism/blob/develop/compose.postgres.yaml

@pashagolub commented on GitHub (Feb 26, 2025):

Yes, that's why I was surprised that we're talking about a Keycloak schema if we're supposed to run it as a separate service and just provide the proper database name to operate with.


@keif888 commented on GitHub (Feb 26, 2025):

I am working on the assumption that everything PhotoPrism's compose.yaml does with MariaDB has to be done in PostgreSQL. So I updated compose.postgres.yaml to be the "same" as compose.yaml, i.e. it has all the same services.

I have Keycloak working in PostgreSQL now, same as it does in MariaDB.
My init script for PostgreSQL was incorrect, which I have corrected.

Current status:
PhotoPrism starts and you can upload a photo, but after that you can't do much, as a number of the SQL statements fail because MariaDB syntax doesn't work well with PostgreSQL.
Failing unit tests are documented here:
https://github.com/keif888/photoprism/blob/PostgreSQL/teststatus.md


@lastzero commented on GitHub (Feb 26, 2025):

If it's custom SQL from us, then you should be able to find most of the related code (that needs to be extended) by searching for switch DbDialect(), for example:

https://github.com/photoprism/photoprism/blob/4a4e45eb59e85b6207cb9b366cecd11e1206b547/internal/entity/query/covers.go#L27-L49


@keif888 commented on GitHub (Feb 26, 2025):

I've just fixed that one (just committed) by rewriting the SQL so that it is ANSI standard and works against all three DBMSs.


@lastzero commented on GitHub (Feb 26, 2025):

@keif888 I appreciate your work, but please be careful with this: If there are different queries, these MAY do different things depending on the database capabilities, e.g. select the first image vs. select the image with the highest resolution or quality. Also, it's possible that the "compatible" query for SQLite may not perform well on MariaDB. So even if it works, there could be unwanted side effects.


@keif888 commented on GitHub (Feb 26, 2025):

@lastzero I have reverted to individual SQL statements per DBMS to avoid the unintended consequences, and will continue to maintain that where it already exists.
Fortunately I had only completed 4 with consolidated SQL statements. The 5th one required separate statements as PostgreSQL can NOT do a MAX on a bytea column.


@lastzero commented on GitHub (Feb 26, 2025):

@keif888 If SQLite works with the same standard query as PostgreSQL, feel free to combine those to avoid duplicate code:

  • Pretty much everyone with a large photo collection (where performance matters) currently uses MariaDB, while the unit tests run on SQLite, so unexpected results and database errors can be found and fixed much easier there.
  • Later, when we have time, we can check if MariaDB should also use the standard query or if it's better to keep the custom query, e.g. because it's faster.

@keif888 commented on GitHub (Feb 26, 2025):

I have a blocking issue, and my knowledge of PostgreSQL consists of what I have read in the documentation in the last couple of days.
MariaDB for PhotoPrism is using specific character sets and collations.
They are not deterministic, and are case insensitive.
The default collations for PostgreSQL are all deterministic.
This is causing some queries to fail.

MariaDB startup specifies the following two settings in the startup command for dealing with character strings:

--character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci

Can someone please let me know how to set up the equivalent in PostgreSQL?
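
A minimal sketch of the closest PostgreSQL analogue, assuming PostgreSQL 12+ built with ICU support (the collation name `ci` is an arbitrary choice, not an existing identifier): there is no server-wide equivalent of `--collation-server`; a non-deterministic ICU collation has to be created per database and applied per column or per expression.

```sql
-- Hypothetical sketch: a case-insensitive, accent-sensitive ICU collation,
-- roughly analogous to MariaDB's utf8mb4_unicode_ci.
CREATE COLLATION ci (PROVIDER = icu, LOCALE = 'und-u-ks-level2', DETERMINISTIC = false);

-- Applied per column...
CREATE TABLE t (name varchar(160) COLLATE ci);

-- ...or per expression.
SELECT 'Foo' = 'foo' COLLATE ci;  -- true under this collation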


@pashagolub commented on GitHub (Feb 26, 2025):

@keif888 do you need these params for a new database(s) or for the whole instance (all databases)?

We can specify default params for the whole instance, so every database created will inherit those. Or we can control those settings at the database level.

Where can I find the startup command for MariaDB to guess the best choice?


@pashagolub commented on GitHub (Feb 26, 2025):

Fortunately I had only completed 4 with consolidated SQL statements. The 5th one required separate statements as PostgreSQL can NOT do a MAX on a bytea column

Postgres can do MAX, but we either need to type-cast explicitly or create a custom aggregate:

pgwatch=# with vals(b) as (
  values ('one'::bytea), ('two'), ('three')
)
select max(b::text) from vals;
   max
----------
 \x74776f
(1 row)

But I feel something is terribly wrong if we're trying to get the max of a bytea.


@keif888 commented on GitHub (Feb 26, 2025):

@pashagolub the MariaDB Docker startup is here:

command: --port=4001 --innodb-strict-mode=1 --innodb-buffer-pool-size=256M --transaction-isolation=READ-COMMITTED --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --max-connections=512 --innodb-rollback-on-timeout=OFF --innodb-lock-wait-timeout=120

I solved the MAX of a bytea as per this: MAX(convert_from(m.thumb,'UTF8'))

My understanding is that MAX of a bytea is required because PhotoPrism was developed needing both case-insensitive and case-sensitive matching, and m.thumb (above) is an example of a case-sensitive match. I used convert_from to return the value back to a string before doing the MAX so as to match what SQLite and MariaDB are doing. (I do need to retest this to make sure that it really is achieving the same result.)

Where case sensitive is needed the developers used VARBINARY(), and where case insensitive is needed they used VARCHAR().
See where MarkerName is VARCHAR, and Thumb is VARBINARY.
Gorm V1

| Definition | DBMS | Result |
|---|---|---|
| MarkerName string `gorm:"type:VARCHAR(160);"` | SQLite | marker_name VARCHAR(160) |
| MarkerName string `gorm:"type:VARCHAR(160);"` | MariaDB | marker_name VARCHAR(160) |
| MarkerName string `gorm:"type:VARCHAR(160);"` | PostgreSQL | marker_name VARCHAR(160) |
| Thumb string `gorm:"type:VARBINARY(128);index;default:'';"` | SQLite | thumb VARBINARY(128) |
| Thumb string `gorm:"type:VARBINARY(128);index;default:'';"` | MariaDB | thumb VARBINARY(128) |
| Thumb string `gorm:"type:VARBINARY(128);index;default:'';"` | PostgreSQL | error |

Gorm V2

| Definition | DBMS | Result |
|---|---|---|
| MarkerName string `gorm:"size:160;"` | SQLite | marker_name text |
| MarkerName string `gorm:"size:160;"` | MariaDB | marker_name VARCHAR(160) |
| MarkerName string `gorm:"size:160;"` | PostgreSQL | marker_name VARCHAR(160) |
| Thumb string `gorm:"type:bytes;size:128;index;default:'';"` | SQLite | thumb blob |
| Thumb string `gorm:"type:bytes;size:128;index;default:'';"` | MariaDB | thumb VARBINARY(128) |
| Thumb string `gorm:"type:bytes;size:128;index;default:'';"` | PostgreSQL | thumb bytea |

@lastzero commented on GitHub (Feb 27, 2025):

@keif888 Is this the query you are having trouble with?

github.com/photoprism/photoprism@4a4e45eb59/internal/entity/query/covers.go (L269-L283)

If so, don't worry about the MAX(), because from what I can see/remember, it's just a way to make sure that the markers.thumb value assigned to subject.thumb is deterministic and not empty (so that the thumb doesn't break or change all the time):

  1. Given enough time to think about it, there are probably other (better) ways to solve this problem (as long as the query is deterministic and no empty value is set as thumb, it doesn't seem to matter too much how the thumb is selected).

  2. It is planned (and strongly requested by our users) that the cover images can be set manually from the UI (so the queries to set them automatically will become less important), see https://github.com/photoprism/photoprism/issues/383.

You may find similar patterns elsewhere and are welcome to suggest improvements or solve the same problem in a different way for PostgreSQL/SQLite! In this case, please add a code comment so we can find the relevant queries later to check them and also use them for MariaDB if possible.


@keif888 commented on GitHub (Feb 27, 2025):

@lastzero Yes, that was the one I was looking at. That one was easy to make work as it does in the other DBMSs.
The harder one was searchPhotos, as the way that MariaDB and SQLite handle GROUP BY is not the same as PostgreSQL. That is working now, and hopefully in a way that doesn't add maintenance nightmares.

I am working to get PostgreSQL working the same way that MariaDB does, before trying to refactor the way that existing SQL statements work.

There are quite a few differences between the SQL DML engines in the three DBMSs, which makes the complex queries difficult.

BTW: I had to change the PostgreSQL version that I had chosen from 17-alpine to 16-alpine, as the PhotoPrism container is using Ubuntu's 16.6-0ubuntu0.24.10.1 client packages, and that was preventing backup and restore from working.


@keif888 commented on GitHub (Feb 28, 2025):

And now another nasty issue.
After many hours I haven't found a way around this one, yet...

Gorm is returning timestamps with time.Local instead of time.UTC.
I have the server running as UTC, and the connection string has TimeZone=UTC.
The times are added to the server correctly.
It's just that when Go gets them back, they have the wrong timezone attached.

e.g.

photo_test.go:1397: 
     |         	Error Trace:	/go/src/github.com/photoprism/photoprism/internal/entity/photo_test.go:1397
     |         	Error:      	Not equal: 
     |         	            	expected: time.Date(2016, time.November, 11, 8, 7, 18, 0, time.UTC)
     |         	            	actual  : time.Date(2016, time.November, 11, 8, 7, 18, 0, time.Local)
     |         	            	
     |         	            	Diff:
     |         	Test:       	TestPhoto_UnscopedSearch/Ok

There is a fix for this in the pgx driver, which Gorm is using, but from what I can discover Gorm is unable to utilise that fix.

The fix is to add code similar to this (https://github.com/jackc/pgx/issues/1945); see also the comment at https://github.com/jackc/pgx/issues/2117#issuecomment-2315376310:

conn.TypeMap().RegisterType(&pgtype.Type{
	Name:  "timestamptz",
	OID:   pgtype.TimestamptzOID,
	Codec: &pgtype.TimestamptzCodec{ScanLocation: time.UTC},
})

BUT, that has to be done if you are using pgx directly, which Gorm doesn't.
It uses it via database/sql or via pgx's stdlib, and neither of those allows access to TypeMap().

There is a possibility that I can do something similar to this Gorm test (https://github.com/go-gorm/gorm/blob/9f273777f58a247f7ae4176a014f6d59ac9fff8c/tests/connpool_test.go#L100), but it's using database/sql, and I need to use pgx directly.

I have tried changing the server's timezone, the databases timezone, the connection string's timezone. None of these change the returned value (always Local). PostgreSQL is working as designed.

There is a similar open issue on Gorm here: https://github.com/go-gorm/postgres/issues/199. I think it has two issues confused: the first is an incorrect timezone in the connection string, so PostgreSQL was changing the timestamp on the way in, and the second is the one that we have, with pgx marking it as Local.

As an FYI: The test works if I add a .UTC() as shown below (and as per a comment in the issue), but there is no way that is an acceptable solution.

		assert.Equal(t, photo1.TakenAt, photo.TakenAt.UTC())

@keif888 commented on GitHub (Feb 28, 2025):

I have raised an issue against gorm for the timestamp location <> UTC.
I replicated it in the go-gorm playground.

https://github.com/go-gorm/gorm/issues/7377


@keif888 commented on GitHub (Mar 1, 2025):

Good News

I worked out how to get a pgxpool into Gorm, so I have the timestamptz working as UTC now.
I've included the work around in the issue noted above, and added it to my branch. That makes the internal/entity tests all pass now.
147 fixed tests done.

Only the collation issue left to solve now; I'm assuming that will fix the 26 API tests that failed (fingers crossed).
123 failed tests to go.

Bad News

Collation in PostgreSQL can be set at a server or database level, but only if it is deterministic. And we need DETERMINISTIC=false, which can only be done by creating a collation within a database and applying that to table columns/indexes.

These don't work

  • 16-Alpine
  • 16-Bookworm
POSTGRES_INITDB_ARGS: "--locale-provider=icu --icu-locale=und-u-ks-level2" doesn't error...

postgres-1      | Using language tag "und-u-ks-level2" for ICU locale "und-u-ks-level2".
postgres-1      | The database cluster will be initialized with this locale configuration:
postgres-1      |   provider:    icu
postgres-1      |   ICU locale:  und-u-ks-level2
postgres-1      |   LC_COLLATE:  en_US.utf8
postgres-1      |   LC_CTYPE:    en_US.utf8
postgres-1      |   LC_MESSAGES: en_US.utf8
postgres-1      |   LC_MONETARY: en_US.utf8
postgres-1      |   LC_NUMERIC:  en_US.utf8
postgres-1      |   LC_TIME:     en_US.utf8
postgres-1      | The default database encoding has accordingly been set to "UTF8".
postgres-1      | The default text search configuration will be set to "english".

BUT, it works in deterministic fashion, so Aaa <> AAA.
Ditto for:

CREATE DATABASE acceptance OWNER acceptance TEMPLATE "template0" LOCALE_PROVIDER "icu" ICU_LOCALE "und-u-ks-level2";

This is what the collation string means

und-u <-- Unicode
ks <-- case sensitive strength
level2 <-- case insensitive, accent sensitive

This works, but...

The only way, and I've read the documentation now, is to create a database, create a collation, and then add the COLLATION to every varchar column in the database.

CREATE COLLATION utf8unicodeci (PROVIDER=icu, locale='@colStrength=secondary', DETERMINISTIC=false);
-- This creates a collation that is und-u-ks-level2
create table test_coll (
col1 varchar(30)
, col2 varchar(30) COLLATE utf8unicodeci
, col3 varchar(30) COLLATE utf8unicodeci);

insert into test_coll values ('asdf','asdf','asdf'), ('asdF','asdF','asdf'), ('asdf','asdf','asdF');
select * from test_coll where col1 = 'asdf';
-- 2 records returned as col1 doesn't have a COLLATE
select * from test_coll where col2 = 'ASDF';
-- 3 records returned as col2 has the COLLATE

The problem is that Gorm doesn't have a way to do that only for PostgreSQL, unless I modify gorm.io/driver/postgres to have an option to add the collations when migrating. And that's a lot of reflection code that does my head in when I try and work on it.

And what I am doing about it

The other option is to change every varchar-based equality/LIKE check to be lower-cased, e.g. any query that does a LIKE or = against structs like this, which are varchar in the databases:

	AlbumTitle       string         `gorm:"size:160;index;" json:"Title" yaml:"Title"`
	AlbumLocation    string         `gorm:"size:160;" json:"Location" yaml:"Location,omitempty"`
	AlbumCategory    string         `gorm:"size:100;index;" json:"Category" yaml:"Category,omitempty"`
	AlbumCaption     string         `gorm:"size:1024;" json:"Caption" yaml:"Caption,omitempty"`

I am working down that path now, as I can't see modifying the driver being a simple change.
But it is making the code base messy, for want of a better term, e.g.:
github.com/keif888/photoprism@7867b5f1ac

	// Filter by title.
	if txt.NotEmpty(frm.Title) {
		likeString := "photos.photo_title"
		if entity.DbDialect() == entity.Postgres {
			likeString = "lower(photos.photo_title)"
		}
		where, values := OrLike(likeString, frm.Title)
		s = s.Where(where, values...)
	}

And I'm concerned that I will miss some if they are buried in First and Find etc.
We will see how the unit testing ends up.


@lastzero commented on GitHub (Mar 1, 2025):

@keif888 Leaving aside case-sensitivity, Unicode is inherently "non-deterministic" when it comes to comparisons, since there are (often many) different encodings for the same characters:

  • Even if you focus only on UTF-8, the encoding can be composed and decomposed, and there are equivalent characters.
  • A common solution to this is to index/store the values in canonical form, so that when you compare them, you can do so byte by byte. When you look at it this way, the comparisons are completely deterministic and straightforward, so to say they're "non-deterministic" would be misleading.
  • Only if you play dumb and store the supposed Unicode strings as bytes without any validation or normalization, you will not be able to sort/compare them properly according to Unicode rules.
  • Now, I know that MariaDB's utf8mb4_unicode_ci collation goes a step further and also supports expansions, i.e. when a character is compared as equal to combinations of other characters. But that's nice to have, and if it's slow or complicated with PostgreSQL, it's certainly something our users can live with.

@keif888 commented on GitHub (Mar 2, 2025):

Status Report

I have created a draft pull request, as all unit tests now pass, and PhotoPrism starts and seems to work without errors or missing photos.

PhotoPrism starts and allows adding photos. Rescanning finds and adds photos. Search works (flower and Flower both find the same set of flowers, for example).
Database encoding is UTF8.
Collation is the out-of-the-box default, which in my case was en_US.utf8, from memory.

To Do (all items since completed):

  • Find out how to get psql into docker without having to run the command manually every time I restart the docker container
  • Investigate Inconsistencies
  • Continue investigation of SQL errors (unique key, foreign key violations) generated from unit tests to ensure they are all from FirstOrCreate functionality, or deliberate error condition testing
  • Create performance data creation functions for PostgreSQL
  • Run performance tests
  • Rerun all unit tests against all three DBMS' just in case
@keif888 commented on GitHub (Mar 2, 2025): # Status Report I have created a draft pull request as all unit tests now pass, and PhotoPrism starts and seems to work without errors and missing photos. PhotoPrism starts, and allows adding photos. Rescanning finds and adds photos. Search works (flower and Flower both find the same set of flowers for example. Database encoding is UTF8. Collation is the OOTB, which in my case defaulted to en_US.utf8 from memory. To Do: - ~~Find out how to get psql into docker without having to run the command manually every time I restart the docker container~~ - ~~Investigate Inconsistencies~~ - ~~Continue investigation of SQL errors (unique key, foreign key violations) generated from unit tests to ensure they are all from FirstOrCreate functionality, or deliberate error condition testing~~ - ~~Create performance data creation functions for PostgreSQL~~ - ~~Run performance tests~~ - ~~Rerun all unit tests against all three DBMS' just in case~~

@pashagolub commented on GitHub (Mar 2, 2025):

Find out how to get psql into docker without having to run the command manually every time I restart the docker container

What do you mean by this? psql is part of Postgres, so it's part of the Docker container. What command do you need to run manually? Why do you need to run that command after a container restart?


@keif888 commented on GitHub (Mar 2, 2025):

FYI: I have fixed the issue around postgresql-client being missing. There is still a need to ensure that it's included in the base photoprism:develop image, though.

@pashagolub
I used the wrong terms above regarding psql and containers.
To clear up any confusion:

PhotoPrism development has a number of services that are initiated from a compose.yaml file.

  • photoprism service
  • DBMSs (one or more of)
    • postgres service
    • mariadb service
  • traefik service
  • dummy-webdav service
  • dummy-oidc service
  • dummy-ldap service
  • keycloak service

The photoprism app within the photoprism service has to be able to communicate with the postgres service via command line tools for backup and restore.
Some make commands (run from `make terminal` within the photoprism service) need the psql command.

Specific commands needed are:

  • pg_restore for the restore-of-database commands: https://github.com/keif888/photoprism/blob/31e4c8ce021e05f413de28ac09f75239ae414f11/internal/config/config_storage.go#L650
  • pg_backup for the backup-of-database commands: https://github.com/keif888/photoprism/blob/31e4c8ce021e05f413de28ac09f75239ae414f11/internal/config/config_storage.go#L655
  • psql for the make commands to reset the databases for development/testing purposes: https://github.com/keif888/photoprism/blob/31e4c8ce021e05f413de28ac09f75239ae414f11/Makefile#L825

I had a much larger post following the above, with links to everything I had done, and then realised that there was an option to call specific makefile targets via the PHOTOPRISM_INIT environment setting for photoprism.
So I have updated compose.postgres.yaml, adding postgresql to the list of items to init, and after rebuilding the photoprism service it is all working.

For those interested:
I had to run the following:

docker compose -f compose.postgres.yaml build
make docker-postgresql

The docker compose build is needed to ensure that the updated makefile and scripts are included in the service, and on restart I saw the following:

photoprism-1    | init: postgresql
photoprism-1    | /scripts/install-postgresql.sh postgresql-client
photoprism-1    | Installing "postgresql-client" distribution packages for AMD64...
photoprism-1    | Get:1 https://deb.nodesource.com/node_22.x nodistro InRelease [12.1 kB]
photoprism-1    | Get:2 https://dl.google.com/linux/chrome/deb stable InRelease [1825 B]
photoprism-1    | Get:3 http://security.ubuntu.com/ubuntu oracular-security InRelease [126 kB]
photoprism-1    | Get:4 http://archive.ubuntu.com/ubuntu oracular InRelease [265 kB]
photoprism-1    | Get:5 https://deb.nodesource.com/node_22.x nodistro/main amd64 Packages [5206 B]
photoprism-1    | Get:6 https://dl.google.com/linux/chrome/deb stable/main amd64 Packages [1211 B]
photoprism-1    | Get:7 http://security.ubuntu.com/ubuntu oracular-security/universe amd64 Packages [167 kB]
photoprism-1    | Get:8 http://archive.ubuntu.com/ubuntu oracular-updates InRelease [126 kB]
photoprism-1    | Get:9 http://security.ubuntu.com/ubuntu oracular-security/main amd64 Packages [277 kB]
photoprism-1    | Get:10 http://archive.ubuntu.com/ubuntu oracular-backports InRelease [126 kB]
photoprism-1    | Get:11 http://security.ubuntu.com/ubuntu oracular-security/multiverse amd64 Packages [10.4 kB]
photoprism-1    | Get:12 http://security.ubuntu.com/ubuntu oracular-security/restricted amd64 Packages [142 kB]
photoprism-1    | Get:13 http://archive.ubuntu.com/ubuntu oracular/main amd64 Packages [1835 kB]
photoprism-1    | Get:14 http://archive.ubuntu.com/ubuntu oracular/universe amd64 Packages [19.6 MB]
photoprism-1    | Get:15 http://archive.ubuntu.com/ubuntu oracular/multiverse amd64 Packages [308 kB]
photoprism-1    | Get:16 http://archive.ubuntu.com/ubuntu oracular/restricted amd64 Packages [67.0 kB]
photoprism-1    | Get:17 http://archive.ubuntu.com/ubuntu oracular-updates/main amd64 Packages [392 kB]
photoprism-1    | Get:18 http://archive.ubuntu.com/ubuntu oracular-updates/restricted amd64 Packages [148 kB]
photoprism-1    | Get:19 http://archive.ubuntu.com/ubuntu oracular-updates/universe amd64 Packages [232 kB]
photoprism-1    | Get:20 http://archive.ubuntu.com/ubuntu oracular-updates/multiverse amd64 Packages [11.4 kB]
photoprism-1    | Get:21 http://archive.ubuntu.com/ubuntu oracular-backports/universe amd64 Packages [5417 B]
photoprism-1    | Fetched 23.8 MB in 8s (2943 kB/s)
photoprism-1    | Reading package lists...
photoprism-1    | debconf: unable to initialize frontend: Dialog
photoprism-1    | debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
photoprism-1    | debconf: falling back to frontend: Readline
photoprism-1    | debconf: unable to initialize frontend: Readline
photoprism-1    | debconf: (This frontend requires a controlling tty.)
photoprism-1    | debconf: falling back to frontend: Teletype
photoprism-1    | dpkg-preconfigure: unable to re-open stdin: 
photoprism-1    | Selecting previously unselected package libpq5:amd64.
(Reading database ... 53983 files and directories currently installed.)
photoprism-1    | Preparing to unpack .../libpq5_16.6-0ubuntu0.24.10.1_amd64.deb ...
photoprism-1    | Unpacking libpq5:amd64 (16.6-0ubuntu0.24.10.1) ...
photoprism-1    | Selecting previously unselected package postgresql-client-common.
photoprism-1    | Preparing to unpack .../postgresql-client-common_262_all.deb ...
photoprism-1    | Unpacking postgresql-client-common (262) ...
photoprism-1    | Selecting previously unselected package postgresql-client-16.
photoprism-1    | Preparing to unpack .../postgresql-client-16_16.6-0ubuntu0.24.10.1_amd64.deb ...
photoprism-1    | Unpacking postgresql-client-16 (16.6-0ubuntu0.24.10.1) ...
photoprism-1    | Selecting previously unselected package postgresql-client.
photoprism-1    | Preparing to unpack .../postgresql-client_16+262_all.deb ...
photoprism-1    | Unpacking postgresql-client (16+262) ...
photoprism-1    | Setting up postgresql-client-common (262) ...
photoprism-1    | Setting up libpq5:amd64 (16.6-0ubuntu0.24.10.1) ...
photoprism-1    | Setting up postgresql-client-16 (16.6-0ubuntu0.24.10.1) ...
photoprism-1    | update-alternatives: using /usr/share/postgresql/16/man/man1/psql.1.gz to provide /usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode
photoprism-1    | Setting up postgresql-client (16+262) ...
photoprism-1    | Processing triggers for libc-bin (2.40-1ubuntu3.1) ...
photoprism-1    | Done.

@pashagolub commented on GitHub (Mar 3, 2025):

BTW: I had to change the PostgreSQL version that I had chosen from 17-alpine to 16-alpine as the PhotoPrism container is using Ubuntu 16.6-0ubuntu0.24.10.1, and that was preventing backup and restore from working.

Sorry. I'm trying to catch up, but you're too fast for me. :) Would you please elaborate on this? Because I don't see Ubuntu 16 in the Dockerfile: https://github.com/photoprism/photoprism/blob/develop/Dockerfile


@pashagolub commented on GitHub (Mar 3, 2025):

Another thing I want to emphasize: it's better to install the PostgreSQL packages from the official apt repository (https://wiki.postgresql.org/wiki/Apt) rather than relying on the ones shipped with the system, e.g.

...
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
    && apt-get -qy install curl gnupg postgresql-common apt-transport-https lsb-release \
    && sh /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh -y \
    && curl -L "https://www.postgresql.org/media/keys/ACCC4CF8.asc" | apt-key add - \
    && apt-get -qy install postgresql-17 postgresql-plpython3-17 postgresql-17-pg-qualstats \
    && apt-get purge -y --auto-remove \
    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
...

@keif888 commented on GitHub (Mar 3, 2025):

Hi,

I have the output of the OS version and psql version from my photoprism service below.

OS Version

photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.10"
NAME="Ubuntu"
VERSION_ID="24.10"
VERSION="24.10 (Oracular Oriole)"
VERSION_CODENAME=oracular
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=oracular
LOGO=ubuntu-logo

psql version

photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ psql --version
psql (PostgreSQL) 16.6 (Ubuntu 16.6-0ubuntu0.24.10.1)
photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ 

Ubuntu includes postgresql-client in its package repositories (I'm probably mangling the reality here; I'm not a Linux expert), but it's the 16.6 version, and backup/restore with it cannot work against the 17-alpine postgres service, since the server's major version is newer than the client's.
So I chose to use 16-alpine instead. It's still a supported version.
When the Ubuntu version used for the photoprism service is updated, then (assuming Ubuntu has moved to a newer PostgreSQL version) the compose.postgres.yaml file can be updated to 17-alpine.
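The major-version mismatch described above can be made explicit with a small check. This is an illustrative Python sketch, not code from the pull request; the version strings are the ones quoted earlier, and the "client major must be >= server major" rule is the usual constraint for `pg_dump`:

```python
import re

def pg_major(version_string: str) -> int:
    """Extract the PostgreSQL major version from output such as
    'psql (PostgreSQL) 16.6 (Ubuntu 16.6-0ubuntu0.24.10.1)'."""
    match = re.search(r"(\d+)\.\d+", version_string)
    if match is None:
        raise ValueError(f"no version found in {version_string!r}")
    return int(match.group(1))

def dump_compatible(client_version: str, server_version: str) -> bool:
    # pg_dump can dump servers of the same or an older major version,
    # but refuses to dump from a newer one.
    return pg_major(client_version) >= pg_major(server_version)

# The Ubuntu 16.6 client above cannot back up a 17-alpine server:
print(dump_compatible("psql (PostgreSQL) 16.6 (Ubuntu 16.6-0ubuntu0.24.10.1)", "17.2"))  # False
```

A check like this could run at startup before a backup is attempted, so the mismatch surfaces as a clear error rather than a failed backup.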


@pashagolub commented on GitHub (Mar 3, 2025):

I believe we don't need to rely on any packages shipped with the OS; we are in charge of what is used and how. That said, it's simple to specify the exact client version we want.


@keif888 commented on GitHub (Mar 3, 2025):

Yes, and that part I would leave to people who know what they are doing. I know that you have to deal with keys to allow apt to reference other repositories, but then it all starts getting a bit fuzzy. I am in no way an expert on Linux or PostgreSQL.


@lastzero commented on GitHub (Mar 3, 2025):

There should already be an install script in /scripts/dist for MariaDB binaries, so you could also add one for PostgreSQL with a fallback to the default system packages.


@keif888 commented on GitHub (Mar 3, 2025):

Preliminary benchmarks of sqlite vs postgres, mariadb vs postgres, and sqlite vs mariadb.
I created a database of 100k randomly generated photos for each DBMS, and then executed some benchmarks against it.
All databases are managed by gorm, so they have the same table, foreign key, and index structures.
There is no tuning of the postgres service (I haven't read up on how to do that yet).
Overall, postgresql is faster than sqlite and slower than mariadb.

photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ go run golang.org/x/perf/cmd/benchstat storage/Benchmark100k.sqlite.txt storage/Benchmark100k.postgres.txt 
goos: linux
goarch: amd64
pkg: github.com/photoprism/photoprism/internal/performancetest
cpu: AMD Ryzen 7 5700X 8-Core Processor             
                                │ storage/Benchmark100k.sqlite.txt │   storage/Benchmark100k.postgres.txt   │
                                │              sec/op              │    sec/op      vs base                 │
100k/CreateDeleteAlbum-4                              5.036m ±  6%   28.101m ±  1%  +458.02% (p=0.000 n=10)
100k/ListAlbums-4                                     287.5m ±  6%    179.1m ±  5%   -37.69% (p=0.000 n=10)
100k/CreateDeleteCamera-4                             3.766m ± 11%    1.286m ±  6%   -65.86% (p=0.000 n=10)
100k/CreateDeleteCellAndPlace-4                       7.165m ± 16%   14.739m ±  3%  +105.70% (p=0.000 n=10)
100k/FileRegenerateIndex-4                            4.790m ±  7%    1.762m ±  3%   -63.21% (p=0.000 n=10)
100k/CreateDeletePhoto-4                              60.58m ±  7%    43.29m ±  7%   -28.55% (p=0.000 n=10)
100k/ListPhotos-4                                    291.25m ±  3%    74.86m ± 11%   -74.30% (p=0.000 n=10)
geomean                                               22.90m          17.70m         -22.69%
photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ go run golang.org/x/perf/cmd/benchstat storage/Benchmark100k.sqlite.txt storage/Benchmark100k.mariadb.txt 
goos: linux
goarch: amd64
pkg: github.com/photoprism/photoprism/internal/performancetest
cpu: AMD Ryzen 7 5700X 8-Core Processor             
                                │ storage/Benchmark100k.sqlite.txt │  storage/Benchmark100k.mariadb.txt   │
                                │              sec/op              │    sec/op     vs base                │
100k/CreateDeleteAlbum-4                              5.036m ±  6%   2.658m ±  4%  -47.22% (p=0.000 n=10)
100k/ListAlbums-4                                     287.5m ±  6%   113.2m ±  3%  -60.62% (p=0.000 n=10)
100k/CreateDeleteCamera-4                             3.766m ± 11%   1.521m ±  4%  -59.61% (p=0.000 n=10)
100k/CreateDeleteCellAndPlace-4                       7.165m ± 16%   3.744m ±  7%  -47.75% (p=0.000 n=10)
100k/FileRegenerateIndex-4                            4.790m ±  7%   1.257m ± 11%  -73.76% (p=0.000 n=10)
100k/CreateDeletePhoto-4                              60.58m ±  7%   32.33m ±  2%  -46.63% (p=0.000 n=10)
100k/ListPhotos-4                                     291.2m ±  3%   478.6m ±  5%  +64.31% (p=0.000 n=10)
geomean                                               22.90m         11.88m        -48.14%
photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ go run golang.org/x/perf/cmd/benchstat storage/Benchmark100k.mariadb.txt storage/Benchmark100k.postgres.txt 
goos: linux
goarch: amd64
pkg: github.com/photoprism/photoprism/internal/performancetest
cpu: AMD Ryzen 7 5700X 8-Core Processor             
                                │ storage/Benchmark100k.mariadb.txt │   storage/Benchmark100k.postgres.txt   │
                                │              sec/op               │    sec/op      vs base                 │
100k/CreateDeleteAlbum-4                               2.658m ±  4%   28.101m ±  1%  +957.27% (p=0.000 n=10)
100k/ListAlbums-4                                      113.2m ±  3%    179.1m ±  5%   +58.21% (p=0.000 n=10)
100k/CreateDeleteCamera-4                              1.521m ±  4%    1.286m ±  6%   -15.46% (p=0.000 n=10)
100k/CreateDeleteCellAndPlace-4                        3.744m ±  7%   14.739m ±  3%  +293.66% (p=0.000 n=10)
100k/FileRegenerateIndex-4                             1.257m ± 11%    1.762m ±  3%   +40.24% (p=0.000 n=10)
100k/CreateDeletePhoto-4                               32.33m ±  2%    43.29m ±  7%   +33.88% (p=0.000 n=10)
100k/ListPhotos-4                                     478.56m ±  5%    74.86m ± 11%   -84.36% (p=0.000 n=10)
geomean                                                11.88m          17.70m         +49.06%
photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ 

@keif888 commented on GitHub (Mar 3, 2025):

There should already be an install script in /scripts/dist for MariaDB binaries, so you could also add one for PostgreSQL with a fallback to the default system packages.

I cloned that to create one for PostgreSQL.
I have updated it to get the latest version of postgresql-client, and updated the yaml file to 17-alpine.
https://github.com/keif888/photoprism/blob/PostgreSQL/scripts/dist/install-postgresql.sh
https://github.com/keif888/photoprism/blob/PostgreSQL/compose.postgres.yaml

Performance comparison:

photoprism@250221-oracular:/go/src/github.com/photoprism/photoprism$ go run golang.org/x/perf/cmd/benchstat storage/Benchmark100k.postgres.txt storage/Benchmark100k.postgres-17.txt 
goos: linux
goarch: amd64
pkg: github.com/photoprism/photoprism/internal/performancetest
cpu: AMD Ryzen 7 5700X 8-Core Processor             
                                │ storage/Benchmark100k.postgres.txt │ storage/Benchmark100k.postgres-17.txt │
                                │               sec/op               │    sec/op      vs base                │
100k/CreateDeleteAlbum-4                                28.10m ±  1%    23.37m ±  1%  -16.85% (p=0.000 n=10)
100k/ListAlbums-4                                       179.1m ±  5%    183.7m ±  1%        ~ (p=0.075 n=10)
100k/CreateDeleteCamera-4                               1.286m ±  6%    1.286m ± 16%        ~ (p=0.796 n=10)
100k/CreateDeleteCellAndPlace-4                         14.74m ±  3%    15.94m ± 16%   +8.16% (p=0.023 n=10)
100k/FileRegenerateIndex-4                              1.762m ±  3%    1.869m ±  8%   +6.02% (p=0.043 n=10)
100k/CreateDeletePhoto-4                                43.29m ±  7%    44.80m ±  8%        ~ (p=0.143 n=10)
100k/ListPhotos-4                                       74.86m ± 11%    70.73m ± 17%        ~ (p=0.165 n=10)
geomean                                                 17.70m          17.59m         -0.63%

@keif888 commented on GitHub (Mar 7, 2025):

Status Report

All unit tests pass.
Latest fixes from gorm2 branch merged in.
No unexpected SQL/gorm errors are being reported.
The inconsistency with MariaDB is an issue within gorm: it does execute the update, but doesn't report the number of affected records correctly.
The pull request is ready for review.

Author
Owner

@haozhou commented on GitHub (May 8, 2025):

Do we have any guide to migrate data from mysql/mariadb to postgresql?

@keif888 commented on GitHub (May 9, 2025):

> Do we have any guide to migrate data from mysql/mariadb to postgresql?

Not at this time.
I can start having a look at it so that something similar to https://docs.photoprism.app/getting-started/advanced/migrations/sqlite-to-mariadb/ can be produced.

My initial take on it would be as follows, but I will have to determine appropriate commands, and ensure that it's all achievable:

  • backup MariaDB/SQLite existing database
  • stop the stack
  • add PostgreSQL container to docker
  • set PostgreSQL as the database for Photoprism to use
  • start the stack
    • This will create a new photoprism database with the correct structure in PostgreSQL
  • stop photoprism, but leave stack running
  • backup the migrations table from PostgreSQL
  • execute pgloader to move data from SQLite/MariaDB to PostgreSQL with truncate
  • execute pgloader to restore migrations table
  • start photoprism
  • verify all is working
  • stop the stack
  • remove the MariaDB container from docker
  • start the stack
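The pgloader steps above could be driven by a small "load" file instead of two ad-hoc command lines. A sketch, assuming the Docker service names mariadb and postgres, the default photoprism database/user names, and a data-only load into the schema PhotoPrism has already created (verify the options against the pgloader documentation for your setup):

```
-- migrate.load (hypothetical): copy data from MariaDB into the
-- PostgreSQL schema that PhotoPrism created on first start.
LOAD DATABASE
     FROM mysql://photoprism:insecure@mariadb:3306/photoprism
     INTO postgresql://photoprism:insecure@postgres:5432/photoprism
     WITH data only, truncate;
```

Run it with `pgloader migrate.load`; the backed-up migrations table would still need to be restored afterwards, as in the step list above.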

@keif888 commented on GitHub (May 15, 2025):

Coming soon, a new option for the migrations command line in PhotoPrism:

./photoprism migrations transfer help
NAME:
   photoprism migrations transfer - Executes database data transfers

USAGE:
   photoprism migrations transfer [command options] [migrations...]

OPTIONS:
   --force, -f  truncate target tables if populated (default: false)
   --trace, -t  show trace logs for debugging (default: false)
   --help, -h   show help

This new command will create the tables in a new database (or migrate them to the latest version and truncate them with the force option), and then transfer all the data from the currently configured database into the target.
The target can be any DBMS that PhotoPrism supports (SQLite, MariaDB, and PostgreSQL).
It uses a set of CLI flags to provide the target DBMS connection details:

   --transfer-driver DRIVER, --tfr-db DRIVER                database DRIVER (sqlite, mysql) (default: "sqlite") [$PHOTOPRISM_TRANSFER_DRIVER]
   --transfer-dsn DSN, --tfr-dsn DSN                        database connection DSN (sqlite file, optional for mysql) [$PHOTOPRISM_TRANSFER_DSN]
   --transfer-name NAME, --tfr-db-name NAME                 database schema NAME (default: "photoprism") [$PHOTOPRISM_TRANSFER_NAME]
   --transfer-server HOST, --tfr-db-server HOST             database HOST incl. port e.g. "mariadb:3306" (or socket path) [$PHOTOPRISM_TRANSFER_SERVER]
   --transfer-user NAME, --tfr-db-user NAME                 database user NAME (default: "photoprism") [$PHOTOPRISM_TRANSFER_USER]
   --transfer-password PASSWORD, --tfr-db-pass PASSWORD     database user PASSWORD [$PHOTOPRISM_TRANSFER_PASSWORD]
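Putting the flags together, a transfer into a new PostgreSQL target might look like the following (the postgres driver value, host name, and credentials are assumptions based on the flag descriptions above, not a verified invocation):

```
./photoprism migrations transfer \
  --transfer-driver "postgres" \
  --transfer-server "postgres:5432" \
  --transfer-name "photoprism" \
  --transfer-user "photoprism" \
  --transfer-password "insecure" \
  --force
```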

@haozhou commented on GitHub (May 15, 2025):

This is great. Previously, my concern with a manual backup -> restore into a new (different) RDBMS was schema compatibility (what if a column was renamed during version evolution and has no corresponding match in the new target RDBMS?). I can't wait for the release of this command.

@halkeye commented on GitHub (May 20, 2025):

I was thinking about setting up PhotoPrism again and stumbled upon this. I see that at least https://github.com/photoprism/photoprism/pull/4831 isn't merged yet. Is there anything us randos can do to help test or code or otherwise help out? I know that I personally have better infra/scripts for setting up Postgres DBs than MySQL, so I'm pretty excited.

@apavelm commented on GitHub (Jun 25, 2025):

Any ETA on when this PR will be completed and the feature released?
Thanks

@graciousgrey commented on GitHub (Jul 30, 2025):

Here's a quick update on this issue: Our team is currently finishing work on the batch edit feature and other PRs. Once that's done, we'll review the latest changes and build a new PostgreSQL Docker image, so you can help with testing!

@keif888 commented on GitHub (Aug 27, 2025):

Just letting you all know that lastzero has published a docker build with the Gorm2 and PostgreSQL capability.
This is up to date with the develop branch as at August 24th 2025 commit e80c1e1.

docker pull photoprism/photoprism:250827-postgres

The documentation on the migration command, which allows transferring your index database between DBMSs, is here: https://github.com/keif888/photoprism-docs/blob/PostgreSQL/docs/getting-started/advanced/migrations/database-to-database.md (it hasn't been merged into the live documentation yet).

The documentation includes sample PostgreSQL yaml configuration to add to your compose.yaml file.

It would be great if you could help us test this please.

FYI:
I perform my testing by snapshotting all my MariaDB-based PhotoPrism "drives", mounting those snapshots as new "drives", and using them for testing via updated volumes.
See https://docs.photoprism.app/getting-started/advanced/docker-volumes/
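For anyone who wants a head start before the linked docs are merged, a minimal PostgreSQL service for compose.yaml might look roughly like this (image tag, credentials, and volume path are placeholder assumptions, not the official sample):

```yaml
services:
  postgres:
    image: postgres:17-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: "photoprism"
      POSTGRES_USER: "photoprism"
      POSTGRES_PASSWORD: "insecure"
    volumes:
      - "./postgres:/var/lib/postgresql/data"
```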

@iconoclasthero commented on GitHub (Oct 8, 2025):

> Just letting you all know that lastzero has published a docker build with the Gorm2 and PostgreSQL capability. This is up to date with the develop branch as at August 24th 2025 commit e80c1e1.
>
> docker pull photoprism/photoprism:250827-postgres
>
> The documentation includes sample PostgreSQL yaml configuration to add to your compose.yaml file.
>
> It would be great if you could help us test this please.

FWIW, my psql is bare metal and it took me a while to get it to work. As I just set PhotoPrism up, I abandoned the old DB rather than try to import it into psql, so I cannot comment on that.

Some hints for setting up on a bare-metal psql server:

Changes to the docker-compose.yaml:


#    image: photoprism/photoprism:latest
    image: photoprism/photoprism:250827-postgres
  ⋮
      # PHOTOPRISM_DATABASE_DRIVER: "sqlite"         # SQLite is an embedded database that does not require a 
separate database server
#      PHOTOPRISM_DATABASE_DRIVER: "mysql"            # MariaDB 10.5.12+ (MySQL successor) offers significantly 
better performance compared to SQLite
#      PHOTOPRISM_DATABASE_SERVER: "mariadb:3306"     # MariaDB database server (hostname:port)
      PHOTOPRISM_DATABASE_DRIVER: "postgres"
      PHOTOPRISM_DATABASE_DSN: "postgresql://<psqluser>:<psqlpassword escaped>@<lan address:5432/photoprism?sslmode=disable"  
      PHOTOPRISM_DATABASE_NAME: "<psql db name>"        
      PHOTOPRISM_DATABASE_USER: "<psql user name"         
      PHOTOPRISM_DATABASE_PASSWORD: "<psql password>"       # database password

  • Needed to create the proper user/password/database/permissions in psql.
  • Made sure that the docker bridge was allowed in /etc/postgresql/17/main/pg_hba.conf.
  • I also had to open up the UFW/iptables firewall rules in order to get it to connect.
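
As a rough sketch of that server-side setup (role/database names, auth method, and the default Docker bridge subnet are assumptions; adjust to your network):

```
-- in psql, as a superuser:
CREATE ROLE photoprism WITH LOGIN PASSWORD 'insecure';
CREATE DATABASE photoprism OWNER photoprism;

# /etc/postgresql/17/main/pg_hba.conf: allow the Docker bridge network
# TYPE  DATABASE    USER        ADDRESS        METHOD
host    photoprism  photoprism  172.17.0.0/16  scram-sha-256

# UFW: open port 5432 to the bridge network
ufw allow from 172.17.0.0/16 to any port 5432 proto tcp
```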

Maybe some other things as well, but those're the main things I remember.
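
One detail worth spelling out: the "<psqlpassword escaped>" part of the DSN means the password must be URL-encoded if it contains reserved characters. A quick sketch in Python (the password, host, and database values are made up):

```python
from urllib.parse import quote

password = "p@ss:word/123"          # hypothetical password with reserved characters
escaped = quote(password, safe="")  # percent-encode @ : / etc.

# Hypothetical host/database values for illustration:
dsn = f"postgresql://photoprism:{escaped}@192.168.1.10:5432/photoprism?sslmode=disable"

print(escaped)  # → p%40ss%3Aword%2F123
print(dsn)
```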

@celebilercem commented on GitHub (Nov 14, 2025):

> Just letting you all know that lastzero has published a docker build with the Gorm2 and PostgreSQL capability. This is up to date with the develop branch as at August 24th 2025 commit e80c1e1.
>
> docker pull photoprism/photoprism:250827-postgres

@keif888 Thanks! I see only amd64 images are being released under the postgres tag. Any plans for arm64?

@keif888 commented on GitHub (Nov 15, 2025):

> > Just letting you all know that lastzero has published a docker build with the Gorm2 and PostgreSQL capability. This is up to date with the develop branch as at August 24th 2025 commit e80c1e1.
> >
> > docker pull photoprism/photoprism:250827-postgres
>
> @keif888 Thanks! I see only amd64 images are being released under the postgres tag. Any plans for arm64?

@lastzero would have to build that and push it to docker.
Worst case, arm64 will be available when this is merged into the develop branch.

@graciousgrey commented on GitHub (Nov 25, 2025):

@celebilercem We are currently busy finalizing a new stable release. Once that's done, @lastzero can build a PostgreSQL test image for ARM64.

@davidandreoletti commented on GitHub (Jan 20, 2026):

@lastzero, do you want to share any progress since @graciousgrey's comment above?

@lastzero commented on GitHub (Jan 21, 2026):

@davidandreoletti Thanks for the reminder! I've just merged the latest changes to our feature/postgres branch, adjusted the Makefile target, and started a new multi-arch Docker build for both linux/amd64 and linux/arm64:

👉 https://hub.docker.com/r/photoprism/photoprism/tags?name=postgres

Please let us know if the new image works for you! 👌

@keif888 While running the unit tests, I noticed the following errors:

FAIL | MigrationCommand (261.03s)
FAIL |   MigrationCommand/TargetPopulated (61.10s)
     |         migrations_test.go:145: 
     |             	Error Trace:	/go/src/github.com/photoprism/photoprism/internal/commands/migrations_test.go:145
     |             	Error:      	"Error 1044 (42000): Access denied for user 'migrate'@'%' to database 'migrate_01'" does not contain "migrate: transfer target database is not empty"
     |             	Test:       	TestMigrationCommand/TargetPopulated
FAIL |   MigrationCommand/TargetPopulatedBatch500 (61.14s)
     |         migrations_test.go:201: 
     |             	Error Trace:	/go/src/github.com/photoprism/photoprism/internal/commands/migrations_test.go:201
     |             	Error:      	"Error 1044 (42000): Access denied for user 'migrate'@'%' to database 'migrate_01'" does not contain "migrate: transfer target database is not empty"
     |             	Test:       	TestMigrationCommand/TargetPopulatedBatch500
FAIL |   MigrationCommand/MySQLtoPostgreSQL (0.01s)
     |         migrations_test.go:227: exit status 1
FAIL |   MigrationCommand/MySQLtoSQLite (0.00s)
     |         migrations_test.go:336: exit status 1
FAIL |   MigrationCommand/MySQLtoSQLitePopulated (0.02s)
     |         migrations_test.go:447: exit status 1
FAIL |   MigrationCommand/PostgreSQLtoMySQL (61.52s)
     |         migrations_test.go:596: 
     |             	Error Trace:	/go/src/github.com/photoprism/photoprism/internal/commands/migrations_test.go:596
     |             	Error:      	Received unexpected error:
     |             	            	Error 1044 (42000): Access denied for user 'migrate'@'%' to database 'migrate_01'
     |             	Test:       	TestMigrationCommand/PostgreSQLtoMySQL
FAIL |   MigrationCommand/SQLiteToMySQL (60.14s)
     |         migrations_test.go:815: 
     |             	Error Trace:	/go/src/github.com/photoprism/photoprism/internal/commands/migrations_test.go:815
     |             	Error:      	Received unexpected error:
     |             	            	Error 1044 (42000): Access denied for user 'migrate'@'%' to database 'migrate_01'
     |             	Test:       	TestMigrationCommand/SQLiteToMySQL
FAIL | 	github.com/photoprism/photoprism/internal/commands	609.599s     

This may not actually be a problem, as the errors could be related to my pre-existing MariaDB database or user accounts? 🤔

@lastzero commented on GitHub (Jan 29, 2026):

I merged the latest changes to our feature/postgres branch and started a new multi-arch Docker build for further testing:

👉 https://hub.docker.com/r/photoprism/photoprism/tags?name=postgres

Any feedback would be much appreciated!

@lastzero commented on GitHub (Feb 13, 2026):

Updated PostgreSQL preview builds are available for testing on Docker Hub:

https://hub.docker.com/r/photoprism/photoprism/tags?name=postgres