Metadata: Improve XMP parser to support more tags #1409

Open
opened 2026-02-20 00:12:08 -05:00 by deekerman · 4 comments

Originally created by @lastzero on GitHub (Apr 17, 2022).

Originally assigned to: @lastzero on GitHub.

As a user with a lot of metadata in XMP sidecar files, I want PhotoPrism to index more of that information so I can easily view and search it.

yq is a portable YAML, JSON and XML command line processor that should make reading XMP much easier than the current implementation: https://github.com/mikefarah/yq

Developer Guide > XMP:

- https://docs.photoprism.app/developer-guide/metadata/xmp/

Related Issues:

- https://github.com/photoprism/photoprism/issues/2075
- https://github.com/photoprism/photoprism/issues/1828
- https://github.com/photoprism/photoprism/issues/1570
- https://github.com/photoprism/photoprism/issues/554
- https://github.com/photoprism/photoprism/issues/402
- https://github.com/photoprism/photoprism/issues/4106

@jmalm commented on GitHub (Apr 18, 2022):

Have you considered using ExifTool in some way for reading / writing image metadata (both from the images and to/from XMP sidecar files)? For example using https://github.com/barasher/go-exiftool? (go-exiftool has a GPL 3 license, which might not be compatible with the licensing strategy of PhotoPrism. ExifTool, on the other hand, looks like it has a very permissive license.)

My impression is that ExifTool has become something of a standard reference tool when it comes to image metadata, so it may help even more than yq.

For reference, Librephotos does it this way. (I implemented writing to XMP sidecar files and made some smaller changes to the reading as well in that project, but the choice of ExifTool was made earlier.)
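
For illustration, calling ExifTool from application code might look roughly like the sketch below. It relies on ExifTool's `-j` (JSON output) option and the `-Rating` tag selector used elsewhere in this thread; the helper names `read_tag` and `parse_exiftool_json` and the sample output string are made up for this example and are not taken from PhotoPrism or Librephotos. Python is used only for brevity.

```python
import json
import subprocess


def parse_exiftool_json(output, tag):
    """Return the value of `tag` from ExifTool's JSON output, or None if absent."""
    records = json.loads(output)  # -j prints a JSON array, one object per file
    if not records:
        return None
    return records[0].get(tag)


def read_tag(path, tag="Rating"):
    """Run `exiftool -j -TAG path` and return the tag value.

    Assumes the `exiftool` executable is installed and on PATH.
    """
    result = subprocess.run(
        ["exiftool", "-j", f"-{tag}", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if exiftool exits non-zero
    )
    return parse_exiftool_json(result.stdout, tag)


# Illustrative output only, shaped like what `exiftool -j -Rating photo.xmp` prints.
SAMPLE_OUTPUT = '[{"SourceFile": "photo.xmp", "Rating": 4}]'
```

Note that every call starts a separate external process, which is the main cost of this approach compared with parsing the XML in-process.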


@lastzero commented on GitHub (Apr 18, 2022):

We already use ExifTool, but it's an external Perl script, and XML isn't hard to parse as such. It's just that the built-in XML support in Go is pretty bad, at least last time I checked. See my notes in the Developer Guide.


@jmalm commented on GitHub (Apr 20, 2022):

(The following is some testing and reasoning...)

I guess one of the nice things with using ExifTool is that you don't have to care as much about the structure of the tags? I.e. you can read rating by `exiftool -Rating FILENAME` and write it by `exiftool -Rating=4 FILENAME` in an XMP sidecar like the following:

<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='Image::ExifTool 10.10'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>

 <rdf:Description rdf:about=''
  xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <dc:creator>
   <rdf:Seq>
    <rdf:li>Jakob Malm</rdf:li>
   </rdf:Seq>
  </dc:creator>
  <dc:rights>
   <rdf:Alt>
    <rdf:li xml:lang='x-default'>Jakob Malm</rdf:li>
   </rdf:Alt>
  </dc:rights>
 </rdf:Description>

 <rdf:Description rdf:about=''
  xmlns:xmp='http://ns.adobe.com/xap/1.0/'>
  <xmp:Rating>4</xmp:Rating>
 </rdf:Description>
</rdf:RDF>
</x:xmpmeta>

With yq command line, reading would be something like `yq -p=xml ".xmpmeta.RDF.Description[] | select(.Rating) | .Rating" FILENAME`.
Writing could be accomplished by `yq -p=xml -o=xml "(.xmpmeta.RDF.Description[] | select(.Rating) | .Rating) = 4" FILENAME`, but it seems the entire file is rewritten (I guess that is probably the case with exiftool too), **without namespaces**:

<xmpmeta x="adobe:ns:meta/" xmptk="Image::ExifTool 12.16">
  <RDF rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <Description about="" dc="http://purl.org/dc/elements/1.1/">
      <creator>
        <Seq>
          <li>Jakob Malm</li>
        </Seq>
      </creator>
      <rights>
        <Alt>
          <li lang="x-default">Jakob Malm</li>
        </Alt>
      </rights>
    </Description>
    <Description about="" xmp="http://ns.adobe.com/xap/1.0/">
      <Rating>4</Rating>
    </Description>
  </RDF>
</xmpmeta>

I haven't found a way to not have to specify (and know!) the "absolute" path to the tag. This may be ok and perhaps even desired, but I think the missing namespaces might be a problem.

On the _other_ hand, two nice things with using `yq` would be

- being able to compile the functionality into the application (via `yqlib`)
- being able to use the same library for XMP, YAML and JSON sidecar and metadata files.
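
To illustrate the two concerns raised above (having to know the absolute path, and losing namespaces on write), here is a minimal sketch using a namespace-aware parser. It uses Python's standard `xml.etree.ElementTree` purely as an example; it is not yq, yqlib, or PhotoPrism's implementation. The helper names `read_rating` and `set_rating` are hypothetical, and the sample is a trimmed copy of the sidecar shown earlier. It assumes the rating is stored as an integer in an `xmp:Rating` element.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the sidecar above; the prefixes are only local aliases.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}

# Trimmed copy of the example sidecar from this thread.
SAMPLE = """<x:xmpmeta xmlns:x='adobe:ns:meta/'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
 <rdf:Description rdf:about=''
  xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <dc:creator>
   <rdf:Seq>
    <rdf:li>Jakob Malm</rdf:li>
   </rdf:Seq>
  </dc:creator>
 </rdf:Description>
 <rdf:Description rdf:about=''
  xmlns:xmp='http://ns.adobe.com/xap/1.0/'>
  <xmp:Rating>4</xmp:Rating>
 </rdf:Description>
</rdf:RDF>
</x:xmpmeta>"""


def read_rating(xmp_text):
    """Return xmp:Rating as an int, or None if there is no rating element."""
    root = ET.fromstring(xmp_text)
    # ".//" matches at any depth, so the absolute path is not needed.
    node = root.find(".//xmp:Rating", NS)
    if node is None or not (node.text or "").strip():
        return None
    return int(node.text.strip())  # assumes an integer rating


def set_rating(xmp_text, rating):
    """Return the sidecar text with xmp:Rating updated, keeping namespace prefixes."""
    for prefix, uri in NS.items():
        # Registers preferred prefixes for serialization (process-wide setting).
        ET.register_namespace(prefix, uri)
    root = ET.fromstring(xmp_text)
    node = root.find(".//xmp:Rating", NS)
    if node is None:
        raise ValueError("no xmp:Rating element found")
    node.text = str(rating)
    return ET.tostring(root, encoding="unicode")
```

The whole document is still re-serialized on write, so quoting and the placement of `xmlns` declarations can change, but elements keep their qualified names such as `xmp:Rating`. XMP also allows simple properties to be written as attributes on `rdf:Description`; this sketch only handles the element form shown above.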

@lastzero commented on GitHub (Jul 23, 2023):

An alternative library to take a look at is https://github.com/beevik/etree

XMP-related issues that may depend on this:

- #554
- #1570
- #1873
- #2828