mirror of
https://github.com/Chocobozzz/PeerTube.git
synced 2026-03-02 22:57:11 -05:00
Recommendation algorithm set up #6064
Originally created by @solidheron on GitHub (Apr 10, 2025).
Describe the problem to be solved
tl;dr: I'm proposing the addition of two elements to a video's .json value: Video_description_vector_history and Video_description_vector. The first element records the history of all descriptions submitted for a video, and the second represents what an instance decides to share from that description.
I've been thinking about recommendation algorithms for PeerTube and what that would entail. I'm currently in the research and brainstorming phase. One idea I've come up with is that viewers could run a recommendation algorithm locally on their own machines, using information from the API and possibly the video content itself. For now, I'm calling the description of a video—such as the category it belongs to—a Video_description_vector. This will be used to help shape a user vector, which will guide what videos are recommended. I haven't fully developed the algorithm yet, but I anticipate that in the future there will be a variety of algorithms, both local and instance-based.
All algorithms, however, will need a video descriptor. I'm starting to form my own standard for describing videos, but I recognize that someone else may eventually come up with a better standard. Therefore, I want to suggest adding two .json elements: Video_description_vector and Video_description_vector_history, as shown below.
Video_description_vector is a simple element that an instance can share upon request. Each sub-element represents a different standard for describing a video. In my example, I include my own proposed standard and an imaginary future standard to illustrate how multiple standards could coexist.
Video_description_vector_history records who submitted a categorical recommendation for a video to an instance. That’s why each entry includes the sub-elements "name" and "host" to indicate the source of the submission. The "uuid" identifies which video the recommendation applies to. Of course, an instance can choose whom to trust—or retroactively decide to disregard certain sources.
Below is a made-up example. It’s about the Cleveland Browns, an American football team, doing a fundraiser for charity. The isTrue element lists what categories the video belongs to, while the isFalse element lists what categories it does not belong to—subjectively, of course. Categories like “football” or “sports” are excluded because the Browns are not actually playing football or engaging in sports in the video.
{
  "Video_description_vector": {
    "recommended_standard": {
      "isTrue": ["Browns", "Charity", "Cleveland", "ALS", "fund raiser"],
      "isFalse": ["Sports", "football", "Cincinnati"]
    },
    "future_standard": {
      "doesnt": {
        "subarray": { "example": "no" },
        "exist": ["Sports", "football", "Cincinnati"]
      }
    }
  },
  "Video_description_vector_history": [
    {
      "name": "vidchase",
      "host": "videovortex.tv",
      "submitted_date": "11/15/2020",
      "uuid": "this and everything above can be removed if inside a video.json",
      "recommended_standard": {
        "isTrue": ["Browns", "Charity", "Cleveland"],
        "isFalse": ["Sports", "football", "Cincinnati"]
      }
    },
    {
      "name": "composite",
      "host": "combined.instance",
      "submitted_date": "4/15/2024",
      "uuid": "this and everything above can be removed if inside a video.json",
      "recommended_standard": {
        "isTrue": ["fund raiser", "Charity", "ALS"]
      }
    },
    {
      "name": "Troll",
      "host": "wrong.info",
      "submitted_date": "6/9/0420",
      "uuid": "example",
      "recommended_standard": {
        "isTrue": ["sack", "ballz"]
      }
    },
    {
      "name": "GoodFaith_but_wrong",
      "host": "other.instance",
      "submitted_date": "1/14/2023",
      "uuid": "example",
      "recommended_standard": {
        "isTrue": ["Browns", "NFL", "football"]
      }
    }
  ]
}
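To illustrate how an instance might derive Video_description_vector from Video_description_vector_history while choosing whom to trust, here is a minimal Python sketch. Nothing in it is part of PeerTube: the function name `build_description_vector`, the `trusted_hosts` set, the vote-counting rule, and the lowercasing of tags are all assumptions made for illustration. The key of the standard is passed as a parameter, so whichever spelling the final schema uses can be substituted.

```python
from collections import Counter

def build_description_vector(history, trusted_hosts, standard="recommended_standard"):
    """Merge history entries from trusted hosts into one isTrue/isFalse vector.

    Hypothetical rule: each isTrue mention counts +1 and each isFalse mention
    counts -1; tags are lowercased so "Charity" and "charity" are merged.
    """
    votes = Counter()
    for entry in history:
        if entry.get("host") not in trusted_hosts:
            continue  # the instance disregards sources it does not trust
        labels = entry.get(standard, {})
        for tag in labels.get("isTrue", []):
            votes[tag.lower()] += 1
        for tag in labels.get("isFalse", []):
            votes[tag.lower()] -= 1
    return {
        "isTrue": sorted(t for t, v in votes.items() if v > 0),
        "isFalse": sorted(t for t, v in votes.items() if v < 0),
    }

# Sample data adapted from the example above (shortened).
history = [
    {"name": "vidchase", "host": "videovortex.tv",
     "recommended_standard": {"isTrue": ["Browns", "Charity", "Cleveland"],
                              "isFalse": ["Sports", "football", "Cincinnati"]}},
    {"name": "composite", "host": "combined.instance",
     "recommended_standard": {"isTrue": ["fund raiser", "Charity", "ALS"]}},
    {"name": "Troll", "host": "wrong.info",
     "recommended_standard": {"isTrue": ["sack", "ballz"]}},
]

vector = build_description_vector(history, {"videovortex.tv", "combined.instance"})
print(vector)
```

Because the history is kept separately from the shared vector, removing a host from `trusted_hosts` and rebuilding is enough to retroactively disregard that source.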
Describe the solution you would like
No response
@MadMan247 commented on GitHub (May 7, 2025):
Would this require the JSON file to be added to PeerTube itself, or is this something that could be implemented in a plugin?
@solidheron commented on GitHub (May 7, 2025):
I've thought about it, and it could be done in several ways. A single instance could compute the video recommendation vectors and every PeerTube user could simply send requests to that instance, or every PeerTube instance could have this feature.
In my browser extension I just put the vector inside the JSON entry that exists for each PeerTube video (just keeping it simple).
This is more of a proposal/whitepaper for the basis of an easy-to-run, ethical algorithm for PeerTube and the fediverse.
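As a rough idea of what a locally run algorithm could look like, the Python sketch below builds a user vector from the description vectors of watched videos and scores candidate videos against it. This is not the algorithm from the issue, which is explicitly not developed yet. The function names, the `weight` parameter, and the scoring rule (add the user's weight for matching isTrue tags, subtract it for matching isFalse tags) are hypothetical.

```python
from collections import defaultdict

def update_user_vector(user_vector, video_vector, weight=1.0):
    """Add the isTrue tags of a watched video to the user's tag weights."""
    for tag in video_vector.get("isTrue", []):
        user_vector[tag.lower()] += weight
    return user_vector

def score_video(user_vector, video_vector):
    """Score a candidate: reward tags the user likes, penalize explicit non-matches."""
    score = sum(user_vector.get(t.lower(), 0.0) for t in video_vector.get("isTrue", []))
    score -= sum(user_vector.get(t.lower(), 0.0) for t in video_vector.get("isFalse", []))
    return score

# The user vector lives on the viewer's machine only.
user = defaultdict(float)
update_user_vector(user, {"isTrue": ["Browns", "Charity"], "isFalse": ["football"]})

candidate_a = {"isTrue": ["Charity", "ALS"]}
candidate_b = {"isTrue": ["Cooking"], "isFalse": ["Charity"]}

print(score_video(user, candidate_a))  # 1.0
print(score_video(user, candidate_b))  # -1.0
```

Since only the per-video vectors are fetched from the instance, the viewing history never has to leave the browser.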
@coding-crying commented on GitHub (Nov 7, 2025):
This tag-based approach seems pretty practical. I've been experimenting with something similar for ActivityPub and your metadata proposal actually sidesteps the hardest part - you don't need to deal with processing video files directly.
What I've been trying is converting the tag vectors to text ("This video is about Browns, Charity, Cleveland...") and running them through a small embedding model (all-MiniLM-L6-v2, ~80MB). Combined with some per-user personalization LoRAs and basic collaborative filtering, it might work decently for recommendations without needing much infrastructure. At least in theory - I haven't tested it at any real scale yet.
Main unknowns are how well the tag-based approach captures enough signal compared to actual video content, and whether the collaborative filtering works across small instance populations. Cold start is definitely an issue too. Your vector history with source tracking could help with tag quality though.
If you want to experiment with it, I've got some prototype code for Mastodon that could probably be adapted. No idea if it'll work well for video recommendations specifically, but might be worth testing.
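The tag-to-text step described in this comment can be sketched in Python as follows. This is not the prototype code mentioned above. `tags_to_text`, `toy_embed`, and `cosine` are hypothetical names, and `toy_embed` is a bag-of-words stand-in used only so the example runs without dependencies. In a real experiment it would presumably be replaced by an embedding model such as all-MiniLM-L6-v2, for instance through the sentence-transformers `SentenceTransformer.encode` method.

```python
import math
from collections import Counter

def tags_to_text(vector):
    """Turn the isTrue tags of a description vector into one sentence."""
    tags = vector.get("isTrue", [])
    return "This video is about " + ", ".join(tags) + "." if tags else ""

def toy_embed(text):
    """Stand-in embedder: word counts. NOT a real embedding model."""
    words = (w.strip(".,").lower() for w in text.split())
    return Counter(w for w in words if w)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

browns = toy_embed(tags_to_text({"isTrue": ["Browns", "Charity", "Cleveland"]}))
als = toy_embed(tags_to_text({"isTrue": ["Charity", "ALS"]}))
pasta = toy_embed(tags_to_text({"isTrue": ["Cooking", "Pasta"]}))

# The charity video should be closer to the Browns fundraiser than the cooking one.
print(cosine(browns, als) > cosine(browns, pasta))  # True
```

With the toy embedder, the shared sentence template ("This video is about") inflates every similarity, so only the relative ordering is meaningful here.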