mirror of
https://github.com/qbittorrent/qBittorrent.git
synced 2026-03-02 22:57:32 -05:00
4.2.5 overwrites files if file names are the same #10444
Originally created by @ned-martin on GitHub (May 15, 2020).
qBittorrent version and Operating System
4.2.5x64 / Windows 10 (10.0.1xxxx)
What is the problem
When "Create subfolder for torrents with multiple files" is enabled, and more than one torrent of the same name, and/or containing one or more files of the same name, begins to download, one overwrites the other, resulting in lost data and the torrent re-downloading again (which then repeats the process, so the torrents would likely never complete).
What is the expected behavior
The automatically-created subfolder should use common collision avoidance naming rules, for example, Folder, Folder (1), Folder (2)
Alternatively, if that's somehow impossible, a much worse solution would be that any new torrent could go into an error-state when it tries to start by detecting a name collision with an existing file and allow the user to choose to rectify the problem in some manner, by renaming the torrent or file/s for example.
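A minimal sketch of the collision-avoidance naming described above, assuming the client already knows which names are taken in the save location. The function name `unique_name` and the `taken` set are hypothetical and not part of qBittorrent.

```python
from pathlib import PurePath


def unique_name(name: str, taken: set[str], is_file: bool = False) -> str:
    """Return `name` if it is free, otherwise 'name (n)' for the smallest free n.

    For files the counter goes before the extension: 'File (1).zip'.
    Comparison is case-insensitive, as on default Windows file systems.
    """
    taken_folded = {t.casefold() for t in taken}
    if name.casefold() not in taken_folded:
        return name
    # Folder names may contain dots ("Ubuntu 20.04"), so only split files.
    if is_file:
        stem, suffix = PurePath(name).stem, PurePath(name).suffix
    else:
        stem, suffix = name, ""
    n = 1
    while f"{stem} ({n}){suffix}".casefold() in taken_folded:
        n += 1
    return f"{stem} ({n}){suffix}"
```

Calling `unique_name("Folder", {"Folder", "Folder (1)"})` would give `Folder (2)`; the original torrent keeps its name and only the newcomer is renamed.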
Steps to reproduce
(I don't know if all of these options are required, but this is what I have set)
Enable "create subfolder for torrents with multiple files"
Enable "Append .!qB extension to incomplete files"
Default torrent management mode: Automatic
Default save path:
Keep incomplete torrents in:
I'm aware this issue has been reported before and is not specific to 4.2.5, but it seems to still persist and is a fundamental issue preventing the normal use of the program, that I can only assume would be trivial to fix.
@glassez commented on GitHub (May 15, 2020):
"Create subfolder" does not create anything other than what is contained in the torrent.
Isn't the user telling qBittorrent to save torrent to a specific location? Why should someone else be responsible for this action?
@glassez commented on GitHub (May 15, 2020):
I wonder what the probability is that different torrents have content with the same names...
@ned-martin commented on GitHub (May 15, 2020):
I don't understand what the confusion is here. No, the user is not telling qBittorrent to save to a specific location, as you said yourself, the location is stored in the torrent (or is the name of the torrent). In fact, in my experience, until the torrent starts downloading, the user can't even rename the location - because until the torrent gets the data of the files in it, there's no way to know.
No program should overwrite user data. Surely that's just a given? Would you be ok with any other program doing this? Let's say you're using your internet browser, and you decide to download a picture. It doesn't prompt you for any file collision, and instead just overwrites some other important photo you already had saved. Would you be ok with that? No, of course not. You'd be very angry. Well, I'm not ok with qBittorrent doing it either.
Very high. It happens all the time. Personally, I had it happen to three different torrents yesterday, losing several GB of data which had taken many hours to download, which is why I posted this.
Here's an example that demonstrates how this happens, and why this is a common problem:
There appears to be two ways this goes wrong:
Note: all the torrents I had problems with yesterday seemed to contain folders themselves, so the sub-folder that was created was contained within the torrent - it was not auto-created from the torrent name. I am unsure if this is a relevant factor. The problem is the same though - qBittorrent should not overwrite files or folders - it should auto-rename them, or a worse solution, put that torrent into an error state and pause it until the user can come fix the problem.
@FranciscoPombal commented on GitHub (May 16, 2020):
Well, I guess qBittorrent should not silently overwrite data like this without at least some kind of warning just because the user changed an option that does not really imply that such a destructive action could happen.
@glassez commented on GitHub (May 17, 2020):
What "option" do you talk about?
The problem is the user add multiple torrents with same-name files/folders to be stored in the same location.
No, I'm not OK with it. But it's incorrect example.
Pointing torrent to the location with existing files is regular case. There can be files previously downloaded (fully or partially) by the same or other torrent or even obtained from other sources but matching (fully or partially) the files in the torrent so that the user may want to use them...
@FranciscoPombal commented on GitHub (May 17, 2020):
@glassez
Yes, allowing existing files to be used supports use cases such as "cross-seeding" for example. But why not ask the user first? That way it's still possible to use existing files for "cross-seeding" and also prevent overwriting when that is not desirable.
@ned-martin commented on GitHub (May 18, 2020):
If you want to allow existing files to be "used" by a torrent, then you should add some mechanism to do so - I'd suggest a menu to "import data to existing torrent" or similar.
Either way, this is not really the same situation.
In my situation, all torrents download to one location. All completed torrents are then moved to another location. That's pretty much the only options the UI gives as far as I can tell, but it seems to also be the only logical way to do things.
Here is a simple scenario to indicate why this is a big problem, which I believe is quite common:
@nokti commented on GitHub (May 18, 2020):
This is a situation that has happened to me several times, so it's not an uncommon occurrence. If my memory is correct, µTorrent 2.2.1 solves that problem by automatically adding a number to identical folders, like Folder (1) (or was it Folder.1). It would be very useful if qBit could do the same.
@conkerts commented on GitHub (Jul 19, 2020):
Actually weird, that I searched for same filenames, but haven't found this bug report in the search.
I'm actually surprised that there are several reports of that, especially in the last few months.
Yes, looks like it is indeed more common than I expected.
At least some warning for the user would be nice and useful
I mean there is no real problem if the files are supposed to be the same. But then only let one copy be actively downloaded.
In case with same file name and different checksum, just refuse to proceed or do it the utorrent way, yes, that sounds like an easy solution, too
@ned-martin commented on GitHub (Jul 19, 2020):
No, this would not be good. I often want to download the same file multiple times simultaneously. Just act like every other software on earth, and let the user choose what they want to download, without automatically overwriting anything else!
Two examples:
1: Downloading 2 same files on purpose:
2: Downloading 2 same files by accident
@conkerts commented on GitHub (Jul 20, 2020):
Yeah, well I agree with you on the most parts, esp. your previously mentioned scenario happened exactly to me.
The point is what I meant:
I have no clue about the libtorrent library behind it. But I never encountered something like cross seeding torrent files, etc.
I don't think that libtorrent is able to download and write to the same file from multiple torrent files /swarms. But that would be actually a super cool thing, now that I think about it ...
That would sort of require the torrent to sync / rescan the file every time a new part got written or something like that ... whatever, I think that would actually belong somewhere into libtorrent development.
The easiest solution is to just create a new folder and put the next torrent into the new folder, to avoid any conflicts. Then yes, the same file could be downloaded at the same time, but you're essentially downloading it twice, causing double the traffic in total. 🤷♂️
@thalieht commented on GitHub (Sep 13, 2020):
Duplicate of #127
@Det87 commented on GitHub (Sep 20, 2020):
What's so complicated about this? Why doesn't qBittorrent ask by default, if it finds a file with the exact same name whether to (1) recheck it, (2) leave it alone, and do a standard "download (1).exe" or (3) overwrite it?
Okay, sir, but what does this matter? qBittorrent is not intelligent enough to do what file managers do when selecting a file and pressing Ctrl+C and Ctrl+V, or what browsers do when redownloading the same file?
@ned-martin commented on GitHub (Sep 20, 2020):
I agree - nothing is complicated about this. It should have been fixed years ago.
However it is a bit more complicated than that, because lots of qBittorrent use is semi-automated (I don't want to wake up in the morning and find that nothing downloaded for the last 12 hours because it was sitting there waiting for me to click on a prompt, or that some third-party script or service has failed to work) these choices need to have a way of picking a default option - in my mind the simplest way to do this is to have the prompt auto-select a default option after x seconds, and have options within settings to select the default option (default: rename), whether to: prompt until user input, not prompt and do the default, or prompt but auto-select the default after X seconds (default).
In my opinion, during normal use, nothing should ever overwrite anything, ever, under any circumstance (and it should always put multiple files in a folder, and then rename the folder upon conflict (folder (1), folder (2), etc) rather than renaming the files themselves, to prevent having a situation where only some files are renamed), so I don't see the need for a prompt - just auto rename by default. I believe this is inline with how most people would expect most programs to work. I think a special case should then be created (i.e. a new menu option) to handle the situation where a user intentionally wants to connect existing files that don't have a torrent associated, to a new torrent (i.e. recheck it) - but it seems that other people must use qBittorrent differently to me so the above auto-prompt method would provide greater flexibility for more users.
The existing hash check when re-adding an identical torrent that's already been added would remain the same.
Other ways of handling this (which in my opinion are worse) could be to queue the prompts in some manner, so they don't block other downloads and can be dealt with later, or auto-handle conflicts (i.e. rename or store in a special conflicts sub-folder using the hash) but keep a record of them (along with the original name info etc.) and queue those so they could be resolved later, or even the super simple fix - do everything as normal but instead of ever overwriting anything, put anything that conflicts into its own unique hash folder and log a warning for manual intervention later.
Keep in mind that you don't always know what the file/folder names are when you start a torrent - for example, when adding a magnet link without any seeds.
Note also that if moving to a new location upon completion is enabled the final move location would need to be factored into the checks when the download initially starts, which gets more complicated if the user changes the eventual save location (such as changing the category) halfway through downloading...
Remember that the user may not know what the filenames are so they might not have meant to do that. I often download magnet links that don't immediately find any seeds, so I have no idea what their filename/s are, and I add multiple magnet links for similar items, as that increases the chances of one of them finding seeds. Then when I am asleep, they may find some seeds, and only then do they get their filenames. If two then happen to have the same file or folder name (which is common), then they will overwrite each other.
@Det87 commented on GitHub (Oct 3, 2020):
K, but I feel like your posts are too long for ppl to get into this.
@Vukodlak commented on GitHub (Oct 21, 2020):
Would it be possible to add an option to create a folder for each different torrent and to add the first 8 characters of the torrent Hash to either the start or the end of the folder's name (user preference)?
For example, let's say you want to download a torrent for the movie "Scarlet Street (1945)". Its Hash is "12345678... ", and it has a folder with the movie file and some subtitles in it:
Folder: "Scarlet Street (1945)"
Inside the folder:
"Scarlet Street (1945).mp4"
"Scarlet Street (1945).srt"
qbittorrent could then create either of these two folders names, depending of the user's preference:
"12345678-Scarlet Street (1945)" or
"Scarlet Street (1945)-12345678"
Now, if the original torrent only contains one file, like "Scarlet Street (1945).mkv", with the Hash "ABCDEFGHI...", then it would have to create a folder for this one file inside the download folder (maybe trim down very long file names), like this:
"ABCDEFGHI-Scarlet Street (1945)" or
"Scarlet Street (1945)-ABCDEFGHI"
If a torrent has a very long name, like "I Killed My Lesbian Wife, Hung Her on a Meat hook, and Now I Have A Three Picture Deal at Disney (1993) H264 AAC.mp4", the longer the name the more likely it is to create an error, so I imagine it should be trimmed down to something like the first 8 Hash characters and the first 30 characters of its name, like this:
"ABCDEFGHI-I Killed My Lesbian Wife, Hu" or
"I Killed My Lesbian Wife, Hu-ABCDEFGHI"
I believe utorrent or qbittorrent used to work like this, adding part to the hash to the torrents folder names, for a while. It would be good to have something like that for people who download multiple torrents and have trouble with this.
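A rough sketch of the hash-tagged folder name proposed above. The function `tagged_folder_name` and its parameters are hypothetical; the defaults (8 hash characters, 30 name characters) are taken from the proposal, and the hash position is the user preference it mentions.

```python
def tagged_folder_name(
    name: str,
    info_hash: str,
    *,
    hash_first: bool = True,
    hash_chars: int = 8,
    max_name_chars: int = 30,
) -> str:
    """Build '<hash>-<name>' or '<name>-<hash>' with a trimmed name."""
    if len(info_hash) < hash_chars:
        raise ValueError("info hash is shorter than the requested tag length")
    tag = info_hash[:hash_chars]
    # Trim long names and drop a trailing space left by the cut.
    short = name[:max_name_chars].rstrip()
    return f"{tag}-{short}" if hash_first else f"{short}-{tag}"
```

Because the tag comes from the torrent's own hash, two torrents that share a name still end up in different folders without consulting what is already on disk.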
@Det87 commented on GitHub (Oct 21, 2020):
Cool torrent names.
@pxssy commented on GitHub (Oct 22, 2020):
Is this fixed yet? Why was this not implemented in v4.3.0?
It's a rather easy fix, isn't it? It already does a checksum first when a file with the same name already exists. I mean, if it's the same name and 0% identical, you'd think it's a pretty good sign it's not the same file as in the torrent, and it should at the very least throw an error/inform the user.
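To illustrate the "same name but 0% identical" check, here is a sketch that compares an existing file against a torrent's piece hashes. It assumes a single-file v1 torrent, where each piece is hashed with SHA-1 and pieces start at the beginning of the file; `matching_fraction` is a hypothetical helper, not a qBittorrent or libtorrent function.

```python
import hashlib


def matching_fraction(data: bytes, piece_hashes: list[bytes], piece_length: int) -> float:
    """Return the fraction of pieces whose SHA-1 matches the expected hash."""
    if not piece_hashes:
        return 0.0
    matched = 0
    for i, expected in enumerate(piece_hashes):
        piece = data[i * piece_length:(i + 1) * piece_length]
        # An empty slice means the file on disk is shorter than the torrent.
        if piece and hashlib.sha1(piece).digest() == expected:
            matched += 1
    return matched / len(piece_hashes)
```

A result of 0.0 for a file that already exists is a strong hint that it belongs to something else, which is the point where a warning or a rename could be triggered instead of overwriting.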
@glassez commented on GitHub (Oct 23, 2020):
Well, I have analyzed this Issue. Here are my current findings.
We are not talking about the presence of files on disk, but only about the presence of files with the same names in the processed torrents. So when another torrent is added, we need to check its file names to match the file names of existing torrents. If the name of a file is already occupied, then we must modify it in some way to exclude a match. The only question here is whether we should modify the names of all the torrent files, or just the name of the root folder.
Applying such a processing is not a problem if the torrent you are adding has metadata immediately. But what should we do if you add a torrent using a magnet link?
We have to apply this processing between the moment when the metadata was received and the moment when libtorrent started downloading files. Here again there is a problem of lack of low-level interaction with libtorrent. After all, when we get metadata_received_alert, we are already "out of business".
We can, of course, set the stop_when_ready flag on the torrent being added so that it stops before downloading and we can process the files, but unfortunately, it is triggered before downloading metadata. The only thing that comes to my mind is to track the transition into the downloading_metadata state using the torrent_plugin, and set the stop_when_ready flag at this point. But it seems that we will have to run force_recheck after renaming files, because the torrent will already be checked before it is paused by the stop_when_ready flag.
Of course all this stuff can be implemented in libtorrent itself (but we can't hope for that).
@arvidn, are there any thoughts?
@conkerts commented on GitHub (Oct 28, 2020):
Woa, I had some fear that there would be some issue with the library, but didn't expect it to be that complex.
At this point, I'd even say it's not really qBittorrent's fault/bug, if the libtorrent API lacks those features.
Should we as users file this upstream issue to libtorrent ? (Or is there already one? )
It sounds like this is the most sensible thing to do.
Also then other clients based on libtorrent should have exactly the same problem ?
@glassez commented on GitHub (Oct 30, 2020):
They shouldn't necessarily have the same problem if they implement this feature in application level. I'm thinking of doing it the same way when the prerequisite core changes (#13123) are merged to allow performing some service actions behind the scenes.
@glassez commented on GitHub (Dec 5, 2020):
Well, when it is implemented, what exactly should it handle, each individual file or the top-level items only?
@article10 commented on GitHub (Dec 21, 2020):
"The only question here is whether we should modify the names of all the torrent files, or just the name of the root folder."
The name of the root folder of the newly added torrent should be changed (add 'foldername (1)', (2), etc.) if there is a collision.
In principle, the names of files inside a torrent should not be changed (that would break multi-part rar files etc., file references in .bat files, or even in .exe files to dlls etc).
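A small sketch of the "rename only the root folder" rule, assuming a multi-file torrent whose files all live under one root folder and whose paths use "/" as separator. The helper `rename_root` and the `taken_roots` set are hypothetical.

```python
from pathlib import PurePosixPath


def rename_root(files: list[str], taken_roots: set[str]) -> list[str]:
    """Rename the common root folder on collision; leave inner names untouched."""
    parts = [PurePosixPath(f).parts for f in files]
    roots = {p[0] for p in parts}
    if len(roots) != 1 or any(len(p) < 2 for p in parts):
        raise ValueError("expected a multi-file torrent with a single root folder")
    root = roots.pop()
    new_root, n = root, 1
    while new_root in taken_roots:
        new_root = f"{root} ({n})"
        n += 1
    # Only the first path component changes, so relative references between
    # files (multi-part archives, scripts, DLLs) keep working.
    return [str(PurePosixPath(new_root, *p[1:])) for p in parts]
```

With `taken_roots = {"Pack"}`, the path `Pack/sub/run.bat` would become `Pack (1)/sub/run.bat` while `run.bat` and its siblings keep their names.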
@melicrom commented on GitHub (Feb 1, 2021):
Two parts: My experience and stream-of-consciousness about a fix (which I don't think is as simple as it may look at first glance)
My experience
This has happened to me a couple of times, a few years apart, with certainly different versions. Long enough that I forgot it even happened until the ahh-ha moment. One was today on 4.3.3.
In all cases it was a lazy torrent creator who made multiple torrents of some type of "sequential" or "similar in general name, but different in substance" data while leaving the base torrent Name the same. The torrents were likely created one-after-another and it took me a long time to abstractly understand what was going on just by looking at the interface especially because all had low seeds and the overwrite happened slowly. Compounding confusion as others noted is that after an individual file (or parts?) in the torrent has been downloaded it isn't checked again unless you force it, so completed files and in-progress files were actually restarted or overwritten even though they were still listed 100% and Completed in a different torrent. Also, individual files would sometimes say 0% after re-check, but they would be perfectly openable, readable and not corrupt (downloaded from a different torrent); other times they would not be openable.
Empirically encountered cases where filenames (again, from lazy people) were the same.
Maybe the real problem in my observed cases is that trackers allow you to have a stupid torrent 'name' that doesn't match the title on the tracker or prevent placement of multiple torrents on the same tracker with the same top level 'name'. This would have eliminated all the cases I experienced.
Adding the "Downloaded" column confirmed my suspicions. I re-downloaded parts of all of these at least three times. Because of low seeds, uploader bandwidth, and NAT, completion time is days to weeks with huge variance between completion of each torrent. All 5, same name.

Fix?
Disclaimer: I know absolutely nothing about the code and it sounds like some of these may not be possible (yet or ever). Suppositions and possibly flawed logic based on my current understanding of how torrents work. Just trying to help.
Agree this is generally accepted action from user perspective for the 'same name' situation (ex: same file downloaded multiple times in a browser) But I think there are some complicated corner cases. Sometimes the person may just be reseeding or it may be a partial or less common, a partially corrupted torrent where some files no longer open?
Consider these:
Multi-torrent-add (software has more info to make decisions)
maybe append first or last part of the torrent hash as suggested by @Vukodlak previously ( could just be 4 or 5 digits) to second folder name after determining there would be file collisions between present torrents and just-downloaded torrent by hash comparison? Would also allow some checking on add to see if a folder matches with hash appended and with acceptably low collision considering this is already a corner case? Does this also cover the centos case @ned-martin mentioned above or could those have the same hash -> could be the same exact torrent and a user changes just the tracker? Also, could be intentional where you want to use the same underlying data as mentioned in thread. (not a case I ever had, but not sure if it's common)
File(s) exist (but don't belong to another torrent, not enough info to make decision)
maybe re-check all files by hash and warn user that files exist with the same name, but hash data doesn't match the torrent and prompt to overwrite with a "This data is not from this torrent, are you sure you want to do this..." ? Maybe tell the user how much doesn't match? I don't know the mechanics of this but if there are a large number of file-pieces present and populated with data that don't match the hash, this is the only likely possible reason besides large-scale-computer-destroying-corruption?"
Also thank you @glassez for your work on the project, while doing search-diligence to find this issue I saw your name a lot.
@sakkamade commented on GitHub (Jun 7, 2021):
Duplicate of https://github.com/qbittorrent/qBittorrent/issues/127.
@milahu commented on GitHub (Aug 9, 2021):
"should" ... hehe
a simple solution:
download every torrent to (downloads)/(name)/(hash)/
putting the hash second allows to find the torrent by its name
no need for renaming
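a sketch of that layout, assuming the torrent name is a single path component and the hash is the hex info hash. the function `save_path` is hypothetical.

```python
from pathlib import PurePosixPath


def save_path(downloads: str, name: str, info_hash: str) -> PurePosixPath:
    """Return (downloads)/(name)/(hash) for one torrent."""
    if "/" in name or "\\" in name or name in ("", ".", ".."):
        raise ValueError("torrent name must be a single path component")
    # Lower-case the hash so the same torrent always maps to the same folder.
    return PurePosixPath(downloads) / name / info_hash.lower()
```

two torrents named "Folder" then share the parent (downloads)/Folder/ but never the same leaf directory, so neither can overwrite the other.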
@lonecrane commented on GitHub (Oct 18, 2021):
To me, the problem is different and doesn't involve simultaneous downloading. It is that the so-called "cross-seeding" scenario goes wrong for some particular torrents. Actually I add the torrents one by one, ensuring the previously added ones have been finished and STOPPED. However, they will still report "file missing" after restarting qbittorrent.
@FranciscoPombal commented on GitHub (Oct 18, 2021):
@lonecrane
Currently, cross-seeding only works correctly if the files are exactly the same or if torrent A only has additional files over torrent B, not any modified file.
The underlying cause of the issue you describe is the same as this one, whether you start the download of conflicting torrents simultaneously or one after the other.
@lonecrane commented on GitHub (Oct 18, 2021):
@FranciscoPombal
No, before I restart qbittorrent, every torrent has been rechecked to 100% separately. Actually they point to the same files. However, they will report "file missing" after restart qbittorrent. The procedures are listed below, please note that all the files have been downloaded:
@FranciscoPombal commented on GitHub (Oct 19, 2021):
@lonecrane
What you describe still has to do with this issue.
In the process of completing the second torrent, qBittorrent clobbers (some of) the files from step 1, but does not update the completion state of the first torrent to reflect that, because each torrent, their completion state, and their save location is treated independently per session.
All that qBittorrent knows is that each torrent was completed at some point in the past. The checks it makes when resuming are not enough to detect the clobbering that took place in step 2. The result of this is that some of the pieces you seed will be garbage (those that were clobbered in each torrent).
When you restart, qBittorrent reads the .fastresume files and notices the file corruption in both torrents. Notice this corruption is unpredictable. In fact, I think it's even possible for only one torrent's files to be overwritten, but most often both will be affected in this case.
@lonecrane commented on GitHub (Oct 19, 2021):
@FranciscoPombal
Thanks. Now I agree that the problem I want to report is within the issue here. But I still don't quite understand what "clobbers" means. Could you explain this action?
By the way, I would like to report some strange behavior when cross-seeding (before I restart qbittorrent): qbittorrent keeps appending '.!qb' to some files even though the related option has been unchecked.
@EdwinKM commented on GitHub (Nov 28, 2022):
Can someone explain why the solution of @milahu seems/is so difficult to implement?
(incomplete)/(name)/(hash)/
Or just create a unique subfolder for each torrent, based on the torrent's hash or an incremental counter
(incomplete)/(counter)
When finished it can move the download folder to the "finished" folder with the "correct" name (based on the torrent's filename or content or user input). A collision at this point is easily fixable by appending an incremental number.
The migration for current users with ongoing downloads can cause some challenges. The program has to be backwards compatible.
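A sketch of the "unique staging folder, rename on completion" idea above, assuming each torrent downloads into its own folder under (incomplete) and is moved once finished. `finalize` is a hypothetical helper; the check-then-move step is not atomic, so a real client would need to guard against two torrents finishing at the same moment.

```python
import shutil
from pathlib import Path


def finalize(staging_dir: Path, finished_root: Path, name: str) -> Path:
    """Move a finished download to finished_root/name, adding ' (n)' on collision."""
    finished_root.mkdir(parents=True, exist_ok=True)
    target = finished_root / name
    n = 1
    while target.exists():
        target = finished_root / f"{name} ({n})"
        n += 1
    # The target does not exist, so the staging folder is renamed or moved
    # as a whole instead of being merged into an existing folder.
    shutil.move(str(staging_dir), str(target))
    return target
```

Since nothing is ever written into a shared folder while downloading, a name clash can only surface at the final move, where it is resolved by the counter.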
@milahu commented on GitHub (Nov 28, 2022):
implementation is easy, but this bug is so rare that only few people care.
in most cases you can just re-download the lost files
the only challenge is to make this backward-compatible with the old filepath schema.
a simple solution is to introduce a new set of download folders (temp + done),
so you have, for example
"cas" as in content-addressed store
the cas paths must be different from the non-cas paths
now the user can either/or
@EdwinKM commented on GitHub (Dec 5, 2022):
@milahu , i am searching for a workaround. Both transmission and qbittorrent have the same issues.
Lets say my "complete downloads" path is "/host/complete". This is set automatically for each new download.
If i add 2 torrents and i see that they will conflict. I change for the conflicting item to "/host/complete/download_01". Now the data is moved from "incomplete" to "complete".
For now i solved this by creating a third folder "incomplete_custom". If i need to rename afterwards i change "/host/complete" to "/host/incomplete_custom/myfoldername". It should by default create a subfolder per torrent (with a unique name) and it should never move the file (if the user is changing the destination) to the new location if the download is not finished.
Small update: Exactly my "do not move" is implemented in qbittorrent. This makes it worse for current qbittorrent. I now can not download files without interference and rewriting each other's data. With Transmission i can rename the destination so a (half baked) workaround is actually possible
Created a warning on the truenas forum users.
@Scripter17 commented on GitHub (Jun 12, 2023):
I just put a $20 bounty on this issue using the bountysource link OP posted. Not sure if a bot's going to show up to say that or not, so I'm saying it myself
I run into this issue very often in exhentai (porn site) torrents, especially for galleries that are all the works of a certain artist. At the bare minimum I'd like qbittorrent to mimic the way Windows handles name conflicts (Folder, Folder (1), Folder (2), etc. and File.zip, File (1).zip, File (2).zip for torrents that are just one file). It may be worthwhile to allow the user to define the exact formatting of root folder/file names (using a syntax like strftime or the "run external program when added/finished" feature) so that they can implement other solutions (like using the torrent ID) more easily
@milahu commented on GitHub (Aug 25, 2023):
this problem becomes even more interesting with v2 torrents
with v1 torrents, we have only one hash per torrent
so its either one hash per directory, or one hash per file
with v2 torrents, we have one hash per file
similar to other content-addressed stores like IPFS, git, perkeep, ...
so ideally, we would have a machine-level content-addressed store
where all files are stored by their hash, for example
v2 torrent hashes are sha256 hashes, so 32 bytes for the raw hash, and 64 bytes in base16 = hexdigest (or 43 bytes in base64, or 52 bytes in base32) (base64 is bad for file paths because its case-sensitive)
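the lengths quoted above can be checked with a short sketch. the base32 and base64 counts are for unpadded output, and `encoded_lengths` is a hypothetical helper name.

```python
import base64
import hashlib


def encoded_lengths(data: bytes) -> dict[str, int]:
    """Return the size of a SHA-256 digest in several encodings."""
    digest = hashlib.sha256(data).digest()
    return {
        "raw": len(digest),
        "base16": len(digest.hex()),
        # Strip '=' padding to get the minimal path-friendly form.
        "base32": len(base64.b32encode(digest).rstrip(b"=")),
        "base64": len(base64.urlsafe_b64encode(digest).rstrip(b"=")),
    }
```

the urlsafe variant is used here because the standard base64 alphabet contains "/", which cannot appear inside a file name.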
partitioning hashes by their first two characters (/12/) makes directory-listing cheaper
then the actual files in the download folder are hardlinked (or symlinked) to that store.
collisions of file paths would still have to be handled somehow,
but now its cheaper to copy and rename a directory, because it contains only links to the store
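a sketch of such a store, with loud assumptions: it hashes the whole file content with plain sha256 as a stand-in (a v2 torrent's per-file hash is a merkle root, which is not the same value), it reads the file into memory, and the store and download folder must be on the same filesystem for hardlinks to work. `store_and_link` is a hypothetical name.

```python
import hashlib
import os
from pathlib import Path


def store_and_link(src: Path, store: Path, dest: Path) -> Path:
    """Move src into store/<first two hex chars>/<hex digest> and hardlink it at dest."""
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    blob = store / digest[:2] / digest
    blob.parent.mkdir(parents=True, exist_ok=True)
    if blob.exists():
        src.unlink()  # identical content is already stored once
    else:
        os.replace(src, blob)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Raises FileExistsError if dest is taken: path collisions in the
    # download folder still have to be handled by the caller.
    os.link(blob, dest)
    return blob
```

two torrents that contain the same file then share one blob on disk, and each download folder only holds a link to it.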
other CAS filesystems
ideally, this would integrate with other CAS filesystems (content-addressed stores)
so the store paths could be
or
this reminds me of my rant in nix sha256 is bug not feature. solution: a global /cas filesystem
https://en.wikipedia.org/wiki/Content-addressable_storage
CAS filesystem of git (hash is sha1dc of header + content)
CAS filesystem of "git2" (hash is sha256 of header + content)
https://stackoverflow.com/questions/60087759/git-is-moving-to-new-hashing-algorithm-sha-256-but-why-git-community-settled-on
https://github.com/go-gitea/gitea/issues/13794
create git repo with git init --object-format=sha256
casync uses sha256, but its focus is on network transfers, not filesystem
https://en.wikipedia.org/wiki/Casync
CAS filesystem of perkeep (sha224)
https://github.com/perkeep/perkeep/issues/625
CAS filesystem of bazel https://github.com/bazelbuild/bazel-buildfarm/issues/568
CAS filesystem of IPFS (but IPFS is just a slow version of bittorrent)
block-centric (not file-centric), base32 hashes with non-standard alphabet (?)
(block-centric feels wrong, because this gives deduplication only when appending to files, but fails when prepending or inserting to files... a block-level deduplication should be handled by the actual filesystem like btrfs or XFS or ZFS)
https://docs.ipfs.tech/concepts/content-addressing/#cids-are-not-file-hashes
https://cid.ipfs.tech/#QmY7Yh4UquoXHLPFo2XbhXkhBvFoPwmQUSa92pxnxjQuPU
https://stackoverflow.com/questions/1903416/do-any-common-os-file-systems-use-hashes-to-avoid-storing-the-same-content-data
https://en.wikipedia.org/wiki/Single-instance_storage - Single-instance storage (SIS) is a system's ability to take multiple copies of content and replace them by a single shared copy. It is a means to eliminate data duplication and to increase efficiency.
https://en.wikipedia.org/wiki/Data_deduplication
https://github.com/google/casfs - Content-addressable storage, implemented over pyfilesystem2 (python)
default sharding: depth=2 width=2
https://github.com/131/casfs - local content-addressable file system (javascript)
https://github.com/andyleap/casfs - ? (go)
TODO: write a proof-of-concept bittorrent client in python, which can connect to multiple CAS filesystems (on multiple hard drives)
@makeasnek commented on GitHub (Aug 25, 2023):
Please do not use bountysource. Many devs have had trouble getting paid there. You can check out this lemmy community as an alternative https://lemmy.ml/c/bugbounties
For statements from devs who have been unable to cash out from bountysource see:
https://github.com/bountysource/core/issues
@milahu commented on GitHub (Aug 25, 2023):
or just pay the devs directly, instead of replacing one scam with another scam
@makeasnek commented on GitHub (Aug 25, 2023):
That's literally what the linked site is. It's a place for OSS projects to post bounties. The projects pay the devs who solve the bugs in those bounties. Like BountySource, but without the middleman. I started this lemmy community because I previously posted a lot of bounties on BountySource for a non-profit I worked for, until BountySource decided to just stop responding to withdrawal requests.
@Scripter17 commented on GitHub (Aug 25, 2023):
If this issue gets resolved and bountysource doesn't pay out, I'll just give the money directly via paypal or something
@arvidn commented on GitHub (Aug 26, 2023):
it would be possible to add a flag to libtorrent making it an error if any file already exists. I'm hesitant to do so though, because it's a pretty simple check for a client to do.
However, a more comprehensive check would be to see if a new torrent has any file that clashes with any existing torrent's potential file. That would require a more interesting data structure to be efficient. But it could also be made on the client side.
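A rough client-side sketch of that more comprehensive check, in Python. Nothing here is libtorrent or qBittorrent API: `PathIndex`, `add` and `clashes` are hypothetical names, and a flat dictionary stands in for the "more interesting data structure". It indexes every path a torrent may eventually write (including files that do not exist yet) and reports which of a new torrent's files are already claimed by another torrent:

```python
from pathlib import PurePosixPath


class PathIndex:
    """Maps every file path a torrent may write to the torrent that owns it."""

    def __init__(self):
        self._owner = {}  # normalized absolute path -> torrent id

    @staticmethod
    def _key(save_path, rel_path):
        # casefold() is an assumption: it approximates case-insensitive filesystems
        return str(PurePosixPath(save_path, rel_path)).casefold()

    def clashes(self, torrent_id, save_path, files):
        """Return {relative path: owning torrent id} for files owned by another torrent."""
        found = {}
        for rel in files:
            owner = self._owner.get(self._key(save_path, rel))
            if owner is not None and owner != torrent_id:
                found[rel] = owner
        return found

    def add(self, torrent_id, save_path, files):
        """Register a torrent's potential files; the first owner of a path keeps it."""
        for rel in files:
            self._owner.setdefault(self._key(save_path, rel), torrent_id)
```

A client could call `clashes()` before `add()` when a torrent is added or its metadata arrives, and then warn, rename, or merge depending on the user's setting. This sketch does not catch a file in one torrent clashing with a directory of the same name in another.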
@milahu commented on GitHub (Aug 26, 2023):
that would be a "merge torrents by default" behavior, which can be unwanted
@Scripter17 commented on GitHub (Aug 26, 2023):
"Merge by default" can also be very wanted, as is often the case when people on exhentai upload torrents that are actual well-structured folders that all use the same format instead of lazily slapped together zip files. When there's updates you just download the new files and both torrents keep working
It's very important that the user (me) is able to choose that behaviour
@arvidn commented on GitHub (Aug 26, 2023):
"merge by default" is the current (default) behavior. My reading of this ticket is that users would like "warn before overwriting" being the default, which is essentially the opposite
@milahu commented on GitHub (Aug 26, 2023):
early early draft: add-option-custom-download-path-format
this would be simpler than "warn before overwriting" because it would "split by default" to avoid path collisions when (hash) is part of the format string, so the download process is not blocked, and we don't need more temporary files for "unnamed" files
one downside of my fix is that merging is more complex, because the file paths are more complex, but ideally, we let the user choose an external program (python script, or whatever) to organize the files, for example with symlinks, or by telling qbittorrent to move the files, or by moving files and telling qbittorrent to use the new location
@NSQY commented on GitHub (Aug 27, 2023):
Yes, the client should never overwrite files by default. I have lost irrecoverable data due to accidentally overwriting existing files, and ultimately ended up killing the swarm as I was the last seeder.
@milahu commented on GitHub (Sep 4, 2023):
done: https://github.com/milahu/cas_torrent
this stores files by the sha256 file hash = sha256 store
bittorrent v2 root hashes are symlinks to the sha256 store = bt2r store
human-readable files are symlinks to the sha256 store = bt2 store
human-readable torrent names are directories with symlinks to files in the bt2 store = las store
directories are merged by default.
files are renamed only when they have different content.
using symlinks allows to "unmerge" directories with readlink.
las symlinks target the bt2 store, so its visible "to what torrent does this file belong?"
(this would not be visible when symlinking directly to the sha256 store)
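To illustrate the symlink idea, here is a heavily simplified Python sketch. It is not the code from cas_torrent: it collapses the layers into just two directories (`sha256/` for content and `names/` for human-readable symlinks), and `link_into_store` is a hypothetical name. It assumes a filesystem with symlink support, that the source file is on the same filesystem as the store, and that files are small enough to hash in memory:

```python
import hashlib
import os
from pathlib import Path


def link_into_store(root, torrent_name, rel_path, src):
    """Move src into the sha256 store and expose it as names/<torrent>/<rel_path>."""
    root = Path(root).resolve()
    digest = hashlib.sha256(Path(src).read_bytes()).hexdigest()
    blob = root / "sha256" / digest
    blob.parent.mkdir(parents=True, exist_ok=True)
    if blob.exists():
        os.remove(src)         # identical content is already stored
    else:
        os.replace(src, blob)  # assumes src is on the same filesystem

    link = root / "names" / torrent_name / rel_path
    link.parent.mkdir(parents=True, exist_ok=True)
    if os.path.lexists(link):
        if os.readlink(link) == str(blob):
            return link        # same name, same content: directories merge
        # same name, different content: rename instead of overwriting
        link = link.with_name(f"{link.stem}.{digest[:8]}{link.suffix}")
    if not os.path.lexists(link):
        os.symlink(blob, link)
    return link
```

Running `os.readlink()` on any entry under `names/` shows which stored content it points to, which is what makes "unmerging" possible.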
@excelgit commented on GitHub (Nov 30, 2023):
Using symlinks or hardlinks would introduce another method to lose data. For computers those things are easy to handle (although data loss will appear if humans implement or interpret it wrong, or can't see that the method is used, or a file manager doesn't handle it correctly). For humans they are not. They will introduce new methods to lose data.
I think qBit should do this:
#1: The root name of the download (either file or folder) should be renamed automatically if a new torrent/magnet sees that the root name is already in the incompletes folder. It could be renamed by adding the last 3 characters of the hash code. Not much more than 3; we don't want very long pathnames, which gives new issues.
#2: When the download is ready, and automoved to the completed folder, again, there should be a check if the root name already exists, and if so, it should be - again - autorenamed.
Another method would be: autoinsert the starting time in the root name of new downloads, preferably only if it finds the root name is already in use.
So: Linux.iso would be:
Linux.iso + Linux.231130_185622(1).iso + Linux.231130_185622(2).iso if you start 3 torrents with the same name the same time.
Or
Linux.iso + Linux.231130_185622.iso + Linux.231130_185658.iso if not started at the same time.
Another method would be: if the root name already exists: ask the user if he wants to insert (a part of) the hash or timestamp in the root name.
Best would be: offer an option, so that the user can select 1 of the 4 methods: the current (wrong) method, or 1 of the 3 above. Always let the user have more control. That is always better.
Anyway: the default should always be that a new torrent/magnet never overwrites anything, either in the incompletes or in the completed folder. That's about always rule#1, for any software!
BTW: not only the files themselves should be autorenamed, but also the name that qBittorrent shows in the GUI for what it is downloading. Otherwise, again, mistakes will be made and data loss will occur.
This reminds me of another problem: the name of the torrent/magnet is often other than the root name of the file/folder of what it is downloading. But that is an issue for another thread.
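The hash-suffix and timestamp proposals in this comment could look roughly like the following Python sketch. `unique_root_name` and its parameters are hypothetical, not qBittorrent code. The name is changed only when it is already taken, and a counter is added if the tagged name is taken as well (so the numbering differs slightly from the Linux.iso example above):

```python
from datetime import datetime
from pathlib import Path


def unique_root_name(folder, name, info_hash, started=None, mode="hash"):
    """Return name unchanged if it is free in folder, else a tagged variant."""
    folder = Path(folder)
    if not (folder / name).exists():
        return name
    p = Path(name)
    if mode == "hash":
        tag = info_hash[-3:]  # last 3 characters of the torrent hash
    else:
        tag = (started or datetime.now()).strftime("%y%m%d_%H%M%S")
    candidate = f"{p.stem}.{tag}{p.suffix}"
    n = 0
    while (folder / candidate).exists():
        n += 1
        candidate = f"{p.stem}.{tag}({n}){p.suffix}"
    return candidate
```

The same check would run twice: once against the incompletes folder when the torrent starts, and once against the completed folder before the automatic move. Folder names that contain a dot would get the tag inserted before the last dot, which a real implementation would need to handle.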
@milahu commented on GitHub (Jun 9, 2024):
workaround in qbittorrent-move-to-cas.py
move all finished torrents to
~/cas/btih/{btih}
@dftf-stu commented on GitHub (Dec 4, 2024):
Yes, sometimes users might want to download something rare or less-common, like an obscure album.
You might add a number of different magnet 🧲 links, all of which share the same name. For example:
🕑The Examples - Example Album (1983) 320kbps MP3 ... gathering metadata
🕑The Examples - Example Album (1983) 320kbps MP3 ... gathering metadata
🕑The Examples - Example Album (1983) FLAC ... gathering metadata
🕑The Examples - Example Album (1983) FLAC ... gathering metadata
Two of them might then start downloading at the same time, into the same destination folder:
🔽The Examples - Example Album (1983) 320kbps MP3 ... downloading
🔽The Examples - Example Album (1983) 320kbps MP3 ... downloading
🕑The Examples - Example Album (1983) FLAC ... gathering metadata
🕑The Examples - Example Album (1983) FLAC ... gathering metadata
And you end-up with two torrents endlessly overwriting each other's files, then when one completes
and it does a "verify", you then get an error as the file hashes don't match.
A simple way to avoid this might be to have an option "add part of torrent hash to end of folder name",
so you'd get something like:
"The Examples - Example Album (1983) 320kbps MP3 c71d8a"
"The Examples - Example Album (1983) 320kbps MP3 51cad2"
It's amazing that no torrent app I've used does this: qBittorrent doesn't; Transmission doesn't;
Flud (Android client) also doesn't. 🤷🏻‍♂️
@graphixillusion commented on GitHub (Feb 3, 2025):
I don't know if this is related, but I'm very often getting corrupted files caused by this issue. The funny thing is that if I do a force check on a torrent which has some corrupted files in it, the test result will be OK. I think that a torrent which has corrupted files inside it shouldn't be flagged as OK.
@vertigo220 commented on GitHub (Feb 11, 2025):
Just ran into this issue when I had two different torrents downloading to the same folder and with the same filename (actually the same file, but two separate torrents/hashes) and qb wasn't intelligent enough to see this and say or do anything about it, and instead seems to have tried to just download and write both torrents to the same file. This resulted in a file almost twice the size as either one (though not quite 2x) that was corrupt. I had to stop one and redownload one to get the uncorrupted file. I was amazed and disappointed to find there's no built in protection against something like this, given it's basically file management 101 and it's crazy an issue like this would exist in any software these days, much less one like this. And to make matters worse, this issue, which results in data corruption/loss, has been open almost 5 years now...
@xavier2k6 commented on GitHub (Feb 11, 2025):
@vertigo220 Can you provide those 2 torrents?
@ned-martin commented on GitHub (Feb 11, 2025):
Running version 5.0.2, the problem still isn't fixed?
I just checked on my torrents and found many had overwritten each other, and others were reporting "missing files" because one of the torrents had completed and moved the files to the folder I have set for completed torrents to move to, while the other torrents that were saving to the same filenames now can't find their files...
I have just spent some time manually renaming the folders within torrents that I am downloading by selecting each one in the "Content" pane at the bottom, hitting F2 to rename the folder, and appending sequential numbers to the ends of them to prevent them from overwriting each other. This seems really, really absurd. Nothing else I can think of does this. Imagine if your browser just saved all its downloaded files over the top of each other - that'd be insane right? Yeah well... that's exactly what qBittorrent does and somehow people are finding this confusing or not understanding just how stupid it is?
@ned-martin commented on GitHub (Feb 11, 2025):
Thanks for going to the effort to explain this so clearly. Apparently lots of people don't understand how this is an issue?
While that would work, it seems overly complex and abnormal. This situation is handled in the same way by basically every other software in the world that saves files - they append an incrementing number to the end of the folder/file. However, it seems to me like it would be trivial to do both - add an option in settings for people to pick what they want: "On folder/filename collision: Append number (1, 2, etc) | Append unique hash (51cad2)". Ideally you would only append something to duplicates - no point renaming everything if you don't need to. Obviously lots of people don't have this issue so no point changing their workflow.
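As a sketch of how such a setting might behave (Python; `resolve_collision` and the `policy` values are hypothetical, not an existing qBittorrent option), only names that actually collide are changed:

```python
def resolve_collision(existing, name, info_hash, policy="number"):
    """Pick a folder name that is not in existing, changing it only on collision."""
    if name not in existing:
        return name  # no collision: leave the user's workflow untouched
    base = name
    if policy == "hash":
        base = f"{name} {info_hash[:6]}"  # e.g. "Folder c71d8a"
        if base not in existing:
            return base
    # "number" policy, or a hash-tagged name that is itself taken
    n = 1
    while f"{base} ({n})" in existing:
        n += 1
    return f"{base} ({n})"
```

With `existing = {"Folder", "Folder (1)"}` the number policy yields `Folder (2)`, and the hash policy yields `Folder` followed by the first six characters of the torrent hash.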
@graphixillusion commented on GitHub (Feb 11, 2025):
Another common scenario is when the content of the torrent has ROMs inside (like MAME for example). It's very common that the new ROMs are added inside zip files which have the same name as in the previous version, but the content is different. So in this case the corruption occurs very often, sadly...
@jargon4220 commented on GitHub (Aug 18, 2025):
Conflicts like this often occur with torrents that use certain naming conventions.
If different torrents try to handle files with the same name, the later one must fail.
That is normal behavior for any sane file handling system not just torrents.
It is surprising that widely used software silently overwrites data and causes corruption.
Because of this manual checking and fixing is always needed.
@nick-gh567 commented on GitHub (Sep 17, 2025):
Version 5.1.0, Windows 10, the issue is still present.
@qBittUser commented on GitHub (Nov 17, 2025):
The latest release is v5.1.3. If the default had been changed, or a new option/setting, error/warning, or automatic rename had been added in a new release, then ideally it would be mentioned on the news/changelog page.
If a pull request to even partially resolve this were made, then most likely this issue report would have it mentioned.
Long ago qBit didn't allow silently overwriting, but it was quickly reverted back and no manual control was ever added to choose default behavior.
With certain scenarios overwriting is default behavior of libtorrent:
@zwei7 commented on GitHub (Jan 25, 2026):
Version 5.1.0, Windows 10
Got the same issue,
This happens a lot with RSS feeds and torrent sites that allow users to name their torrent.
2+ users may name their torrent offering with the same name (1 file compressed into 1 rar file).
A few days later one of the 2+ torrents will get removed from the site, which is why you set up an RSS feed in the first place.
But I know I got both files automatically downloaded, but oh wait, qbittorrent overwrote one of the .rar files.
Now I don't have any way to get the old one since it was deleted from the torrent site.