#7990 - Torrent rechecks to 100% despite appended data - files are not truncated - starred/qBittorrent

deekerman commented

2026-02-21 19:16:55 -05:00

Owner

Originally created by @Rne163 on GitHub (Oct 29, 2018).

Please provide the following information

qBittorrent version and Operating System

qBittorrent v4.1.1 Windows 10 64-bit (10.0.1xxxx)

What is the problem

When a small file (around 1KB in size) is modified, the torrent rechecks to 100%, with the modified file not being replaced. Deleting the file correctly downloads the original file again

What is the expected behavior

The torrent should not recheck to 100% and attempt to download and replace the modified file.

Steps to reproduce

Download a torrent containing a small text file or otherwise small file.
Add some random characters to it, changing its checksum.
Force recheck the torrent.

Extra info(if any)

I specifically noticed this issue on .sfv files. I cannot reproduce it on larger files.

Originally created by @Rne163 on GitHub (Oct 29, 2018). **Please provide the following information** ### qBittorrent version and Operating System qBittorrent v4.1.1 Windows 10 64-bit (10.0.1xxxx) ### What is the problem When a small file (around 1KB in size) is modified, the torrent rechecks to 100%, with the modified file not being replaced. Deleting the file correctly downloads the original file again ### What is the expected behavior The torrent should not recheck to 100% and attempt to download and replace the modified file. ### Steps to reproduce 1. Download a torrent containing a small text file or otherwise small file. 2. Add some random characters to it, changing its checksum. 3. Force recheck the torrent. ### Extra info(if any) I specifically noticed this issue on .sfv files. I cannot reproduce it on larger files.

deekerman added the

Core

Libtorrent

Discussion

labels

2026-02-21 19:16:55 -05:00

deekerman commented

2026-02-21 19:16:57 -05:00

Author

Owner

@erasmux commented on GitHub (Oct 29, 2018):

Tested changing one letter in a small nfo file (78 bytes) and v4.1.3 correctly rechecks to 99% complete with the said file at 0%. Restoring the original file contents and doing another recheck restores 100% complete.

Check with the latest version v4.1.3, possibly this is an issue that has already been resolved.

EDIT: adding a few characters at the end of the file does exhibit the behavior you describe. I am not sure if thats a bug or a feature (i.e. in my case the first 78 bytes of the file are indeed correct, and I certainly don't want qbittorrent to truncate files on recheck like other clients do - this is one of the things I like about qbittorrent, you can recheck without fear that i'll will damage files).

@erasmux commented on GitHub (Oct 29, 2018): Tested changing one letter in a small nfo file (78 bytes) and v4.1.3 correctly rechecks to 99% complete with the said file at 0%. Restoring the original file contents and doing another recheck restores 100% complete. Check with the latest version v4.1.3, possibly this is an issue that has already been resolved. EDIT: adding a few characters at the end of the file does exhibit the behavior you describe. I am not sure if thats a bug or a feature (i.e. in my case the first 78 bytes of the file are indeed correct, and I certainly don't want qbittorrent to truncate files on recheck like other clients do - this is one of the things I like about qbittorrent, you can recheck without fear that i'll will damage files).

deekerman commented

2026-02-21 19:16:59 -05:00

Author

Owner

@sandersaares commented on GitHub (Nov 8, 2018):

I certainly don't want qbittorrent to truncate files on recheck like other clients do - this is one of the things I like about qbittorrent, you can recheck without fear that i'll will damage files).

What is the scenario where you do not want qB to fix the file (e.g. by truncation)?

@sandersaares commented on GitHub (Nov 8, 2018): > I certainly don't want qbittorrent to truncate files on recheck like other clients do - this is one of the things I like about qbittorrent, you can recheck without fear that i'll will damage files). What is the scenario where you do not want qB to fix the file (e.g. by truncation)?

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Nov 8, 2018):

EDIT: adding a few characters at the end of the file does exhibit the behavior you describe.

@Chocobo1 @glassez
This sounds like a pretty serious bug in the rechecking code. Can any of you take a look at it? ~~Is it part of Qbittorrent or libtorrent code?~~

EDIT: after taking a quick look at qBittorrent's code, this seems to be an issue with libtorrent.

@arvidn Can you take a look at this? Rechecking should fail for a file if that file has extra bytes appended to it. However, I do not think it should fail for a directory if there are extra files in it, as it would break cross-seeding functionality. Perhaps this last behavior could depend on a "strict recheck" option.

@FranciscoPombal commented on GitHub (Nov 8, 2018): > EDIT: adding a few characters at the end of the file does exhibit the behavior you describe. @Chocobo1 @glassez This sounds like a pretty serious bug in the rechecking code. Can any of you take a look at it? ~~Is it part of Qbittorrent or libtorrent code?~~ EDIT: after taking a quick look at qBittorrent's code, this seems to be an issue with libtorrent. @arvidn Can you take a look at this? Rechecking should fail for a file if that file has extra bytes appended to it. However, I do not think it should fail for a directory if there are extra files in it, as it would break cross-seeding functionality. Perhaps this last behavior could depend on a "strict recheck" option.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Nov 8, 2018):

I see three possible behaviors:

ignore garbage at the end of the file
truncate files to the size according to the .torrent file
fail recheck and prevent the torrent from downloading/seeding

libtorrent used to implement (2), but people were experiencing problems when they accidentally started another torrent downloading to the same filename, causing loss of data. I think there should be a thread either on qbt or deluge to dig up on this issue. This prompted a change in libtorrent to implement (1). @FranciscoPombal I assume you're suggesting (3).

In your opinion, how should the rechecking fail? should some pieces be missing (even though they're not)? Should all pieces be missing? If not, should the existing pieces be able to be seeded? Perhaps it should put the torrent in upload-only mode?
Or should there be some new kind of failure that just blocks the torrent from seeding and downloading until the user, manually, truncates the files and start the checking over again?

@arvidn commented on GitHub (Nov 8, 2018): I see three possible behaviors: 1. ignore garbage at the end of the file 2. truncate files to the size according to the .torrent file 3. fail recheck and prevent the torrent from downloading/seeding libtorrent used to implement (2), but people were experiencing problems when they accidentally started another torrent downloading to the same filename, causing loss of data. I think there should be a thread either on qbt or deluge to dig up on this issue. This prompted a change in libtorrent to implement (1). @FranciscoPombal I assume you're suggesting (3). In your opinion, *how* should the rechecking fail? should some pieces be missing (even though they're not)? Should *all* pieces be missing? If not, should the existing pieces be able to be seeded? Perhaps it should put the torrent in upload-only mode? Or should there be some new kind of failure that just blocks the torrent from seeding and downloading until the user, manually, truncates the files and start the checking over again?

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@erasmux commented on GitHub (Nov 8, 2018):

libtorrent used to implement (2), but people were experiencing problems when they accidentally started another torrent downloading to the same filename, causing loss of data.

This is exactly what I was talking about. I had this behavior back when I was using utorrent and it was really annoying when a simple typo or selecting the wrong destination folder and then running a supposedly non-destructive recheck on a paused torrent would truncate the file, resulting in loss of data.

In your opinion, how should the rechecking fail?

Is there a way that the recheck could mark it as 100% complete and yet leave the file in some incomplete error state (or some other error state)? Then when we actually start the torrent truncating the file is not only acceptable but desirable.

Another "workaround" I can think of: Mark the last piece of the file as missing if the file on the disk is bigger. Conceptually you can think of it as the file including a EOF marker and that marker is a mismatch.

The important part is that the recheck alone does not alter the existing file and only when it is started can the file can be truncated (if this ends up redownloading one piece of the torrent this seems like a decent compromise to me).

@erasmux commented on GitHub (Nov 8, 2018): > libtorrent used to implement (2), but people were experiencing problems when they accidentally started another torrent downloading to the same filename, causing loss of data. This is exactly what I was talking about. I had this behavior back when I was using utorrent and it was really annoying when a simple typo or selecting the wrong destination folder and then running a supposedly non-destructive recheck on a paused torrent would truncate the file, resulting in loss of data. > In your opinion, how should the rechecking fail? Is there a way that the recheck could mark it as 100% complete and yet leave the file in some incomplete error state (or some other error state)? Then when we actually start the torrent truncating the file is not only acceptable but desirable. Another "workaround" I can think of: Mark the last piece of the file as missing if the file on the disk is bigger. Conceptually you can think of it as the file including a EOF marker and that marker is a mismatch. The important part is that the recheck alone does not alter the existing file and only when it is started can the file can be truncated (if this ends up redownloading one piece of the torrent this seems like a decent compromise to me).

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Nov 8, 2018):

@arvidn

@FranciscoPombal I assume you're suggesting (3).

I would actually prefer approach (1).

people were experiencing problems when they accidentally started another torrent downloading to the same filename

In that case, shouldn't the program warn the user "Hey, if you save to this directory, you will be overwriting existing files/folders, are you sure you want to proceed?"(*) instead of silently overwriting/appending/whaterver? Whatever the decision, since it is silent, it would probably not be someone would expect to happen.

Besides, when would people legitimately want to download another torrent to some existing filename? I think this is a very very niche use case that should not be expected by default.
Even in the cross-seeding scenario, what happens is that two torrents have similar folders, but one has extra files. So you just symlink one to the other, add both torrents and you're good to go.

If users really want data appended, they should have to say so explicitly, in my opinion.

Note on (*): libtorrent/qBittorrent would probably have to distinguish between two cases:
1 - New torrent appends data to existing files from another torrent.
2 - New torrent adds files in existing directory belonging to another torrent.

@FranciscoPombal commented on GitHub (Nov 8, 2018): @arvidn > @FranciscoPombal I assume you're suggesting (3). I would actually prefer approach (1). > people were experiencing problems when they accidentally started another torrent downloading to the same filename In that case, shouldn't the program warn the user "Hey, if you save to this directory, you will be overwriting existing files/folders, are you sure you want to proceed?"(*) instead of _silently_ overwriting/appending/whaterver? Whatever the decision, since it is silent, it would probably not be someone would expect to happen. Besides, when would people legitimately want to download another torrent to some existing filename? I think this is a very very niche use case that should not be expected by default. Even in the cross-seeding scenario, what happens is that two torrents have similar folders, but one has extra files. So you just symlink one to the other, add both torrents and you're good to go. If users really want data appended, they should have to say so explicitly, in my opinion. Note on (*): libtorrent/qBittorrent would probably have to distinguish between two cases: 1 - New torrent appends data to existing files from another torrent. 2 - New torrent adds files in existing directory belonging to another torrent.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@sandersaares commented on GitHub (Nov 8, 2018):

I see a 1:1 relationship between files on disk and files in torrent.

When adding a torrent, conflicts with existing non-torrent files should notify the user that existing content will be overwritten.

When adding a torrent, conflicts with files from other known torrents should reject the add and require the user to specify a new location (or automatically do so by adding a subdirectory, as is done in case of multi-file torrents that are not wrapped in a directory).

When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%.

@sandersaares commented on GitHub (Nov 8, 2018): I see a 1:1 relationship between files on disk and files in torrent. When adding a torrent, conflicts with existing non-torrent files should notify the user that existing content will be overwritten. When adding a torrent, conflicts with files from other known torrents should reject the add and require the user to specify a new location (or automatically do so by adding a subdirectory, as is done in case of multi-file torrents that are not wrapped in a directory). When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@erasmux commented on GitHub (Nov 8, 2018):

Even in the cross-seeding scenario, what happens is that two torrents have similar folders, but one has extra files. So you just symlink one to the other, add both torrents and you're good to go.

I can tell you that I experienced loss of data using utorrent's approach (1) in exactly this scenario. In cross-seeding the file names don't always match, so I rename/link the file to match and rely on the recheck to tell me if the content do actually match (in qB you don't even have to make any change on the disk, you can rename the torrent files in the qB interface).

Why should I live in fear that a recheck on a paused torrent would cause data loss?

What could be acceptable, is if there were an additional popup during the recheck claiming the file sizes don't match and asking the user to approve or deny truncating the files. Personally I would probably prefer other solutions (like the ones I suggested above).

@erasmux commented on GitHub (Nov 8, 2018): > Even in the cross-seeding scenario, what happens is that two torrents have similar folders, but one has extra files. So you just symlink one to the other, add both torrents and you're good to go. I can tell you that I experienced loss of data using utorrent's approach (1) in exactly this scenario. In cross-seeding the file names don't always match, so I rename/link the file to match and rely on the recheck to tell me if the content do actually match (in qB you don't even have to make any change on the disk, you can rename the torrent files in the qB interface). Why should I live in fear that a recheck on a paused torrent would cause data loss? What could be acceptable, is if there were an additional popup during the recheck claiming the file sizes don't match and asking the user to approve or deny truncating the files. Personally I would probably prefer other solutions (like the ones I suggested above).

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Nov 8, 2018):

In that case, shouldn't the program warn the user "Hey, if you save to this directory, you will be overwriting existing files/folders, are you sure you want to proceed?"(*) instead of silently overwriting/appending/whaterver? Whatever the decision, since it is silent, it would probably not be someone would expect to happen.

Well, I can only speak to what libtorrent should do. And I don't think it can help that much in this scenario. Also, keep in mind that the normal way one resumes a torrent is by "overwriting" existing files.
For example, there's no check to make sure that pieces we don't have are all zeroes, so there may be some data there that's overwritten. On the other hand, the normal behavior of downloading a piece that fails the hash check is to just leave it there (not overwrite it by zeroes, as that would not be very efficient).

Besides, when would people legitimately want to download another torrent to some existing filename? I think this is a very very niche use case that should not be expected by default.

I don't think it was legitimate, I think it was accidental.

If users really want data appended, they should have to say so explicitly, in my opinion.

what do you mean by having data appended? libtorrent never appends data.

Note on (*): libtorrent/qBittorrent would probably have to distinguish between two cases:
1 - New torrent appends data to existing files from another torrent.

when I hear "append", I think adding bytes at the end of a file. But you mean adding additional files to a directory?

2 - New torrent adds files in existing directory belonging to another torrent.

The problem of figuring out whether the files torrent A is writing to also belongs to torrent B is not solved in libtorrent. I believe there's a poor-man's check that if B already has a file open and A tries to open it, some error is generated.

@arvidn commented on GitHub (Nov 8, 2018): > In that case, shouldn't the program warn the user "Hey, if you save to this directory, you will be overwriting existing files/folders, are you sure you want to proceed?"(*) instead of silently overwriting/appending/whaterver? Whatever the decision, since it is silent, it would probably not be someone would expect to happen. Well, I can only speak to what libtorrent should do. And I don't think it can help that much in this scenario. Also, keep in mind that the normal way one resumes a torrent is by "overwriting" existing files. For example, there's no check to make sure that pieces we *don't* have are all zeroes, so there may be some data there that's overwritten. On the other hand, the normal behavior of downloading a piece that fails the hash check is to just leave it there (not overwrite it by zeroes, as that would not be very efficient). > Besides, when would people legitimately want to download another torrent to some existing filename? I think this is a very very niche use case that should not be expected by default. I don't think it was legitimate, I think it was accidental. > If users really want data appended, they should have to say so explicitly, in my opinion. what do you mean by having data appended? libtorrent never appends data. > Note on (*): libtorrent/qBittorrent would probably have to distinguish between two cases: > 1 - New torrent appends data to existing files from another torrent. when I hear "append", I think adding bytes at the end of a file. But you mean adding additional files to a directory? > 2 - New torrent adds files in existing directory belonging to another torrent. The problem of figuring out whether the files torrent A is writing to also belongs to torrent B is not solved in libtorrent. I believe there's a poor-man's check that if B already has a file open and A tries to open it, some error is generated.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Nov 8, 2018):

@arvidn

I don't think it was legitimate, I think it was accidental.

Well, more reason then to simply warn the user, with a dialog or something, which is probably done at the application level and not library level.

what do you mean by having data appended? libtorrent never appends data.

If I understood your previous comment correctly, libtorrent would append data if users added a torrent with the same filename as an existing one.

when I hear "append", I think adding bytes at the end of a file. But you mean adding additional files to a directory?

I really mean adding bytes at the end of the file.

I think @sandersaares expressed the idea more clearly:

I see a 1:1 relationship between files on disk and files in torrent.

When adding a torrent, conflicts with existing non-torrent files should notify the user that existing content will be overwritten.

When adding a torrent, conflicts with files from other known torrents should reject the add and require the user to specify a new location (or automatically do so by adding a subdirectory, as is done in case of multi-file torrents that are not wrapped in a directory).

When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%.

So, for example, Suppose you have torrent1 and torrent2.
torrent1 references a file, file1 with the following content: 012345
torrent2 references a file, file1 (same name), with the following content: 012345DEADBEEF

If torrent1 is already added and downloaded, and I try to add torrent2, qBittorrent should warn me that proceeding can result in data loss for torrent1 (this is only related to qBittorrent's code, I think). If I ignore the error and later recheck torrent1, the extra DEADBEEF should get truncated (this is the part related to libtorrent). Of course, this results in data loss with respect to torrent2, but it is the users fault for ignoring this. Rechecking and resuming torrent2 would then trigger the redownload of the missing pieces.

Again, I think this is the key point:

When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%.

There could be any number of reasons for files to get (garbage) data appended to them. Files may be rendered inoperable because of that, and the user would never know why.

More importantly, they would also fail checksums, which would be really confusing for everyone given that libtorrent/qBittorrent said they passed the recheck. This could potentially have security implications.

@FranciscoPombal commented on GitHub (Nov 8, 2018): @arvidn > I don't think it was legitimate, I think it was accidental. Well, more reason then to simply warn the user, with a dialog or something, which is probably done at the application level and not library level. > what do you mean by having data appended? libtorrent never appends data. If I understood your previous comment correctly, libtorrent would append data if users added a torrent with the same filename as an existing one. > when I hear "append", I think adding bytes at the end of a file. But you mean adding additional files to a directory? I really mean adding bytes at the end of the file. I think @sandersaares expressed the idea more clearly: >I see a 1:1 relationship between files on disk and files in torrent. > >When adding a torrent, conflicts with existing non-torrent files should notify the user that existing content will be overwritten. > >When adding a torrent, conflicts with files from other known torrents should reject the add and require the user to specify a new location (or automatically do so by adding a subdirectory, as is done in case of multi-file torrents that are not wrapped in a directory). > >When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%. So, for example, Suppose you have torrent1 and torrent2. torrent1 references a file, file1 with the following content: `012345` torrent2 references a file, file1 (same name), with the following content: `012345DEADBEEF` If torrent1 is already added and downloaded, and I try to add torrent2, qBittorrent should warn me that proceeding can result in data loss for torrent1 (this is only related to qBittorrent's code, I think). If I ignore the error and later recheck torrent1, the extra `DEADBEEF` should get truncated (this is the part related to libtorrent). Of course, this results in data loss with respect to torrent2, but it is the users fault for ignoring this. Rechecking and resuming torrent2 would then trigger the redownload of the missing pieces. Again, I think this is the key point: >When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%. There could be any number of reasons for files to get (garbage) data appended to them. Files may be rendered inoperable because of that, and the user would never know why. __More importantly, they would also fail checksums__, which would be really confusing for everyone given that libtorrent/qBittorrent said they passed the recheck. This could potentially have security implications.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Nov 8, 2018):

what do you mean by having data appended? libtorrent never appends data.

If I understood your previous comment correctly, libtorrent would append data if users added a torrent with the same filename as an existing one.

I think overwrite is a better description. It just won't truncate the existing files, but it may extend it. But if none of the existing bytes in the file match the piece hashes, they will all be replaced.

So, for example, Suppose you have torrent1 and torrent2.
torrent1 references a file, file1 with the following content: 012345
torrent2 references a file, file1 (same name), with the following content: 012345DEADBEEF

If torrent1 is already added and downloaded, and I try to add torrent2, qBittorrent should warn me that proceeding can result in data loss for torrent1 (this is only related to qBittorrent's code, I think). If I ignore the error and later recheck torrent1, the extra DEADBEEF should get truncated (this is the part related to libtorrent).

That's right, except libtorrend won't truncate it. And if the two torrents happen to try to open the files at the same time, there will be an error. libtorrent just doesn't keep an extra data structure for the sole purpose of detecting conflicts on files.

Of course, this results in data loss with respect to torrent2, but it is the users fault for ignoring this. Rechecking and resuming torrent2 would then trigger the redownload of the missing pieces.

from libtorrent's point of view, it would be the client warning about this, not libtorrent. There is a cost associated with keeping this extra data structure, that I don't think is relevant to all users of libtorrent.

@arvidn commented on GitHub (Nov 8, 2018): > > what do you mean by having data appended? libtorrent never appends data. > > If I understood your previous comment correctly, libtorrent would append data if users added a torrent with the same filename as an existing one. I think overwrite is a better description. It just won't truncate the existing files, but it may extend it. But if none of the existing bytes in the file match the piece hashes, they will all be replaced. > So, for example, Suppose you have torrent1 and torrent2. > torrent1 references a file, file1 with the following content: 012345 > torrent2 references a file, file1 (same name), with the following content: 012345DEADBEEF > > If torrent1 is already added and downloaded, and I try to add torrent2, qBittorrent should warn me that proceeding can result in data loss for torrent1 (this is only related to qBittorrent's code, I think). If I ignore the error and later recheck torrent1, the extra DEADBEEF should get truncated (this is the part related to libtorrent). That's right, except libtorrend won't truncate it. And if the two torrents happen to try to open the files at the same time, there will be an error. libtorrent just doesn't keep an extra data structure for the sole purpose of detecting conflicts on files. > Of course, this results in data loss with respect to torrent2, but it is the users fault for ignoring this. Rechecking and resuming torrent2 would then trigger the redownload of the missing pieces. from libtorrent's point of view, it would be the client warning about this, not libtorrent. There is a cost associated with keeping this extra data structure, that I don't think is relevant to all users of libtorrent.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Nov 8, 2018):

would it make more sense to always truncate a file when writing the last piece to it? That could still cause data loss, but not from a paused torrent that's just being checked.

@arvidn commented on GitHub (Nov 8, 2018): would it make more sense to always truncate a file when writing the last piece to it? That could still cause data loss, but not from a paused torrent that's just being checked.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@erasmux commented on GitHub (Nov 8, 2018):

When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%.

This would be catastrophic and would disable any option to cross-seed without creating an actual copy on the disk (even symbolic links won't help if qB would lock the file).

qB's current cross-seeding behavior is near perfect (imho), I can and do cross-seed a torrent which adds additional files while actively seeding a torrent with less files, which results in the last piece of a file in the first torrent to be redownloaded by the second one. I have no idea how it works but it works flawlessly.

Could this be abused to cause one torrent to cause an error on a different one? yes. But I don't see a problem with this. The behavior is clean and simple and as far as I know inline with other torrent clients.

Please don't break this relatively important feature for some very rare edge case.
Is there a logical scenario which brings to the situation we are discussing (rechecking a file which has the exact correct contents just with a few extra characters)?

would it make more sense to always truncate a file when writing the last piece to it? That could still cause data loss, but not from a paused torrent that's just being checked.

The problem is that currently rechecking such a file would result in 100% complete and the torrent would start seeding without redownloading the last piece. Like I offered before, I think that failing the check on the last piece of a file if it is longer than the torrent data could be a good solution/workaround (of course coupled with truncating the file when the last piece is redownloaded).

@erasmux commented on GitHub (Nov 8, 2018): > When a torrent is active, it owns the file and the client should do everything necessary to ensure that file contents match torrent 100%. This would be catastrophic and would disable any option to cross-seed without creating an actual copy on the disk (even symbolic links won't help if qB would lock the file). qB's current cross-seeding behavior is near perfect (imho), I can and do cross-seed a torrent which adds additional files while actively seeding a torrent with less files, which results in the last piece of a file in the first torrent to be redownloaded by the second one. I have no idea how it works but it works flawlessly. Could this be abused to cause one torrent to cause an error on a different one? yes. But I don't see a problem with this. The behavior is clean and simple and as far as I know inline with other torrent clients. Please don't break this relatively important feature for some very rare edge case. Is there a logical scenario which brings to the situation we are discussing (rechecking a file which has the exact correct contents just with a few extra characters)? > would it make more sense to always truncate a file when writing the last piece to it? That could still cause data loss, but not from a paused torrent that's just being checked. The problem is that currently rechecking such a file would result in 100% complete and the torrent would start seeding without redownloading the last piece. Like I offered before, I think that failing the check on the last piece of a file if it is longer than the torrent data could be a good solution/workaround (of course coupled with truncating the file when the last piece is redownloaded).

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Nov 8, 2018):

@erasmux

This would be catastrophic and would disable any option to cross-seed without creating an actual copy on the disk (even symbolic links won't help if qB would lock the file).

No, it wouldn't (besides, nobody said anything about "locking" the file?). What me and @sandersaares were saying was that qBittorrent wouldn't append bytes to existing files, but would let you save extra files to already existing folders. This way, cross seeding would work just fine.

@arvidn
Wouldn't you agree that

EDIT: adding a few characters at the end of the file does exhibit the behavior you describe.

Is a problem?

EDIT:

It just won't truncate the existing files, but it may extend it

This is exactly what I am talking about. This would cause said files to fail checksums

@FranciscoPombal commented on GitHub (Nov 8, 2018): @erasmux > This would be catastrophic and would disable any option to cross-seed without creating an actual copy on the disk (even symbolic links won't help if qB would lock the file). No, it wouldn't (besides, nobody said anything about "locking" the file?). What me and @sandersaares were saying was that qBittorrent wouldn't append bytes to existing files, but would let you save extra files to already existing folders. This way, cross seeding would work just fine. @arvidn Wouldn't you agree that > EDIT: adding a few characters at the end of the file does exhibit the behavior you describe. Is a problem? EDIT: > It just won't truncate the existing files, but it may extend it This is exactly what I am talking about. This would cause said files to fail checksums

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Nov 8, 2018):

from libtorrent's point of view, it would be the client warning about this, not libtorrent. There is a cost associated with keeping this extra data structure, that I don't think is relevant to all users of libtorrent.

Sure, I agree the part about keeping track of conflicts between files is at most a responsibility of the application.

However, what I am saying is libtorrent not truncating files on recheck (in case they have bytes appended to them) is most likely not the behavior that people expect.

@FranciscoPombal commented on GitHub (Nov 8, 2018): > from libtorrent's point of view, it would be the client warning about this, not libtorrent. There is a cost associated with keeping this extra data structure, that I don't think is relevant to all users of libtorrent. Sure, I agree the part about keeping track of conflicts between files is at most a responsibility of the application. However, what I am saying is libtorrent _not_ truncating files on recheck (in case they have bytes appended to them) is most likely not the behavior that people expect.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@erasmux commented on GitHub (Nov 8, 2018):

However, what I am saying is libtorrent not truncating files on recheck (in case they have bytes appended to them) is most likely not the behavior that people expect.

qB / libtorrent causing data loss (i.e. by truncating a file) is definitely not the behavior anyone could expect from a recheck on a paused torrent.

@erasmux commented on GitHub (Nov 8, 2018): >However, what I am saying is libtorrent not truncating files on recheck (in case they have bytes appended to them) is most likely not the behavior that people expect. qB / libtorrent causing data loss (i.e. by truncating a file) is definitely not the behavior anyone could expect from a recheck on a paused torrent.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Nov 8, 2018):

@FranciscoPombal yeah, I think people's model of what it means to check a torrent is to ask the question: do I have the data? i.e. that it should be a read-only operation

@arvidn commented on GitHub (Nov 8, 2018): @FranciscoPombal yeah, I think people's model of what it means to check a torrent is to ask the question: do I have the data? i.e. that it should be a read-only operation

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Nov 8, 2018):

@erasmux

qB / libtorrent causing data loss (i.e. by truncating a file) is definitely not the behavior anyone could expect from a recheck on a paused torrent.

It's not data loss, it is discarding incorrect data with respect to a certain torrent. Note that I am not saying that files should be deleted, just that they should be truncated.

@arvidn

@FranciscoPombal yeah, I think people's model of what it means to check a torrent is to ask the question: do I have the data? i.e. that it should be a read-only operation.

Maybe it is just me, but my expectation for the recheck functionality is a bit higher; it answers the question: Do I have the correct data? A file that has garbage appended to it should be truncated when answering this question.

@FranciscoPombal commented on GitHub (Nov 8, 2018): @erasmux > qB / libtorrent causing data loss (i.e. by truncating a file) is definitely not the behavior anyone could expect from a recheck on a paused torrent. It's not data loss, it is discarding incorrect data with respect to a certain torrent. Note that I am not saying that files should be deleted, just that they should be truncated. @arvidn > @FranciscoPombal yeah, I think people's model of what it means to check a torrent is to ask the question: do I have the data? i.e. that it should be a read-only operation. Maybe it is just me, but my expectation for the recheck functionality is a bit higher; it answers the question: Do I have the _correct_ data? A file that has garbage appended to it should be truncated when answering this question.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Jul 31, 2020):

I think my opinion about this behavior has changed since my previous posts.

The current behavior of ignoring appended data (i.e., treating recheck as a read-only operation) is the better one for cross-seeding and a purely file-sharing standpoint - libtorrent will never send the appended data to any peer anyway.

However, I think ignoring appended data may subvert some expectations of file integrity/verification. I think that people expect that a rechecked torrent also means "if I run these files through sha256sum, I will always get the same result". This will not happen in general if arbitrary data can be appended.

So here's what I'm thinking: perhaps the recheck feature could be extended to put the torrent in some kind of warning state/category of "data appended" (this could be some kind of per-torrent flag that libtorrent sets). This would mean "the data is all there and the torrent is able to upload/download correctly, but be warned that, from a file-level integrity standpoint, there are extra bytes here and there". Perhaps libtorrent could also expose which files in a given torrent have data appended and how much.
Additionally, a control to "delete all appended data" would probably be nice (this could be a context-menu option in qBIttorrent, similar to "force recheck"), otherwise the user is always forced to delete the appended data manually, if they decide they no longer needed.

@arvidn thoughts?

@FranciscoPombal commented on GitHub (Jul 31, 2020): I think my opinion about this behavior has changed since my previous posts. The current behavior of ignoring appended data (i.e., treating recheck as a read-only operation) is the better one for cross-seeding and a purely file-sharing standpoint - libtorrent will never send the appended data to any peer anyway. However, I think ignoring appended data may subvert some expectations of file integrity/verification. I think that people expect that a rechecked torrent also means "if I run these files through `sha256sum`, I will always get the same result". This will not happen in general if arbitrary data can be appended. So here's what I'm thinking: perhaps the recheck feature could be extended to put the torrent in some kind of warning state/category of "data appended" (this could be some kind of per-torrent flag that libtorrent sets). This would mean "the data is all there and the torrent is able to upload/download correctly, but be warned that, from a file-level integrity standpoint, there are extra bytes here and there". Perhaps libtorrent could also expose which files in a given torrent have data appended and how much. Additionally, a control to "delete all appended data" would probably be nice (this could be a context-menu option in qBIttorrent, similar to "force recheck"), otherwise the user is always forced to delete the appended data manually, if they decide they no longer needed. @arvidn thoughts?

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Jul 31, 2020):

I'm not sure about a state, but an alert might make sense. Perhaps there should be a separate operation than "recheck" too, something like "repair".

I can also imagine a flag to force_recheck(), similar to move_storage(), that enables truncation of files.

@arvidn commented on GitHub (Jul 31, 2020): I'm not sure about a state, but an alert might make sense. Perhaps there should be a separate operation than "recheck" too, something like "repair". I can also imagine a flag to `force_recheck()`, similar to `move_storage()`, that enables truncation of files.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Aug 4, 2020):

@arvidn

I'm not sure about a state, but an alert might make sense.

libtorrent could fire an alert, and qBittorrent could change states based on that (internal qBittorrent states, nothing to do with libtorrent). This alert could fire for a certain torrent:

every time data is appended to a file belonging to the torrent
when checking the fastresume; information about whether or not there is appended data could be stored in the fastresume
at the end of a recheck, if any file contains appended data

Perhaps there should be a separate operation than "recheck" too, something like "repair".

👍

I can also imagine a flag to force_recheck(), similar to move_storage(), that enables truncation of files.

Do you mean that would be a possible way to implement the "repair" operation? Sounds good.

@FranciscoPombal commented on GitHub (Aug 4, 2020): @arvidn > I'm not sure about a state, but an alert might make sense. libtorrent could fire an alert, and qBittorrent could change states based on that (internal qBittorrent states, nothing to do with libtorrent). This alert could fire for a certain torrent: - every time data is appended to a file belonging to the torrent - when checking the fastresume; information about whether or not there is appended data could be stored in the fastresume - at the end of a recheck, if any file contains appended data >Perhaps there should be a separate operation than "recheck" too, something like "repair". :+1: > I can also imagine a flag to `force_recheck()`, similar to `move_storage()`, that enables truncation of files. Do you mean that would be a possible way to implement the "repair" operation? Sounds good.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@jbruchon commented on GitHub (Aug 26, 2020):

This seems pretty straightforward. Give the user the choice over the behavior. If they want truncation or want appended data to be allowed then they can make that choice in the preferences.

There also needs to be an option to auto-rename when there is a file collision between two torrents in the same program. I have regularly attempted to download separate files that have the same name only for the program to then merge them naively. The current behavior doesn't make sense and needs to be addressed. I was shocked to discover that this issue is so old.

I don't know what is meant by "cross-seeding" but I assume it means loading multiple torrents with overlapping data. If the files differ in size then they are not duplicates. Sure, the smaller one could be identical to the same sized segment at the start of the larger one, but to always act like this is the case is dangerous. If you download two differently sized MP4 videos, for example, they're not the same file with an extremely high degree of certainty. In the case of MP4 files in particular, the smaller file's moov atom (start or end) can get nuked by the larger file data, and that renders the entire smaller file unusable. Likewise, if the header is the one for the smaller file, the larger file becomes unusable even if the moov atom is at the end.

tl;dr: differently sized files are not identical and shouldn't be treated as such by default.

@jbruchon commented on GitHub (Aug 26, 2020): This seems pretty straightforward. Give the user the choice over the behavior. If they want truncation or want appended data to be allowed then they can make that choice in the preferences. There also needs to be an option to auto-rename when there is a file collision between two torrents in the same program. I have regularly attempted to download separate files that have the same name only for the program to then merge them naively. The current behavior doesn't make sense and needs to be addressed. I was shocked to discover that this issue is so old. I don't know what is meant by "cross-seeding" but I assume it means loading multiple torrents with overlapping data. If the files differ in size then they are not duplicates. Sure, the smaller one could be identical to the same sized segment at the start of the larger one, but to always act like this is the case is dangerous. If you download two differently sized MP4 videos, for example, they're not the same file with an extremely high degree of certainty. In the case of MP4 files in particular, the smaller file's moov atom (start or end) can get nuked by the larger file data, and that renders the entire smaller file unusable. Likewise, if the header is the one for the smaller file, the larger file becomes unusable even if the moov atom is at the end. tl;dr: differently sized files are not identical and shouldn't be treated as such by default.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@FranciscoPombal commented on GitHub (Aug 26, 2020):

I don't know what is meant by "cross-seeding" but I assume...

Cross-seeding means downloading torrent A from tracker X, and seeding it to both tracker X and another tracker Y. If tracker Y contains torrent B, such that A is a superset of B, A is also good for seeding B. In this context, "superset" means "additional files/directories", and very rarely "bytes appended to files". This latter case is the most relevant to this discussion, as I wouldn't expect a truncation function to actually delete additional files, just additional bytes in certain files. But this could also be configurable, e.g. "No truncation" | "Truncate files" | "Truncate directory contents" | "Truncate files and directory contents".

@FranciscoPombal commented on GitHub (Aug 26, 2020): > I don't know what is meant by "cross-seeding" but I assume... Cross-seeding means downloading torrent A from tracker X, and seeding it to both tracker X and another tracker Y. If tracker Y contains torrent B, such that A is a superset of B, A is also good for seeding B. In this context, "superset" means "additional files/directories", and very rarely "bytes appended to files". This latter case is the most relevant to this discussion, as I wouldn't expect a truncation function to actually delete additional files, just additional bytes in certain files. But this could also be configurable, e.g. "No truncation" | "Truncate files" | "Truncate directory contents" | "Truncate files and directory contents".

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@escape0707 commented on GitHub (Feb 5, 2022):

Besides, when would people legitimately want to download another torrent to some existing filename?

Wow, actually got hit by this bug this week. Can't rehash to get an updated working copy of a program.

Really wish this will be fix in some way. Half of my week was ruined and wasted on this... Have to use other clients like aria2 for this now:

aria2c -T <path to torrent file> -d <path to download folder> -V

@escape0707 commented on GitHub (Feb 5, 2022): > Besides, when would people legitimately want to download another torrent to some existing filename? Wow, actually got hit by this bug this week. Can't rehash to get an updated working copy of a program. Really wish this will be fix in some way. Half of my week was ruined and wasted on this... Have to use other clients like aria2 for this now: ```shell aria2c -T <path to torrent file> -d <path to download folder> -V ```

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@escape0707 commented on GitHub (Feb 5, 2022):

This is exactly what I was talking about. I had this behavior back when I was using utorrent and it was really annoying when a simple typo or selecting the wrong destination folder and then running a supposedly non-destructive recheck on a paused torrent would truncate the file, resulting in loss of data.

I'd suggest to implement some flag / queue for truncation that only take place when the user really want to download / modifying.

Sometimes I like to I add a torrent stopped and manually force a recheck. After the recheck qbittorrent just report me with what it think is wrong / missing without actually modifying them. Modification only happens when I say "download" later.

I like to have this kind of safety and I hope truncation only happens after user confirmation of some kind of change could happen.

But I do think that silently ignoring appending bits which actually changes the checksums of a file when "rehashing" / "rechecking" is extremely counterintuitive.

@escape0707 commented on GitHub (Feb 5, 2022): > This is exactly what I was talking about. I had this behavior back when I was using utorrent and it was really annoying when a simple typo or selecting the wrong destination folder and then running a supposedly non-destructive recheck on a paused torrent would truncate the file, resulting in loss of data. I'd suggest to implement some flag / queue for truncation that only take place when the user really want to download / modifying. Sometimes I like to I add a torrent stopped and manually force a recheck. After the recheck qbittorrent just report me with what it think is wrong / missing without actually modifying them. Modification only happens when I say "download" later. I like to have this kind of safety and I hope truncation only happens after user confirmation of some kind of change could happen. But I **do** think that silently ignoring appending bits which actually changes the checksums of a file when "rehashing" / "rechecking" is **extremely counterintuitive**.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@escape0707 commented on GitHub (Feb 5, 2022):

I think my opinion about this behavior has changed since my previous posts.

The current behavior of ignoring appended data (i.e., treating recheck as a read-only operation) is the better one for cross-seeding and a purely file-sharing standpoint - libtorrent will never send the appended data to any peer anyway.

However, I think ignoring appended data may subvert some expectations of file integrity/verification. I think that people expect that a rechecked torrent also means "if I run these files through sha256sum, I will always get the same result". This will not happen in general if arbitrary data can be appended.

So here's what I'm thinking: perhaps the recheck feature could be extended to put the torrent in some kind of warning state/category of "data appended" (this could be some kind of per-torrent flag that libtorrent sets). This would mean "the data is all there and the torrent is able to upload/download correctly, but be warned that, from a file-level integrity standpoint, there are extra bytes here and there". Perhaps libtorrent could also expose which files in a given torrent have data appended and how much.

Additionally, a control to "delete all appended data" would probably be nice (this could be a context-menu option in qBIttorrent, similar to "force recheck"), otherwise the user is always forced to delete the appended data manually, if they decide they no longer needed.

@arvidn thoughts?

Not the feature we deserve, but the feature we need.

@escape0707 commented on GitHub (Feb 5, 2022): > I think my opinion about this behavior has changed since my previous posts. > > > > The current behavior of ignoring appended data (i.e., treating recheck as a read-only operation) is the better one for cross-seeding and a purely file-sharing standpoint - libtorrent will never send the appended data to any peer anyway. > > > > However, I think ignoring appended data may subvert some expectations of file integrity/verification. I think that people expect that a rechecked torrent also means "if I run these files through `sha256sum`, I will always get the same result". This will not happen in general if arbitrary data can be appended. > > > > So here's what I'm thinking: perhaps the recheck feature could be extended to put the torrent in some kind of warning state/category of "data appended" (this could be some kind of per-torrent flag that libtorrent sets). This would mean "the data is all there and the torrent is able to upload/download correctly, but be warned that, from a file-level integrity standpoint, there are extra bytes here and there". Perhaps libtorrent could also expose which files in a given torrent have data appended and how much. > > Additionally, a control to "delete all appended data" would probably be nice (this could be a context-menu option in qBIttorrent, similar to "force recheck"), otherwise the user is always forced to delete the appended data manually, if they decide they no longer needed. > > > > @arvidn thoughts? Not the feature we deserve, but the feature we need.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Feb 5, 2022):

there are very compelling arguments against truncating files and discard data as well.
An option in libtorrent seems like a much more complex solution than it needs to be. iterating the files in a torrent and checking the file sizes on disk, and truncate, doesn't need to be part of the normal download flow.

That could just be a function, separate from downloading, could it not?

@arvidn commented on GitHub (Feb 5, 2022): there are very compelling arguments against truncating files and discard data as well. An option in libtorrent seems like a much more complex solution than it needs to be. iterating the files in a torrent and checking the file sizes on disk, and truncate, doesn't need to be part of the normal download flow. That could just be a function, separate from downloading, could it not?

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@escape0707 commented on GitHub (Feb 5, 2022):

Torrent-wise flag seems too complicated for this rare use case. I think a dedicated function in lib and a separated button for enforcing hash-perfection is okay. ~~Although other BT clients don't seem to even bother and just truncate.~~ Aria2 just truncate, but Transmission BT will behave the same as libtorrent.

I do think an alert should be implemented both in the lib and the client.

@escape0707 commented on GitHub (Feb 5, 2022): Torrent-wise flag seems too complicated for this rare use case. I think a dedicated function in lib and a separated button for enforcing hash-perfection is okay. ~~Although other BT clients don't seem to even bother and just truncate.~~ Aria2 just truncate, but Transmission BT will behave the same as libtorrent. I do think an alert should be implemented both in the lib and the client.

deekerman commented

2026-02-21 19:17:00 -05:00

Author

Owner

@arvidn commented on GitHub (Feb 7, 2022):

how does this look? https://github.com/arvidn/libtorrent/pull/6726

@arvidn commented on GitHub (Feb 7, 2022): how does this look? https://github.com/arvidn/libtorrent/pull/6726

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@glassez commented on GitHub (Feb 8, 2022):

there are very compelling arguments against truncating files and discard data as well.

Like what?
The only argument, IMO, is that some file can be valid as a whole, or some continuous part of it from the beginning to a certain position, so that it can be in one torrent completely, and in another - in a truncated form. But the question is, what types of files (with the exception of plain text) are suitable for this? And how likely is the situation with torrents described above?
Another question is how libtorrent behaves if it checks a file that has a suitable (or smaller size), while some of its initial part satisfies the torrent hashes, and the subsequent part does not? Why not make the behavior consistent?

Of course, no changes should be applied at the "checking" stage, but at the "downloading" stage.

@glassez commented on GitHub (Feb 8, 2022): > there are very compelling arguments against truncating files and discard data as well. Like what? The only argument, IMO, is that some file can be valid as a whole, or some continuous part of it from the beginning to a certain position, so that it can be in one torrent completely, and in another - in a truncated form. But the question is, what types of files (with the exception of plain text) are suitable for this? And how likely is the situation with torrents described above? Another question is how libtorrent behaves if it checks a file that has a suitable (or smaller size), while some of its initial part satisfies the torrent hashes, and the subsequent part does not? Why not make the behavior consistent? Of course, no changes should be applied at the "checking" stage, but at the "downloading" stage.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@escape0707 commented on GitHub (Feb 8, 2022):

Can't understand why people would download load through torrenting and expect to get files different from listed in the torrent.

https://cs.rin.ru/forum/viewtopic.php?p=2537298#p2537298

@escape0707 commented on GitHub (Feb 8, 2022): Can't understand why people would download load through torrenting and expect to get files different from listed in the torrent. https://cs.rin.ru/forum/viewtopic.php?p=2537298#p2537298

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@arvidn commented on GitHub (Feb 8, 2022):

Like what?

There's a long thread somewhere on github on this issue. my recollection of the main gist is that you may have files you care about in your download folder, and one torrent (that perhaps shouldn't be trusted) can wipe them out. Without truncating files, the damage done is proportional to how long it takes you to realize it's a bad/malicious torrent.

@arvidn commented on GitHub (Feb 8, 2022): > Like what? There's a long thread somewhere on github on this issue. my recollection of the main gist is that you may have files you care about in your download folder, and one torrent (that perhaps shouldn't be trusted) can wipe them out. Without truncating files, the damage done is proportional to how long it takes you to realize it's a bad/malicious torrent.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@escape0707 commented on GitHub (Feb 8, 2022):

I think trying to update with rehashing and failed to do so because the bt client doesn't do checksum perfect recheck is a lot harder to realize than downloading a bad torrent without cautious.

If the latter happened, the user will think like "damn, my fault". But when the former happened, the user will have no idea why the update won't work as expected until they happened / managed to get a file verification other than the torrent it self, and have to switch to another bt client for the update.

@escape0707 commented on GitHub (Feb 8, 2022): I think trying to update with rehashing and failed to do so because the bt client doesn't do checksum perfect recheck is a lot harder to realize than downloading a bad torrent without cautious. If the latter happened, the user will think like "damn, my fault". But when the former happened, the user will have no idea why the update won't work as expected until they happened / managed to get a file verification other than the torrent it self, and have to switch to another bt client for the update.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@arvidn commented on GitHub (Feb 9, 2022):

Just a note on the suggestion in the ticket; rechecking needs to be read-only. It needs to support read-only media for instance. That said, a higher level (user facing) function could still combine rechecking with trimming/truncating.

@arvidn commented on GitHub (Feb 9, 2022): Just a note on the suggestion in the ticket; rechecking needs to be read-only. It needs to support read-only media for instance. That said, a higher level (user facing) function could still combine rechecking with trimming/truncating.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@glassez commented on GitHub (Feb 9, 2022):

rechecking needs to be read-only. It needs to support read-only media for instance.

Of course.
But why not provide more checking? IMO, it should provide more detections for inconsistencies between torrent and existing data and prevent torrent from being started in case if any (unless user explicitly tell it to overwrite it, at least by setting default policy about it). I mean the following error cases:

Some file(s) are larger than the size of the corresponding file from the torrent, even if the hashes of the matching pieces are the same (as in this Issue).
The hashes of some pieces of the file do not match the hashes of the corresponding pieces from the torrent.

@glassez commented on GitHub (Feb 9, 2022): > rechecking needs to be read-only. It needs to support read-only media for instance. Of course. But why not provide more checking? IMO, it should provide more detections for inconsistencies between torrent and existing data and prevent torrent from being started in case if any (unless user explicitly tell it to overwrite it, at least by setting default policy about it). I mean the following error cases: 1. Some file(s) are larger than the size of the corresponding file from the torrent, even if the hashes of the matching pieces are the same (as in this Issue). 2. The hashes of some pieces of the file do not match the hashes of the corresponding pieces from the torrent.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@arvidn commented on GitHub (Feb 9, 2022):

But why not provide more checking?

You mean something like this? https://github.com/arvidn/libtorrent/pull/6726

It's not obvious to me that the checking of torrents needs more options. It seems reasonable for clients to compose the various features in libtorrent into higher level functionality. I'm open to be convinced it belongs in the rechecking logic.

I would be a bit uneasy of failing rechecking because of appended garbage. That would prevent seeding the content, which I think would be a disservice to the swarm. Having some user visible notification might make sense though.

@arvidn commented on GitHub (Feb 9, 2022): > But why not provide more checking? You mean something like this? https://github.com/arvidn/libtorrent/pull/6726 It's not obvious to me that the checking of torrents needs more options. It seems reasonable for clients to compose the various features in libtorrent into higher level functionality. I'm open to be convinced it belongs in the rechecking logic. I would be a bit uneasy of failing rechecking because of appended garbage. That would prevent seeding the content, which I think would be a disservice to the swarm. Having some user visible notification might make sense though.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@glassez commented on GitHub (Feb 9, 2022):

But why not provide more checking?

You mean something like this? arvidn/libtorrent#6726

It just truncates. So you need either apply it unconditionally or check files by yourself.

It seems reasonable for clients to compose the various features in libtorrent into higher level functionality.

It's acceptable in case if it is possible in a more or less convenient way.
If we're talking about oversized files then we can check them at application level, notify the user and truncate them if needed. There is no problem if such files remain untruncated for some time.
But how about the files with the same names as in torrent (and maybe even the same size) but different content? If someone points such a torrent to this content by mistake (or maybe there is malicious torrent) how libtorrent will behave? I would expect the data to remain untouched unless I explicitly allow it to overwrite.

@glassez commented on GitHub (Feb 9, 2022): > > But why not provide more checking? > > You mean something like this? [arvidn/libtorrent#6726](https://github.com/arvidn/libtorrent/pull/6726) It just truncates. So you need either apply it unconditionally or check files by yourself. >It seems reasonable for clients to compose the various features in libtorrent into higher level functionality. It's acceptable in case if it is possible in a more or less convenient way. If we're talking about oversized files then we can check them at application level, notify the user and truncate them if needed. There is no problem if such files remain untruncated for some time. But how about the files with the same names as in torrent (and maybe even the same size) but different content? If someone points such a torrent to this content by mistake (or maybe there is malicious torrent) how libtorrent will behave? I would expect the data to remain untouched unless I explicitly allow it to overwrite.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@arvidn commented on GitHub (Feb 9, 2022):

If we're talking about oversized files then we can check them at application level, notify the user and truncate them if needed. There is no problem if such files remain untruncated for some time.

Maybe it would make sense to issue an alert during checking when oversized files are encountered.

But how about the files with the same names as in torrent (and maybe even the same size) but different content? If someone points such a torrent to this content by mistake (or maybe there is malicious torrent) how libtorrent will behave?

It depends on the flags. Generally it will silently overwrite (once it starts downloading). There's an option to fully trust the resume data and not check the files at all (also deferring any disk issues until they are touched for the first time).

But generally, there isn't a good solution to different torrents sharing a file on disk. It could get quite expensive to maintain an index to check against.

@arvidn commented on GitHub (Feb 9, 2022): > If we're talking about oversized files then we can check them at application level, notify the user and truncate them if needed. There is no problem if such files remain untruncated for some time. Maybe it would make sense to issue an alert during checking when oversized files are encountered. > But how about the files with the same names as in torrent (and maybe even the same size) but different content? If someone points such a torrent to this content by mistake (or maybe there is malicious torrent) how libtorrent will behave? It depends on the flags. Generally it will silently overwrite (once it starts downloading). There's an option to fully trust the resume data and not check the files at all (also deferring any disk issues until they are touched for the first time). But generally, there isn't a good solution to different torrents sharing a file on disk. It could get quite expensive to maintain an index to check against.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@glassez commented on GitHub (Feb 11, 2022):

Maybe it would make sense to issue an alert during checking when oversized files are encountered.

I think Yes. At least this will save application form checking the files again.

@glassez commented on GitHub (Feb 11, 2022): > Maybe it would make sense to issue an alert during checking when oversized files are encountered. I think Yes. At least this will save application form checking the files again.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@glassez commented on GitHub (Feb 11, 2022):

But generally, there isn't a good solution to different torrents sharing a file on disk.

I don't talk about such a case.
I talk about adding torrent and pointing it to the location with existing data which, according to the user, should match the torrent, but in fact do not match. It is not difficult to imagine such a case, even without taking into account possible malicious torrents. For example, I want to support some kind of distribution by having (as it seems to me) the corresponding file on my disk, but it is actually different (for example, this is a video file encoded a little differently).

@glassez commented on GitHub (Feb 11, 2022): > But generally, there isn't a good solution to different torrents sharing a file on disk. I don't talk about such a case. I talk about adding torrent and pointing it to the location with existing data which, according to the user, should match the torrent, but in fact do not match. It is not difficult to imagine such a case, even without taking into account possible malicious torrents. For example, I want to support some kind of distribution by having (as it seems to me) the corresponding file on my disk, but it is actually different (for example, this is a video file encoded a little differently).

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@FimbulvetrOW commented on GitHub (Mar 2, 2023):

The simple example: I have Folder with old content. There is a torrent with new content. I want to use my existing data, rehash it, and download only what I'm missing and everything that changed, leaving things that didn't change intact - which ideally means I have to download a lot less than the entire new torrent.

This fails with this client because if there are files that are now smaller in the new torrent, they will not get truncated, garbage data at the end remains. Depending on the file, this can cause no problems at all, or cause catastrophic failure, or anything in between.

How is "truncate files on re-check torrent" not an option yet?

@FimbulvetrOW commented on GitHub (Mar 2, 2023): The simple example: I have Folder with old content. There is a torrent with new content. I want to use my existing data, rehash it, and download only what I'm missing and everything that changed, leaving things that didn't change intact - which ideally means I have to download a lot less than the entire new torrent. This fails with this client because if there are files that are now smaller in the new torrent, they will not get truncated, garbage data at the end remains. Depending on the file, this can cause no problems at all, or cause catastrophic failure, or anything in between. How is "truncate files on re-check torrent" not an option yet?

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@hormoz1 commented on GitHub (May 11, 2023):

I agree that this behavior is unexpected and can cause problems when you are trying to recheck/update torrent files from another place, in which you are expecting the files to be 100% identical when the recheck says 100%, not to have a bunch of data at the end.

Qbittorrent should optimally warn about this and allow you to correct it.

@hormoz1 commented on GitHub (May 11, 2023): I agree that this behavior is unexpected and can cause problems when you are trying to recheck/update torrent files from another place, in which you are expecting the files to be 100% identical when the recheck says 100%, not to have a bunch of data at the end. Qbittorrent should optimally warn about this and allow you to correct it.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@AetherMagee commented on GitHub (Oct 7, 2023):

Am I misunderstanding something? How is this not fixed yet? I understand that there are, in fact, use cases for such behaviour, but it clearly creates way more problems than it solves

@AetherMagee commented on GitHub (Oct 7, 2023): Am I misunderstanding something? How is this not fixed yet? I understand that there are, in fact, use cases for such behaviour, but it clearly creates way more problems than it solves

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@escape0707 commented on GitHub (Oct 7, 2023):

This is theoretically already fixed in the library side by the introduction of a new dedicated truncate function mentioned previously: https://github.com/arvidn/libtorrent/pull/6726

But I don't think this function can use the mismatch info found in the hashing phase to optimize out redundant recheck of file size.

I also don't know if qBT is currently using it or not, but a simple search of the function name in this repo should do the trick.

@escape0707 commented on GitHub (Oct 7, 2023): This is theoretically already fixed in the library side by the introduction of a new dedicated truncate function mentioned previously: https://github.com/arvidn/libtorrent/pull/6726 But I don't think this function can use the mismatch info found in the hashing phase to optimize out redundant recheck of file size. I also don't know if qBT is currently using it or not, but a simple search of the function name in this repo should do the trick.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@USSWashington commented on GitHub (Mar 5, 2024):

@escape0707, any updates on whether qBit has implemented this feature?

@USSWashington commented on GitHub (Mar 5, 2024): @escape0707, any updates on whether qBit has implemented this feature?

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@escape0707 commented on GitHub (Mar 5, 2024):

@NonVolatileHuman Sadly, I just did a search of this repo's code, and truncate_files function is not used anywhere. So no, they didn't. I do think this is an important issue, tho.

@escape0707 commented on GitHub (Mar 5, 2024): @NonVolatileHuman Sadly, I just did a search of this repo's code, and `truncate_files` function is not used anywhere. So no, they didn't. I do think this is an important issue, tho.

deekerman commented

2026-02-21 19:17:01 -05:00

Author

Owner

@SmilingSpectre commented on GitHub (Dec 10, 2024):

There's a long thread somewhere on github on this issue. my recollection of the main gist is that you may have files you care about in your download folder, and one torrent (that perhaps shouldn't be trusted) can wipe them out. Without truncating files, the damage done is proportional to how long it takes you to realize it's a bad/malicious torrent.

Sorry for being late to the event, but I realized the problem just now, and cannot understand fully your point of view.

How I see the issue:

If you overwrite the old file with the new, lesser one, you always lose the old file anyway. Well, you still may have the last 2kb of old .zip archive, or second half of jpg image, but what use of it? But simultaneously it will throw errors for the new files (I am sure about archives, not so sure about other formats). And it's regardless of torrent checking successfully for 100%. I cannot imagine also the case when user expects to have the old files and the new files with the same name and in the same place.
You are right about catastrophic consequences for truncating good files with malicious content. But I believe, it's can be totally negated if you make truncate check after file is loaded and confirmed, not at the start of checking. Then you can potentially lost info during loading only for already overwritten part - or the full file, but only after the new file is confirmed. But in that case you already lost it anyway, truncated or not.
Only potentially useful current outcome which isn't covered in 1 and 2 is if you have a minor damage to your files, and your files are insensitive to modification at all. But what can it be? I can imagine only non-archived docs? Looks quite a minor case, especially comparing with inevitable and guaranteed corruption of any archive file in such overwrite.

@SmilingSpectre commented on GitHub (Dec 10, 2024): > There's a long thread somewhere on github on this issue. my recollection of the main gist is that you may have files you care about in your download folder, and one torrent (that perhaps shouldn't be trusted) can wipe them out. Without truncating files, the damage done is proportional to how long it takes you to realize it's a bad/malicious torrent. Sorry for being late to the event, but I realized the problem just now, and cannot understand fully your point of view. How I see the issue: 1. If you overwrite the old file with the new, lesser one, you always lose the old file anyway. Well, you still may have the last 2kb of old .zip archive, or second half of jpg image, but what use of it? But simultaneously it _will_ throw errors for the new files (I am sure about archives, not so sure about other formats). And it's regardless of torrent checking successfully for 100%. I cannot imagine also the case when user _expects_ to have the old files and the new files with the same name and in the same place. 2. You are right about catastrophic consequences for truncating good files with malicious content. But I believe, it's can be totally negated if you make truncate check _after_ file is loaded and confirmed, not at the start of checking. Then you can potentially lost info during loading only for already overwritten part - or the full file, but only after the new file is confirmed. But in that case you already lost it anyway, truncated or not. 3. Only potentially useful current outcome which isn't covered in 1 and 2 is if you have a minor damage to your files, and your files are insensitive to modification at all. But what can it be? I can imagine only non-archived docs? Looks quite a minor case, especially comparing with inevitable and guaranteed corruption of any archive file in such overwrite.

Rows
Columns

Torrent rechecks to 100% despite appended data - files are not truncated #7990

qBittorrent version and Operating System

What is the problem

What is the expected behavior

Steps to reproduce

Extra info(if any)