Linux BTRFS: Pre-allocate files = zero-fill or sparse #13100

Closed
opened 2026-02-21 23:33:13 -05:00 by deekerman · 5 comments

Originally created by @Arcitec on GitHub (Feb 20, 2022).

When qBittorrent is running on Linux and writing to a BTRFS disk, how does "Pre-allocate disk space for all files" behave?

  • A: Fills the file with zeroes and then downloads the real contents. (Write amplification for SSDs, bad stuff.)
  • B: Allocates a "sparse file" but doesn't write anything at all except the data that is being downloaded. (Would be great.)

It all comes down to which calls libtorrent makes, and whether BTRFS treats those calls as creating sparse files or zero-filled files... If it's zero-filled then I will immediately turn off the feature to save my SSD's lifespan!

My reason for enabling pre-allocation is to reduce fragmentation and to avoid surprises with free disk space suddenly dropping to zero when multiple torrents finish at the same time...
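One way to approach the A-or-B question empirically is to compare a file's apparent size with the space actually allocated for it. The sketch below is only an illustration, not anything taken from qBittorrent or libtorrent: the helper names `allocation_report` and `make_sparse_file` are made up here, and it assumes a Linux system, where `st_blocks` is reported in 512-byte units.

```python
import os
import tempfile


def allocation_report(path):
    """Return (apparent_size, allocated_bytes) for a file.

    On Linux, st_blocks counts 512-byte units regardless of the
    filesystem's own block size.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file of `size` bytes without writing any data (truncate only)."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 64 * 1024 * 1024)
        apparent, allocated = allocation_report(path)
        print(f"apparent={apparent} allocated={allocated}")
```

Pointing `allocation_report` at a file that qBittorrent has just pre-allocated, before any pieces arrive, should show whether space was reserved (allocated close to the apparent size) or not (allocated near zero). It does not by itself show whether zeroes were physically written.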

I've seen prior (locked) discussions that talk about pre-allocation and BTRFS, but none of them explain the current behavior, especially since it was changed by libtorrent patches after those discussions:

  • 2016: https://github.com/qbittorrent/qBittorrent/issues/4967
  • 2020: https://github.com/qbittorrent/qBittorrent/issues/12991
  • 2020 Patch: https://github.com/arvidn/libtorrent/commit/249def418a1f5ff142de60670100f454dbe8be2d

@USBhost commented on GitHub (Feb 22, 2022):

If I understand correctly it's B, at least on XFS. P.S. It also works over NFSv4.


@Arcitec commented on GitHub (Feb 2, 2023):

@USBhost I just got back to this and decided to research it again. The "config.hpp" is so messily written in libtorrent that I had to use an IDE with code folding to try to separate all of the different platform "if/else" sections to see what is going on...

https://github.com/arvidn/libtorrent/blob/master/include/libtorrent/config.hpp

Here's what I concluded from analyzing it:

  • It sets TORRENT_HAS_FALLOCATE to 0 in a few special circumstances, such as certain operating systems or old versions of glibc library on Android, etc.
  • It doesn't seem to ever set it to 0 on Linux, but the file is such a goddamn mess that it's hard to say for sure.
  • If it hasn't been set to anything already, it then sets the TORRENT_HAS_FALLOCATE flag to 1 by default.

Here's what it does when that flag is enabled:

https://github.com/arvidn/libtorrent/blob/bcdf76a6fc0abfb919886af40367d5cfccd921ab/src/file.cpp#L496-L506

So if that flag is working properly on Linux, which it seems to be, then it means that libtorrent ALWAYS allocates files via posix_fallocate (https://man7.org/linux/man-pages/man3/posix_fallocate.3.html).

That call is NEVER sparse. It is ALWAYS writing zeroes to the file.

However, certain filesystems can detect that it's being told to "WRITE 0 IN THE ENTIRE FILE" and just ignore the instruction.

Filesystems that ignore the instruction include the following:

  • ZFS: If COMPRESSION and "file holes/gaps" are enabled on the filesystem, it will detect the zeroes and basically ignore the instruction since it's being told to write nothing (empty blocks). But in other cases it WILL write zeroes to disk: https://www.reddit.com/r/zfs/comments/6spiky/what_does_zfs_on_linux_do_with_fallocate_andor/
  • BTRFS: By default, it uses copy-on-write and block de-duplication and compression. So all empty blocks (that just contain zero-bytes) will be hashed to the same block value ("an empty block") and therefore ignored. Instead of writing the empty blocks to disk, it will just keep track that those blocks are totally empty, so you will NOT get any write amplification on disk when writing empty blocks to files on BTRFS... BUT if you have manually disabled the BTRFS compression and de-duplication etc (maybe even if you disable Copy-on-Write on a file/folder), then it will write zeroes to disk, but most people don't disable that since that's the literal core of what makes BTRFS good.
  • XFS: I didn't research it so I will take your word that it also ignores empty blocks.
  • Other filesystems: It depends on the filesystem. I would highly suspect that Ext4 WILL WRITE zeroes to the disk, since it's an old and very basic filesystem that doesn't have much intelligence.
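Whether a particular filesystem really writes zeroes during pre-allocation can be checked rather than guessed. The posix_fallocate(3) man page notes that glibc implements the call with the fallocate(2) system call and only emulates it by writing into each block when the filesystem lacks support. The sketch below is one possible check, under assumptions: the helper names are invented here, and it relies on the `write_bytes` counter in `/proc/self/io`, which the kernel documentation describes as an attempt to count bytes this process caused to be sent to the storage layer, accounted at page-dirtying time. The counter may be missing on some kernels and may not reflect activity on in-memory filesystems.

```python
import os
import tempfile


def write_bytes_counter():
    """Read this process's write_bytes counter from /proc/self/io.

    Returns None when the counter cannot be read (for example when the
    kernel was built without task I/O accounting).
    """
    try:
        with open("/proc/self/io") as f:
            for line in f:
                key, _, value = line.partition(":")
                if key == "write_bytes":
                    return int(value)
    except OSError:
        return None
    return None


def preallocate(path, size):
    """Preallocate `size` bytes with posix_fallocate.

    Returns (allocated_bytes, write_bytes_delta). The delta is None when
    /proc/self/io is unavailable.
    """
    before = write_bytes_counter()
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size)
        os.fsync(fd)
    finally:
        os.close(fd)
    after = write_bytes_counter()
    delta = None if before is None or after is None else after - before
    return os.stat(path).st_blocks * 512, delta


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        allocated, delta = preallocate(os.path.join(tmp, "pre.bin"), 64 * 1024 * 1024)
        print(f"allocated={allocated} write_bytes_delta={delta}")
```

A delta far smaller than the requested size would suggest the space was reserved without zero-filling from this process; a delta close to the requested size would suggest the zeroes were actually dirtied for writeback. Run it inside a directory on the filesystem in question rather than the default temporary directory.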

I will now close this since the investigation is complete. It's clear that libtorrent doesn't try to allocate sparse files. It tells the OS to zero-fill the files, and it's the filesystem's job to ignore the worthless zeroes and avoid writing to the disk.

The behavior differs on other operating systems, so don't take this as a universal law. I only investigated Linux, and only the filesystem I care about (BTRFS). Windows for example uses an entirely different Windows API to write the files instead.

But overall, it's clear that libtorrent doesn't care about trying to do sparse allocation. From what I hear, sparse allocation is mostly achieved by opening a file descriptor, seeking to the end (target length), writing a zero at the end, and then the OS itself would have to deal with the "gap" (sparse allocation) of the file. Libtorrent doesn't do that at all. It tells the OS to zero-fill the entire file, at least on Linux. It's up to the filesystem to deal with it.
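The seek-and-write approach described above can be sketched as follows. This is a minimal illustration with an invented helper name (`create_sparse_by_seek`), not code from libtorrent; how much space ends up allocated depends on the filesystem.

```python
import os
import tempfile


def create_sparse_by_seek(path, size):
    """Create a `size`-byte file by writing a single zero byte at the end.

    Returns (apparent_size, allocated_bytes). Everything before the last
    byte is a gap that the filesystem may leave unallocated.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    with open(path, "wb") as f:
        f.seek(size - 1)   # jump to the last byte of the target length
        f.write(b"\0")     # one real write; the rest is left as a gap
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        apparent, allocated = create_sparse_by_seek(
            os.path.join(tmp, "seek.bin"), 64 * 1024 * 1024
        )
        print(f"apparent={apparent} allocated={allocated}")
```

Reading from the gap returns zero bytes even though nothing was written there, which is why a sparse file and a zero-filled file look identical to applications.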


@USBhost commented on GitHub (Feb 2, 2023):

I don't remember what the default format settings for XFS are, but I can confirm it works in my setup.

However, what I find interesting on XFS is that when you slowly start filling in that sparse file, it will appear heavily fragmented until complete.

Edit: I think on ext4 fallocate works as expected.


@Arcitec commented on GitHub (Feb 6, 2023):

@USBhost I wasn't able to find the Ext4 answer. But since it's a very basic filesystem which doesn't feature compression, de-duplication, etc, it's almost guaranteed that it writes the zeroes to disk.

Also, it's important not to confuse the fallocate utility with the posix_fallocate() function.

  • fallocate: This is a utility which can do sparse allocation via a flag (-d aka --dig-holes). Sparse is basically where it writes a 0 at the start of the file, and then seeks to the end and writes another 0 there. Then it's up to the filesystem to handle the gap/sparse allocation. Ext4 supports sparse files, meaning that the fallocate utility works.
  • posix_fallocate(): Writes zeroes to disk, with the requested filesize (such as 10 GB for example). It's up to the filesystem to be smart and say "nope" and de-duplicate/compress those zeroes away, if you want to avoid writing a bunch of worthless zeroes to an SSD.
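Whichever tool or call created a file, the gaps themselves can be inspected from user space. The sketch below walks a file with `lseek` and the `SEEK_DATA` / `SEEK_HOLE` whence values, which Linux exposes through Python's `os` module. The function name `data_segments` is invented for this example, and the result depends on the filesystem: one without hole support may simply report the whole file as data.

```python
import errno
import os
import tempfile


def data_segments(path):
    """Return a list of (start, end) byte ranges the filesystem reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # no data at or after offset
                    break
                raise
            # The end of the file counts as a hole, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "holes.bin")
        with open(path, "wb") as f:
            f.truncate(16 * 1024 * 1024)   # 16 MiB with no data written
            f.seek(8 * 1024 * 1024)
            f.write(b"x" * 4096)           # one small data region in the middle
        print(data_segments(path))
```

On a partially downloaded torrent file this gives a rough view of which regions hold data and which are still gaps.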

Libtorrent uses the latter method. It doesn't do sparse allocation (it doesn't seek in the files to make sparse gaps/holes). In other words, libtorrent writes zeroes to the WHOLE file when it pre-allocates a file on disk.

As far as I am aware, Ext4 doesn't have any de-duplication or compression features, so Ext4 would be a filesystem that leads to a lot of pointless disk writes if you use qBittorrent's pre-allocate option.

The others we've mentioned handle the zero-bytes gracefully and just ignore them, in the mentioned scenarios. :)


@ZLima12 commented on GitHub (Feb 29, 2024):

Contrary to the theories here, I think that ext4's fallocate implementation does it the efficient way, not writing zeroes to disk. In my experience, running fallocate on an ext4 filesystem, even for very large files, is an almost instantaneous operation. This would not be possible if it wrote zeroes to disk.

Yes, ext4 is simpler than more modern filesystems, but that is actually a strength when it comes to things like fallocate. Since it's not copy-on-write, it can simply mark regions of the block device as reserved space for whatever file you allocated space for. There may or may not be a way to do something like this for your favorite CoW filesystem, but with ext4 there's less to worry about when implementing something like this.
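The timing observation above is easy to reproduce. The sketch below compares how long `posix_fallocate` takes against explicitly writing the same number of zero bytes; the helper names are invented for this example, and the numbers will vary with the filesystem, the device, and caching, so treat the result as a hint rather than a proof.

```python
import os
import tempfile
import time


def time_posix_fallocate(path, size):
    """Return the seconds taken to preallocate `size` bytes and fsync."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        start = time.perf_counter()
        os.posix_fallocate(fd, 0, size)
        os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def time_zero_fill(path, size, chunk=1024 * 1024):
    """Return the seconds taken to write `size` zero bytes and fsync."""
    zeros = bytes(chunk)
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested, so track the count.
            written = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= written
        os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    size = 256 * 1024 * 1024
    with tempfile.TemporaryDirectory() as tmp:
        t_alloc = time_posix_fallocate(os.path.join(tmp, "alloc.bin"), size)
        t_fill = time_zero_fill(os.path.join(tmp, "fill.bin"), size)
        print(f"posix_fallocate: {t_alloc:.4f}s  zero-fill: {t_fill:.4f}s")
```

If the preallocation time stays roughly constant as `size` grows while the zero-fill time scales with it, that would be consistent with the filesystem reserving space without writing zeroes.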
