mirror of
https://github.com/qbittorrent/qBittorrent.git
synced 2026-03-02 22:57:32 -05:00
encoding issue in directory name with Cyrillic letter й #14935
Originally created by @kambala-decapitator on GitHub (Sep 2, 2023).
qBittorrent & operating system versions
qBittorrent: 4.5.5 x64
Operating system: Windows 10 Pro 22H2 (10.0.19045) x64
Qt: 5.15.10
libtorrent-rasterbar: 1.2.19
What is the problem?
I have a torrent that contains a directory named "Мод + русификатор одним файлом". I already have all the files on disk, so I checked "skip hash check" when adding the torrent. However, I received a message that not all files were present on disk, so I ran a hash check and started the download. I also noted which files were absent, as that was strange. After the download finished I went to the directory to check the files and got quite surprised: I ended up with 2 identically named directories! That should definitely be impossible, so I went to PowerShell to check the directory contents there, and then it became clear (pasting as text doesn't make the issue clear):
The strange-looking filename was created today (see the date) by qBt. I originally created this directory manually on macOS, as I'm the author of this torrent. The disk is exFAT, if it matters.
Steps to reproduce
Additional context
Looks like the issue is the letter й, because it can be formed in 2 ways: as the letter itself or using the Unicode combining character feature (https://en.wikipedia.org/wiki/Combining_character).
Also noticed the same issue in another folder of the same torrent:
Never had such issue on macOS either with qBt or Transmission.
I tried creating directory with й letter using Explorer & Total Commander and they use the letter itself, not the combined version.
It's also strange that in the screenshot above the freshly created directory is listed at the very end instead of in alphabetical order.
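For reference, the two encodings of й described above can be illustrated with a minimal Python sketch. It is not taken from the torrent itself: the two strings are built by hand from the code points U+0439 (the letter on its own) and U+0438 followed by U+0306 (и plus a combining breve), on the assumption that these are the two forms involved here.

```python
import unicodedata

precomposed = "\u0439"        # й stored as one code point
decomposed = "\u0438\u0306"   # и followed by a combining breve

for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
    # Show the code point count, the UTF-8 bytes and the Unicode names.
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), text.encode("utf-8").hex(), names)

# The two strings render the same but compare as different names.
print(precomposed == decomposed)
```

Both strings look identical on screen, which would explain why the two directories appear to have the same name while being distinct to the filesystem.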
My torrent: Median XL 2017 + Legacy MXL mods.zip
Looks related to #12506
Log(s) & preferences file(s)
No response
@glassez commented on GitHub (Sep 2, 2023):
So what does qBittorrent have to do with it? It just creates files with exactly the same names as specified in the torrent file.
This is simply an incompatibility of the ways in which some characters are encoded between different operating systems. (If there are no problems somewhere, then the guys from Apple will definitely come up with something.)
@kambala-decapitator commented on GitHub (Sep 2, 2023):
the torrent was created by Transmission/2.94 as I see in the header. I can't tell you whether it's some specific Transmission behavior/bug or macOS or "guys from Apple" or even Windows issue or w/e else, but saying "we don't care about compatibility" doesn't sound like a healthy approach, especially since qBt on macOS doesn't have this issue.
I don't think it's a big deal to perform unicode normalization on file names (which kinda makes sense I'd say).
@glassez commented on GitHub (Sep 2, 2023):
qBt has no issue according to your report. What "compatibility" do you talk about? There is only incompatibility between operating systems so they create different characters when you type й on your keyboard. For qBt, Transmission etc. they are different file names.
@glassez commented on GitHub (Sep 2, 2023):
What kind of "unicode normalization" are you talking about? Do you mean some well-known method? Or do you want to offer your own algorithm?
@glassez commented on GitHub (Sep 2, 2023):
I really believe it is a "bug" to represent the Russian letter й with two characters, since that mechanism is intended for adding diacritics to letters, but й is an independent letter in Russian (not и with the diacritic mark ˘).
@kambala-decapitator commented on GitHub (Sep 2, 2023):
qBt behaves inconsistently on macOS and Windows. Why do you think it's not a bug? If the macOS version of qBt had created this duplicate folder with the combined й character just like the Windows version did, the behavior would have been consistent (albeit unexpected). Maybe it's a Qt bug, I don't know.
yes, there're standard algorithms. For example, NSString from macOS SDK can utilize it. https://unicode.org/reports/tr15/
it might be a bug in Transmission that it produces 2 characters instead of one. But the process itself is not a bug, it's a feature of Unicode, please see the link given above.
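To make the normalization idea concrete, here is a small sketch using Python's standard unicodedata module, which implements the normalization forms from the report linked above. The helper name same_name is hypothetical, and this is only an illustration of the technique, not how qBittorrent, Qt or NSString handle file names.

```python
import unicodedata

def same_name(a: str, b: str) -> bool:
    """Compare two file names after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Build both forms of the directory name explicitly.
name_nfc = unicodedata.normalize("NFC", "Мод + русификатор одним файлом")
name_nfd = unicodedata.normalize("NFD", name_nfc)

print(name_nfc == name_nfd)           # raw comparison of the two forms
print(same_name(name_nfc, name_nfd))  # comparison after normalization
```

The raw comparison treats the two forms as different names, while the normalized comparison treats them as the same one.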
@glassez commented on GitHub (Sep 2, 2023):
According to OP, qBittorrent doesn't create two folders, it creates the single one, the second one does already exist. So why should I think it is a bug? Of course, I have not seen the contents of the .torrent file in question, but it is unlikely that qBittorrent replaces the й character contained there with a combination of и and ˘.
You don't have duplicate folder on macOS since the names of files on disk are exactly the same as the ones in .torrent file. I suppose you could safely delete on Windows the folder with mismatched name.
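One way to check which form a .torrent file actually contains is to look at its raw bytes. The sketch below is only an assumption-laden illustration: it assumes the paths are stored as UTF-8, the function name count_forms is hypothetical, and the sample data is a hand-built bencoded string rather than the torrent from this report. A plain byte search can also produce false hits inside the binary piece hashes, so a result from a real file is a hint rather than proof.

```python
# UTF-8 byte sequences for the two forms of й.
PRECOMPOSED = "\u0439".encode("utf-8")        # b"\xd0\xb9"
DECOMPOSED = "\u0438\u0306".encode("utf-8")   # b"\xd0\xb8\xcc\x86"

def count_forms(raw: bytes) -> dict:
    """Count occurrences of each form of й in raw torrent bytes (hypothetical helper)."""
    return {
        "precomposed": raw.count(PRECOMPOSED),
        "decomposed": raw.count(DECOMPOSED),
    }

def bencode_str(text: str) -> bytes:
    """Encode one string the way bencoding stores it: <byte length>:<bytes>."""
    data = text.encode("utf-8")
    return str(len(data)).encode("ascii") + b":" + data

# Hand-built samples: the same word in precomposed and decomposed form.
sample_single = bencode_str("фа\u0439л")
sample_combined = bencode_str("фа\u0438\u0306л")

print(count_forms(sample_single))
print(count_forms(sample_combined))
```

For a real file, the bytes could be read with something like pathlib.Path("example.torrent").read_bytes(), where the file name is a placeholder.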
@glassez commented on GitHub (Sep 2, 2023):
Does macOS use Unicode currently? If so, Transmission should just get filenames from OS as-is.
@luzpaz commented on GitHub (Sep 17, 2023):
@kambala-decapitator any follow-up ?
@kambala-decapitator commented on GitHub (Oct 7, 2023):
it's provided in the OP
yes, macOS uses Unicode
Made some tests with qBt and Transmission on macOS and Windows. Looks like the main difference is how an app processes filesystem paths internally: qBt on both platforms uses Qt, as does Transmission for Windows, but Transmission for macOS uses the native macOS SDK (the Foundation framework, I believe). So qBt on either platform and Transmission for Windows always create a torrent file with a single й, but the latest Transmission for macOS creates a torrent with the character decomposed into 2. Since even Transmission builds for different platforms are incompatible with each other, I'd rather view this as a Transmission bug.
This is how a torrent created in Transmission for macOS looks in Transmission for Windows:

I also noticed that even terminal output of ls under macOS uses decomposed character:
maybe that's simply how NSString (or probably filesystem driver itself) stores unicode characters under the hood.
but anyway, it'd be great to have such compatibility between qBt for non-macOS and Transmission for macOS.
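As a rough idea of what such compatibility could look like, the sketch below matches a name from a torrent against names that already exist on disk by comparing their NFC forms, and reuses the existing spelling when one matches. This is purely hypothetical: the function match_existing is made up for illustration and says nothing about how qBittorrent or libtorrent resolve paths.

```python
import unicodedata
from typing import Iterable, Optional

def match_existing(wanted: str, existing: Iterable[str]) -> Optional[str]:
    """Return the existing name equal to `wanted` after NFC normalization, if any."""
    target = unicodedata.normalize("NFC", wanted)
    for name in existing:
        if unicodedata.normalize("NFC", name) == target:
            return name  # keep the spelling that is already on disk
    return None

# Directory already on disk uses the single-character й.
on_disk = ["Мод + русификатор одним фа\u0439лом", "readme.txt"]
# Name from the torrent uses и + combining breve.
from_torrent = "Мод + русификатор одним фа\u0438\u0306лом"

found = match_existing(from_torrent, on_disk)
print(found == on_disk[0])
```

In practice the list of existing names could come from os.listdir on the parent directory, so a second look-alike directory would not be created.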
test torrent files for reference: torrents.zip
@luzpaz commented on GitHub (Aug 14, 2024):
@glassez 👆