Videosort not reading special characters #102

Open
opened 2026-02-28 04:25:06 -05:00 by deekerman · 11 comments

Originally created by @sdburrows on GitHub (Apr 11, 2024).

Originally assigned to: @dnzbk on GitHub.

Is there already an issue for this request?

  • I have checked older issues, open and closed

Describe your issue

NZBGET version 23.1-testing-20240313
VideoSort version 10.1

There might be an issue with special character support. For example, the show Shōgun has been renamed to "Sh Gun".


@sdburrows commented on GitHub (Apr 16, 2024):

Additional log details:

Video Sort: 'charmap' codec can't encode character '\u014d' in position 26: character maps to <undefined>
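The message above is what Python raises when text containing `ō` (U+014D) is encoded with a Windows legacy code page such as cp1252, which has no mapping for that character. A minimal reproduction, assuming cp1252 is the code page in play (a later traceback in this thread mentions `cp1252.py`); the sample string is illustrative:

```python
# Reproduce the 'charmap' error from the log with a title containing U+014D.
title = "Shōgun"

try:
    title.encode("cp1252")  # cp1252 has no slot for 'ō'
    error_text = ""
except UnicodeEncodeError as exc:
    error_text = str(exc)

print(error_text)

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = title.encode("utf-8")
print(utf8_bytes)
```

The position reported in the error is the index of the offending character in the string being encoded, so it differs between the log lines depending on how long the printed message was.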


@dnzbk commented on GitHub (Apr 17, 2024):

Thanks for the report, but I wasn't able to reproduce the problem completely.
I got the exact error described in your logs, which is not difficult to fix, but I could not reproduce the problem with the final file name after processing.
Could you provide more info about your system, Python version, and VideoSort settings?
Getting the NZB file you used would also be very helpful.


@sdburrows commented on GitHub (Apr 17, 2024):

@dnzbk, I am on Python v3.12.3 and VideoSort v10.2. The series format setting is:

%sn\Season %0s\%sn - S%0sE%0e - %en [%qss][%qf][%qrg]
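For readers unfamiliar with this format string, here is a toy expander. The meaning of each specifier is inferred from the folder names quoted in this thread, and the `expand` function and `fields` dictionary are hypothetical; this is a sketch, not VideoSort's implementation.

```python
import re

# Specifier meanings are inferred from the outputs quoted in this thread (assumption).
fields = {
    "sn": "Shogun 2024",   # series name
    "0s": "01",            # zero-padded season number
    "0e": "01",            # zero-padded episode number
    "en": "Anjin",         # episode name
    "qss": "2160p",        # screen size
    "qf": "Web",           # source format
    "qrg": "playWEB",      # release group
}

def expand(template: str, values: dict[str, str]) -> str:
    """Replace each %spec with its value; longer specifiers are tried first."""
    specs = sorted(values, key=len, reverse=True)
    pattern = re.compile("%(" + "|".join(re.escape(s) for s in specs) + ")")
    return pattern.sub(lambda m: values[m.group(1)], template)

template = r"%sn\Season %0s\%sn - S%0sE%0e - %en [%qss][%qf][%qrg]"
print(expand(template, fields))
```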

NZB: https://www.transfernow.net/dl/20240417hL3xLYHS

Final folder output is:

Sh Gun 2024\Season 01\Sh Gun 2024 - S01E01 - Anjin [2160p][Web][playWEB]
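One way an output like this could arise, offered as an assumption rather than a diagnosis: if some step replaces every non-ASCII character with a space and the name is then title-cased, `Shōgun` becomes `Sh Gun`. Decomposing the accented letter first yields `Shogun` instead. Both helper functions below are hypothetical and are not taken from VideoSort or guessit.

```python
import re
import unicodedata

def drop_non_ascii(name: str) -> str:
    """Replace each non-ASCII character with a space, then title-case."""
    return re.sub(r"[^\x00-\x7f]", " ", name).title()

def fold_accents(name: str) -> str:
    """Split accented letters into base letter + combining mark, then drop the marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(drop_non_ascii("Shōgun 2024"))  # Sh Gun 2024
print(fold_accents("Shōgun 2024"))    # Shogun 2024
```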


@dnzbk commented on GitHub (Apr 18, 2024):

@sdburrows

Couldn't confirm. This is what I got:
\complete\series\Shogun 2024\Season 01\Shogun 2024 - S01E01 - Anjin [2160p][Web][playWEB].mkv
I tested on Windows 11 and macOS Ventura with NZBGet v23.1 and VideoSort 10.2.
I also added a new test case to check special characters: https://github.com/nzbgetcom/Extension-VideoSort/blob/f566dd40eda65e2b63770e41e05391d5df57cf95/testdata.json#L196

Is that the same NZB you used previously? I don't see any special characters like ō in the name.

I prepared a new version of VideoSort, in which I updated the guessit library to version 3.8 and fixed the bug you described in your logs. It would be really great if you could test it on your system.
VideoSort 10.3.zip: https://github.com/nzbgetcom/nzbget/files/15022400/VideoSort.zip

@sdburrows commented on GitHub (Apr 23, 2024):

@dnzbk

I have updated VideoSort to v10.3.

This is the latest error:

INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: UnicodeEncodeError: 'charmap' codec can't encode character '\u014d' in position 60: character maps to <undefined>
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: return codecs.charmap_encode(input,self.errors,encoding_table)[0]
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: File "C:\Users\XXX\AppData\Local\Programs\Python\Python312\Lib\encodings\cp1252.py", line 19, in encode
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: print('[INFO] Sorting inner files for earlier fake detection for %s' % NzbName)
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: File "D:\NZBGet\Scripts\FakeDetector\main.py", line 380, in main
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: main()
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: File "D:\NZBGet\Scripts\FakeDetector\main.py", line 421, in <module>
INFO | Wed Apr 24 2024 07:34:45 | Fake Detector: Traceback (most recent call last):
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: UnicodeEncodeError: 'charmap' codec can't encode character '\u014d' in position 60: character maps to <undefined>
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: return codecs.charmap_encode(input,self.errors,encoding_table)[0]
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: File "C:\Users\XXX\AppData\Local\Programs\Python\Python312\Lib\encodings\cp1252.py", line 19, in encode
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: print('[INFO] Sorting inner files for earlier fake detection for %s' % NzbName)
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: File "D:\NZBGet\Scripts\FakeDetector\main.py", line 380, in main
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: main()
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: File "D:\NZBGet\Scripts\FakeDetector\main.py", line 421, in <module>
INFO | Wed Apr 24 2024 07:34:42 | Fake Detector: Traceback (most recent call last):
INFO | Wed Apr 24 2024 07:34:42 | Successfully downloaded Shōgun.2024.S01E04.The.Eightfold.Fence.2160p.HULU.WEB-DL.DDP5.1.H.265-NTb\BxBQtP8TBQBJAcEcdoJdq.part63.rar
INFO | Wed Apr 24 2024 07:34:39 | Collection Shogun.2024.S01E10.2160p.WEB.H265-SUCCESSFULCRAB added to history
ERROR | Wed Apr 24 2024 07:34:39 | Post-process-script VideoSort for Shogun.2024.S01E10.2160p.WEB.H265-SUCCESSFULCRAB failed (terminated with unknown status)
INFO | Wed Apr 24 2024 07:34:39 | Video Sort: ModuleNotFoundError: No module named 'guessit'
INFO | Wed Apr 24 2024 07:34:39 | Video Sort: import guessit
INFO | Wed Apr 24 2024 07:34:39 | Video Sort: File "D:\NZBGet\Scripts\VideoSort\main.py", line 33, in <module>
INFO | Wed Apr 24 2024 07:34:39 | Video Sort: Traceback (most recent call last):
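The Fake Detector traceback above passes through `cp1252.py`, which suggests that `print()` was writing to a stream encoded as cp1252. A minimal sketch of that failure and two common workarounds, using in-memory streams as stand-ins for the script's standard output. The helper names are hypothetical, and none of this is taken from the extensions' source.

```python
import io

def make_stream(encoding: str, errors: str = "strict") -> io.TextIOWrapper:
    """Build an in-memory text stream that mimics a console with this encoding."""
    return io.TextIOWrapper(io.BytesIO(), encoding=encoding, errors=errors)

def try_print(stream: io.TextIOWrapper, text: str):
    """Print to the stream; return the bytes written, or the encoding error."""
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return stream.buffer.getvalue()

message = "[INFO] Sorting inner files for Shōgun.2024.S01E04"

# Strict cp1252 fails, as in the log.
strict_result = try_print(make_stream("cp1252"), message)
# Same encoding, but unencodable characters are escaped instead of raising.
lossy_result = try_print(make_stream("cp1252", "backslashreplace"), message)
# UTF-8 can encode the character directly.
utf8_result = try_print(make_stream("utf-8"), message)

print(type(strict_result).__name__)
print(lossy_result)
print(utf8_result)
```

In a real script, the rough equivalent would be calling `sys.stdout.reconfigure(encoding="utf-8")` (Python 3.7+) early on, or setting `PYTHONIOENCODING=utf-8` or `PYTHONUTF8=1` in the environment before the interpreter starts.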

The final file was neither renamed nor moved to its final destination. It also looks like there is a problem with Fake Detector.

I am sharing two more NZBs that may be problematic. Apologies, I can't test them myself, as VideoSort v10.3 is broken for me.

https://www.transfernow.net/dl/20240423Pm1BSvf2


@dnzbk commented on GitHub (Apr 25, 2024):

@sdburrows

I released new versions of VideoSort and FakeDetector. Feel free to update to the new versions via Extension Manager.
At least the UnicodeEncodeError should go away. I hope the file name problem is gone too.
But again, I was unable to reproduce it, unfortunately.


@sdburrows commented on GitHub (Apr 25, 2024):

@dnzbk, thanks for the update. I just tried it. Still the same: the special character is replaced with a space. But at least the errors are gone.

The file I used for testing was: https://www.transfernow.net/dl/20240425ZbbLKCBd


@sdburrows commented on GitHub (Sep 12, 2024):

I seem to be having issues with VideoSort not renaming files properly. Please find a link to two NZBs below. NZBGet managed to download them, but it neither renames the files nor moves them to their final location; they are just dumped in the Completed folder. I believe this issue may be related to special characters: one file has ":" and the other may have "'" in the filename.
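As background for the characters mentioned here: on Windows, `:` is one of the characters that cannot appear in a file name, while `'` is allowed. A small sketch of a sanitizer for the reserved set `< > : " / \ | ? *`. The function name, the replacement character, and the sample titles are hypothetical and do not describe VideoSort's behaviour.

```python
import re

# Characters Windows does not allow in file names, plus ASCII control characters.
_RESERVED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def sanitize_component(name: str, replacement: str = "_") -> str:
    """Make one path component (not a full path) acceptable to Windows."""
    cleaned = _RESERVED.sub(replacement, name)
    return cleaned.rstrip(" .")  # trailing spaces and dots are not allowed either

print(sanitize_component("Title: The Sequel"))  # Title_ The Sequel
print(sanitize_component("It's Fine"))          # It's Fine
```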

There were no errors in the log.

Please take a look: https://www.transfernow.net/dl/20240912fCuGfoNI

@dnzbk commented on GitHub (Sep 12, 2024):

I tested both NZBs and they worked fine for me. The resulting names didn't contain any special symbols at all.
I used the default VideoSort settings. Could you tell me what your settings are so I can try them?

It seems like the problem might be related to your OS's locale settings.
What locale are you using? If it’s not English (United States), could you try enabling Unicode UTF-8?

  • Go to Settings -> Time & language -> Administrative language settings
  • Click Change system locale and turn on Unicode UTF-8 usage
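
To see which encodings Python actually picks up from the OS locale before and after changing this setting, a short diagnostic can help. This sketch uses only standard-library calls; the `encoding_report` name is hypothetical.

```python
import locale
import sys

def encoding_report() -> dict[str, str]:
    """Collect the encodings Python derives from the OS and environment."""
    return {
        # Locale encoding, e.g. cp1252 on many Windows setups.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used by print(); sys.stdout can be None for GUI-less launches.
        "stdout": str(getattr(sys.stdout, "encoding", None)),
        # Encoding used to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # 1 when Python's UTF-8 mode is active (PYTHONUTF8=1 or -X utf8).
        "utf8_mode": str(sys.flags.utf8_mode),
    }

for key, value in encoding_report().items():
    print(f"{key}: {value}")
```

If `preferred` or `stdout` reports a legacy code page such as cp1252, characters like ō cannot be printed without an error handler or a switch to UTF-8.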

@sdburrows commented on GitHub (Sep 13, 2024):

@dnzbk, I have sent you an email to Denis [@] nzbget [.] com


@liv-mrr commented on GitHub (Oct 16, 2024):

NZBGET version 24.4-testing-20241009
VideoSort version 10.3
OS Windows 10

I probably have a related issue with special characters and the OS locale settings.
Files are downloaded but aren't moved to the final destination.
