Invalid \escape error while trying to download a youtube page #1877

Closed
opened 2026-02-20 22:18:47 -05:00 by deekerman · 2 comments

Originally created by @vis15 on GitHub (Feb 9, 2014).

command:
youtube-dl -v http://www.youtube.com/channel/UCjbfdFm4786wi6hFXjDKxSA/videos

Output that I get:

[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'http://www.youtube.com/channel/UCjbfdFm4786wi6hFXjDKxSA/videos']
[debug] Encodings: locale 'UTF-8', fs 'UTF-8', out 'UTF-8', pref: 'UTF-8'
[debug] youtube-dl version 2014.02.08.2
[debug] Python version 2.7.5+ - Linux-3.11.0-15-generic-x86_64-with-LinuxMint-16-petra
[debug] Proxy map: {}
[youtube:channel] UCjbfdFm4786wi6hFXjDKxSA: Downloading webpage
[youtube:channel] UCjbfdFm4786wi6hFXjDKxSA: Downloading page #1
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "../youtube-dl/__main__.py", line 18, in <module>
  File "../youtube-dl/youtube_dl/__init__.py", line 800, in main
  File "../youtube-dl/youtube_dl/__init__.py", line 790, in _real_main
  File "../youtube-dl/youtube_dl/YoutubeDL.py", line 982, in download
  File "../youtube-dl/youtube_dl/YoutubeDL.py", line 493, in extract_info
  File "../youtube-dl/youtube_dl/extractor/common.py", line 158, in extract
  File "../youtube-dl/youtube_dl/extractor/youtube.py", line 1596, in _real_extract
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 40052 (char 40051)


@stwl5 commented on GitHub (Feb 9, 2014):

I got exactly the same error today and found that it is caused by an uppercase `\U` (instead of a lowercase `\u`) at the start of a Unicode escape sequence, because Python's JSON decoder (my Python version is also 2.7.x) only recognizes the lowercase variant. As a workaround, I replaced `page = json.loads(page)` with `page = json.loads(page.replace('\\U','\\u'))` in "youtube_dl/extractor/youtube.py" on line 1596. Maybe the same replacement should be made in the other information extractors inside youtube.py, because I only tried it with "/videos" pages. I also tried the URL you posted above, and it worked after this fix. :)
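For anyone who wants to reproduce the failure without downloading a channel page, here is a minimal sketch. The sample string `raw` and the helper `try_parse` are made up for illustration and are not taken from youtube-dl or from YouTube's actual response.

```python
import json

# Hypothetical sample: a JSON document whose string value uses an
# uppercase \U escape, which is not part of the JSON grammar.
raw = '{"title": "\\U0001F600"}'


def try_parse(text):
    """Return the parsed value, or the error message if parsing fails."""
    try:
        return json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        return str(exc)


result = try_parse(raw)
print(result)  # an "Invalid \escape" message rather than a parsed dict
```

JSON only defines the lowercase `\uXXXX` form, so the decoder rejects the backslash followed by `U`.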


@phihag commented on GitHub (Feb 9, 2014):

Thank you for the report, this will be fixed in the next version. Since I'm currently mobile with a very spotty connection, the release may take some time, but should be out within 10 hours.

Simply replacing `\U` with `\u` will not work: with `\U`, YouTube describes a character by its code point using 8 hexadecimal digits. In contrast, `\u` expects only 4 hexadecimal digits, so after a plain replacement the decoder reads the first 4 digits as the character and leaves the remaining 4 behind as superfluous literal digits, giving an incorrect result.
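The difference can be sketched as follows. This is only an illustration under stated assumptions, not necessarily the fix that shipped in youtube-dl: the function names `naive_fix` and `expand_uppercase_escapes` and the sample string are hypothetical. The second function rewrites each 8-digit `\U` escape into the equivalent JSON `\u` escape (a surrogate pair for code points above U+FFFF) before decoding.

```python
import json
import re


def naive_fix(s):
    """The plain replacement discussed above (gives wrong characters)."""
    return s.replace('\\U', '\\u')


def expand_uppercase_escapes(s):
    """Rewrite each \\UXXXXXXXX escape as valid JSON \\uXXXX escapes."""
    def repl(m):
        cp = int(m.group(1), 16)
        if cp > 0x10FFFF:
            return m.group(0)  # not a valid code point; leave untouched
        if cp <= 0xFFFF:
            return '\\u%04x' % cp
        # Code points above U+FFFF need a UTF-16 surrogate pair in JSON.
        cp -= 0x10000
        high = 0xD800 + (cp >> 10)
        low = 0xDC00 + (cp & 0x3FF)
        return '\\u%04x\\u%04x' % (high, low)
    return re.sub(r'\\U([0-9a-fA-F]{8})', repl, s)


# Hypothetical sample input containing an 8-digit escape for U+1F600.
raw = '{"title": "smile \\U0001F600"}'

wrong = json.loads(naive_fix(raw))['title']
right = json.loads(expand_uppercase_escapes(raw))['title']

print(repr(wrong))  # U+0001 followed by the literal text "F600"
print(repr(right))  # the single character U+1F600
```

One limitation of this sketch: the regular expression does not check whether the backslash before `U` is itself escaped, so input containing a literal `\\U` followed by 8 hex digits would be rewritten too.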
