YouTube video description breaks JSON parsing #15914

Open
opened 2026-02-21 08:35:57 -05:00 by deekerman · 2 comments
Owner

Originally created by @ealgase on GitHub (Feb 14, 2019).

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2019.02.08. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • [x] I've verified and I assure that I'm running youtube-dl 2019.02.08

Before submitting an issue make sure you have:

  • [x] At least skimmed through the README, most notably the FAQ and BUGS sections
  • [x] Searched the bugtracker for similar issues including closed ones
  • [x] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • [x] Bug report (encountered problems with youtube-dl)
  • [ ] Site support request (request for adding support for a new site)
  • [ ] Feature request (request for a new functionality)
  • [ ] Question
  • [ ] Other

The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue


If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

Add the -v flag to your command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):

(bionic)ealgase@localhost:~/USB/YouTubearchives$ youtube-dl -v Od39xwKZ1d0
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'Od39xwKZ1d0']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.01.30.1
[debug] Python version 3.6.7 (CPython) - Linux-3.14.0-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1
[debug] Proxy map: {}
[youtube] Od39xwKZ1d0: Downloading webpage
WARNING: [youtube] Od39xwKZ1d0: Failed to parse JSON Invalid \escape: line 1 column 98897 (char 98896)
[youtube] Od39xwKZ1d0: Downloading video info webpage
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://r6---sn-vgqsknel.googlevideo.com/videoplayback?ei=wwdmXILjFNDFwQHbnb3gDQ&clen=31890124&source=youtube&fvip=6&lmt=1536792839139822&expire=1550212131&pcm2=no&mime=video%2Fwebm&c=WEB&itag=303&key=yt6&ipbits=0&dur=290.683&sparams=aitags%2Cclen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpcm2%2Cpl%2Crequiressl%2Csource%2Cexpire&requiressl=yes&id=o-AB-PnNf2R0KuhCyoy4ch-flN-NO-kue7Ks5xxMDwFmDW&ms=au%2Crdu&mt=1550190404&mv=m&gir=yes&pl=26&initcwndbps=1913750&signature=5B29175C0D3764D174A456AC0F2C2C678B6BE413.2C6BC4AD0D521C9ECC1D50C2B668F6466A5B6E5E&keepalive=yes&ip=2601%3A400%3Ac200%3Ac6ef%3Af18d%3Aa14f%3Ae313%3Abf36&mn=sn-vgqsknel%2Csn-vgqs7nlr&txp=5432332&mm=31%2C29&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278%2C298%2C299%2C302%2C303&ratebypass=yes'
[download] Destination: Emojis in Linux Terminal-Od39xwKZ1d0.f303.webm
[download] 100% of 30.41MiB in 00:04
[debug] Invoking downloader on 'https://r6---sn-vgqsknel.googlevideo.com/videoplayback?ei=wwdmXILjFNDFwQHbnb3gDQ&clen=4540185&source=youtube&fvip=6&lmt=1536795394737305&expire=1550212131&pcm2=no&mime=audio%2Fwebm&c=WEB&itag=251&ipbits=0&dur=290.701&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpcm2%2Cpl%2Crequiressl%2Csource%2Cexpire&requiressl=yes&id=o-AB-PnNf2R0KuhCyoy4ch-flN-NO-kue7Ks5xxMDwFmDW&ms=au%2Crdu&mt=1550190404&mv=m&gir=yes&pl=26&initcwndbps=1913750&signature=8DF19FD6D4F0D10E05526B21D1DA3FC9C6950379.DA4177550CDF240A3C2A40AD0303C557548F063C&keepalive=yes&ip=2601%3A400%3Ac200%3Ac6ef%3Af18d%3Aa14f%3Ae313%3Abf36&key=yt6&txp=5411222&mm=31%2C29&mn=sn-vgqsknel%2Csn-vgqs7nlr&ratebypass=yes'
[download] Destination: Emojis in Linux Terminal-Od39xwKZ1d0.f251.webm
[download] 100% of 4.33MiB in 00:00
[ffmpeg] Merging formats into "Emojis in Linux Terminal-Od39xwKZ1d0.webm"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:Emojis in Linux Terminal-Od39xwKZ1d0.f303.webm' -i 'file:Emojis in Linux Terminal-Od39xwKZ1d0.f251.webm' -c copy -map 0:v:0 -map 1:a:0 'file:Emojis in Linux Terminal-Od39xwKZ1d0.temp.webm'
Deleting original file Emojis in Linux Terminal-Od39xwKZ1d0.f303.webm (pass -k to keep)
Deleting original file Emojis in Linux Terminal-Od39xwKZ1d0.f251.webm (pass -k to keep)

Description of your issue, suggested solution and other information

I found the cause of #17940. The video's description contains escape characters that break the JSON parsing. This does not make a simple download of the video fail, but it does cause a failure when the video is deep in the watch history (as shown in #17940).


@NathanJewell commented on GitHub (Mar 18, 2019):

I reviewed this issue, reproduced the reported case, and can draw a stronger conclusion about the source.

The JSON parsing in question happens in _parse_json() in extractor/common.py.

After analyzing the JSON string, I can say the following:

  • The issue is caused by plain-text Unicode escape sequences in the video description
  • By the time the string reaches the parsing function, the sequence contains an extra escape character
  • I believe this may be due to an encoding error that is not the fault of youtube-dl
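
To illustrate the class of failure described above, here is a minimal sketch using only Python's standard json module. The payload is hypothetical (the actual page data is not reproduced in this thread); it only assumes that the description ends up containing a backslash followed by a character that JSON does not accept as an escape.

```python
import json

# Hypothetical payload: the "description" value contains a backslash followed
# by "U", which is not a valid JSON escape (only \" \\ \/ \b \f \n \r \t and
# \uXXXX are allowed). The real page data is not shown in this thread.
raw = '{"description": "emoji code point: \\U0001F600"}'


def try_parse(text):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
        return None, str(err)


parsed, error = try_parse(raw)
print(error)  # the message starts with "Invalid \escape", as in the log above
```

The message has the same form as the warning in the verbose log; the column and character offsets depend on where the stray backslash sits in the input.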

The only resolution I can think of would be to check all incoming JSON for erroneous escape sequences before parsing. That has performance implications and is unnecessary in the majority of cases.
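
One possible way to limit that cost, sketched here as an assumption rather than as anything youtube-dl does, is to attempt the repair only after a normal parse has failed, so well-formed input pays nothing extra. The helper names escape_stray_backslashes and loads_lenient are hypothetical and exist only for this example.

```python
import json
import re

# Matches either a valid JSON escape (captured in group 1) or a lone backslash.
_ESCAPE = re.compile(r'\\(u[0-9a-fA-F]{4}|["\\/bfnrt])|\\')


def escape_stray_backslashes(text):
    """Double every backslash that does not start a valid JSON escape."""
    # Valid escapes are consumed whole and kept; anything else is a lone
    # backslash, which is replaced by an escaped (doubled) backslash.
    return _ESCAPE.sub(lambda m: m.group(0) if m.group(1) else '\\\\', text)


def loads_lenient(text):
    """Parse normally; only on failure retry with stray backslashes escaped."""
    try:
        return json.loads(text)
    except ValueError:
        return json.loads(escape_stray_backslashes(text))


# Hypothetical input: one invalid escape (\U...) next to a valid one (\n).
raw = '{"description": "see \\U0001F600 and \\n"}'
print(loads_lenient(raw)["description"])
```

The repair treats the stray backslash as literal text, which may or may not match what the site intended, and it rewrites the whole document rather than only string values. If the input is broken for some other reason, the second json.loads call still raises.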

I will continue to check out the issue and see if I can confirm the original source of the encoding error.


@ealgase commented on GitHub (Mar 18, 2019):

Thanks for your help!
