[Enhancement]: Direct MP3 Stream with server-side seeking (for ESPHome / Dumb Clients) #3200

Open
opened 2026-02-20 11:01:23 -05:00 by deekerman · 3 comments
Owner

Originally created by @Randalix on GitHub (Jan 24, 2026).

Type of Enhancement

Server Backend

Describe the Feature/Enhancement

I would like a new query parameter on the existing /stream endpoint (e.g., ?forceDirect=true or ?simple=true) that bypasses HLS playlist generation.
When this parameter is present, the server should:
Accept the startTime parameter (in seconds).
Use FFmpeg to seek server-side to that timestamp.
Transcode or remux the audio on the fly to the requested format (e.g., MP3).
Pipe the output directly to the HTTP response body as a continuous stream (with Content-Type: audio/mpeg) instead of returning a JSON object or an m3u8 playlist.
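
To make the request shape concrete, here is a minimal sketch of how the two parameters could be read and validated. It is written in Python for illustration and is not the server's actual code. The parameter name forceDirect is only one of the candidates mentioned above, and DirectStreamRequest and parse_stream_query are hypothetical names.

```python
import math
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class DirectStreamRequest:
    direct: bool        # True when the client asked to bypass HLS
    start_time: float   # resume position in seconds


def parse_stream_query(url: str) -> DirectStreamRequest:
    """Read the (hypothetical) forceDirect and startTime parameters from a URL."""
    query = parse_qs(urlsplit(url).query)
    direct = query.get("forceDirect", ["false"])[0].lower() == "true"
    try:
        start_time = float(query.get("startTime", ["0"])[0])
    except ValueError:
        start_time = 0.0
    # Fall back to 0 for negative, NaN, or infinite values instead of
    # handing them to FFmpeg.
    if not math.isfinite(start_time) or start_time < 0:
        start_time = 0.0
    return DirectStreamRequest(direct=direct, start_time=start_time)
```

Requests without the flag would keep going through the existing HLS path, so current clients should see no change in behavior.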

Why would this be helpful?

This enhancement would open up Audiobookshelf to a wide range of low-power and DIY hardware devices that are currently incompatible with the platform's resume features.

Specifically:

Hardware Compatibility: Enables support for ESPHome (ESP32) media players, Squeezebox/LMS bridges, and older DLNA renderers that require a direct audio stream URL and cannot process HLS playlists or JSON responses.

Bandwidth Efficiency: Server-side seeking allows the client to start streaming exactly from the resume point, rather than downloading the entire file from the beginning just to discard the first 5 hours of audio.

Home Automation Integration: It makes Audiobookshelf the perfect backend for Home Assistant-based smart speakers, allowing users to seamlessly switch between a phone app and a physical speaker in the kitchen or bedroom without losing their place in the book.
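
To put the bandwidth point in numbers, here is a rough calculation of how much data a client would download and throw away when it cannot seek. The 128 kbps constant-bitrate figure is an assumption for illustration, not a measurement, and skipped_bytes is a hypothetical helper.

```python
def skipped_bytes(resume_seconds: float, bitrate_kbps: float) -> int:
    """Bytes of audio before the resume point at a constant bitrate."""
    return int(resume_seconds * bitrate_kbps * 1000 / 8)


# Five hours into a book encoded at an assumed 128 kbps:
wasted = skipped_bytes(5 * 3600, 128)
print(wasted)  # 288000000 bytes, i.e. 288 MB
```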

Future Implementation (Screenshot)

I envision this as a lightweight extension of the existing transcoding engine. The server already handles transcoding logic; the difference here is the output target.

Proposed Logic:
API: Add an optional boolean query parameter (e.g., ?forceDirect=true; the exact name is open) to the /stream endpoint.
Controller: Inside the stream handler, if this flag is detected, bypass the HLS segmenter/manifest generation.
FFmpeg: Spawn an ffmpeg process that seeks to startTime, passing -ss before the input -i so that FFmpeg seeks within the input instead of decoding and discarding everything before that point.
Example command: ffmpeg -ss {startTime} -i {source} -vn -c:a libmp3lame -f mp3 pipe:1
Response: Set the Content-Type header to audio/mpeg (or corresponding format) and pipe the ffmpeg stdout directly to the HTTP response stream.
This approach ensures backward compatibility for all existing HLS clients while adding a "simple mode" for hardware players with minimal code changes.
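
The FFmpeg and Response steps above can be sketched roughly as follows. This is a Python illustration of the idea, not the server's code: build_ffmpeg_args and stream_stdout are hypothetical names, and the argument list simply mirrors the example command above.

```python
import subprocess
from typing import Iterator, List


def build_ffmpeg_args(source: str, start_time: float) -> List[str]:
    """Build the argument list for the example command in the proposal."""
    return [
        "ffmpeg",
        "-ss", f"{start_time:.3f}",  # placed before -i, so it seeks on the input
        "-i", source,
        "-vn",                       # drop any video stream (e.g. cover art)
        "-c:a", "libmp3lame",
        "-f", "mp3",
        "pipe:1",                    # write the encoded audio to stdout
    ]


def stream_stdout(args: List[str], chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield a child process's stdout in chunks as they become available."""
    proc = subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    try:
        while True:
            chunk = proc.stdout.read1(chunk_size)
            if not chunk:  # empty bytes means the process closed stdout
                break
            yield chunk
    finally:
        # Also runs if the consumer stops early (e.g. the client disconnects
        # and the generator is closed), so the encoder is not left running.
        if proc.poll() is None:
            proc.kill()
        proc.stdout.close()
        proc.wait()
```

An HTTP handler would set Content-Type: audio/mpeg and write each yielded chunk to the response body. Since the total length is not known in advance, the response would presumably be sent without a Content-Length header.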

Audiobookshelf Server Version

v2.32.1

Current Implementation (Screenshot)

No response

Author
Owner

@nichwall commented on GitHub (Jan 24, 2026):

Is this the same as exposing the raw files directly like in an RSS feed? I believe people are already doing this.

https://github.com/advplyr/audiobookshelf/issues/891

Author
Owner

@Randalix commented on GitHub (Jan 25, 2026):

It is actually different. The RSS feed exposes the raw files 'as is'. This works fine for smart podcast apps that can handle resume logic locally and support codecs like M4B or FLAC.

However, for simple hardware players (like ESP32/ESPHome, older DLNA renderers, or simple web players), we need Server-Side Resume and Transcoding.

Transcoding: My ESP32 only plays MP3. My library contains M4B and FLAC. I need the server to transcode on-the-fly.

Resume: Simple players cannot easily seek within a large remote file. I need the server to accept a ?startTime=123 parameter and stream the audio starting exactly from that second (using ffmpeg seeking), so the client receives a stream starting at t=0 but containing the content from t=123.

Concatenation: For multi-file audiobooks, simple players can't handle playlists well. The server should ideally stitch the files together into one continuous stream.

I implemented this logic via a Python middleware script for now, but having this natively in the /stream endpoint (e.g., ?forceDirect=true&startTime=123) would be amazing!
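
The resume and concatenation points interact: for a multi-file book, a book-wide startTime first has to be mapped to a specific file and an offset within it. The sketch below shows one way to do that mapping. It is not taken from the middleware script mentioned above; locate_start is a hypothetical helper, and the per-file durations are assumed to be known from the library metadata.

```python
from typing import Sequence, Tuple


def locate_start(durations: Sequence[float], start_time: float) -> Tuple[int, float]:
    """Map a book-wide position to (file index, offset within that file)."""
    if not durations:
        raise ValueError("book has no audio files")
    remaining = max(start_time, 0.0)
    for index, duration in enumerate(durations):
        if remaining < duration:
            return index, remaining
        remaining -= duration
    # Position is past the end of the book: clamp to the end of the last file.
    return len(durations) - 1, durations[-1]


# Three files of 60, 30, and 40 minutes; resume at 4000 s into the book.
print(locate_start([3600, 1800, 2400], 4000))  # (1, 400)
```

The stream would then start in that file at that offset and continue through the remaining files in order. FFmpeg's concat demuxer is one possible way to feed them as a single input, though whether it fits the existing transcoding code is an open question.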

Author
Owner

@bauermarkus commented on GitHub (Feb 2, 2026):

+1
