mirror of
https://github.com/advplyr/audiobookshelf.git
synced 2026-03-02 22:46:55 -05:00
[Enhancement]: Direct MP3 Stream with server-side seeking (for ESPHome / Dumb Clients) #3200
Originally created by @Randalix on GitHub (Jan 24, 2026).
Type of Enhancement
Server Backend
Describe the Feature/Enhancement
I request a new query parameter for the existing /stream endpoint (e.g., ?forceDirect=true or ?simple=true) that bypasses HLS playlist generation.
When this parameter is present, the server should:
Accept the startTime parameter (in seconds).
Use FFmpeg to perform server-side seeking to that timestamp.
Transcode/Remux the audio on-the-fly to the requested format (e.g., MP3).
Pipe the output directly to the HTTP response body as a continuous stream (with Content-Type: audio/mpeg), instead of returning a JSON object or an m3u8 playlist.
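As a rough illustration of the request handling described above, the sketch below parses the two query parameters in Python. The `/stream` path, the `forceDirect` and `startTime` names, and the `DirectStreamRequest` and `parse_stream_query` helpers are assumptions taken from or invented for this request; they are not part of the current Audiobookshelf API.

```python
import math
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class DirectStreamRequest:
    """Hypothetical container for the parsed query parameters."""
    direct: bool        # True when the client asked to bypass HLS
    start_time: float   # resume offset in seconds, never negative


def parse_stream_query(url: str) -> DirectStreamRequest:
    """Extract the proposed forceDirect and startTime parameters from a URL."""
    query = parse_qs(urlsplit(url).query)

    # Accept a few common spellings of "true"; anything else means HLS as usual.
    direct_raw = query.get("forceDirect", ["false"])[-1].strip().lower()
    direct = direct_raw in ("1", "true", "yes")

    # Fall back to the beginning of the book on missing or malformed input.
    try:
        start_time = float(query.get("startTime", ["0"])[-1])
    except ValueError:
        start_time = 0.0
    if not math.isfinite(start_time) or start_time < 0:
        start_time = 0.0

    return DirectStreamRequest(direct=direct, start_time=start_time)


if __name__ == "__main__":
    print(parse_stream_query("/stream?forceDirect=true&startTime=123"))
```

Clamping invalid values to 0 keeps a simple client playing from the start instead of receiving an error it cannot handle; a real implementation might prefer to reject bad input with a 400 response.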
Why would this be helpful?
This enhancement would open up Audiobookshelf to a wide range of low-power and DIY hardware devices that are currently incompatible with the platform's resume features.
Specifically:
Hardware Compatibility: Enables support for ESPHome (ESP32) media players, Squeezebox/LMS bridges, and older DLNA renderers that require a direct audio stream URL and cannot process HLS playlists or JSON responses.
Bandwidth Efficiency: Server-side seeking lets the client start streaming exactly at the resume point, rather than downloading the file from the beginning and discarding everything before that point (for example, the first 5 hours of audio).
Home Automation Integration: It makes Audiobookshelf the perfect backend for Home Assistant-based smart speakers, allowing users to seamlessly switch between a phone app and a physical speaker in the kitchen or bedroom without losing their place in the book.
Future Implementation (Screenshot)
I envision this as a lightweight extension of the existing transcoding engine. The server already handles transcoding logic; the difference here is the output target.
Proposed Logic:
API: Add an optional boolean query parameter (e.g., ?forceDirect=true) to the /stream endpoint.
Controller: Inside the stream handler, if this flag is detected, bypass the HLS segmenter/manifest generation.
FFmpeg: Spawn an ffmpeg process using the startTime for seeking (passing -ss before the input -i for fast seeking).
Command context: ffmpeg -ss {startTime} -i {source} -vn -c:a libmp3lame -f mp3 pipe:1
Response: Set the Content-Type header to audio/mpeg (or corresponding format) and pipe the ffmpeg stdout directly to the HTTP response stream.
This approach ensures backward compatibility for all existing HLS clients while adding a "simple mode" for hardware players with minimal code changes.
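The proposed logic could be sketched roughly as follows, in Python rather than the server's own code. It builds the ffmpeg command shown above with `-ss` placed before `-i`, then yields the encoder's stdout in chunks for an HTTP framework to write into a response with Content-Type: audio/mpeg. The function names, the 128k bitrate, and the chunk size are assumptions for illustration, and the sketch expects an `ffmpeg` binary on the PATH.

```python
import subprocess
from typing import Iterator, List


def build_ffmpeg_args(source: str, start_time: float, bitrate: str = "128k") -> List[str]:
    """Build the ffmpeg argument list for a direct MP3 stream (hypothetical helper)."""
    return [
        "ffmpeg",
        "-hide_banner", "-loglevel", "error",
        "-ss", f"{max(start_time, 0.0):.3f}",  # before -i, so ffmpeg seeks in the input
        "-i", source,
        "-vn",                                  # drop cover art and other video streams
        "-c:a", "libmp3lame",
        "-b:a", bitrate,                        # assumed bitrate, not from the request
        "-f", "mp3",
        "pipe:1",                               # write the encoded audio to stdout
    ]


def stream_mp3(source: str, start_time: float, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield MP3 chunks from ffmpeg; the caller writes them to the response body."""
    proc = subprocess.Popen(
        build_ffmpeg_args(source, start_time),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        # Stop the encoder if the client disconnected before the end of the book.
        if proc.poll() is None:
            proc.kill()
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_args("book.m4b", 123)))
```

Killing the process in the `finally` block matters for this use case: a hardware player that is switched off mid-chapter would otherwise leave an ffmpeg process encoding the rest of the book.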
Audiobookshelf Server Version
v2.32.1
Current Implementation (Screenshot)
No response
@nichwall commented on GitHub (Jan 24, 2026):
Is this the same as exposing the raw files directly like in an RSS feed? I believe people are already doing this.
https://github.com/advplyr/audiobookshelf/issues/891
@Randalix commented on GitHub (Jan 25, 2026):
It is actually different. The RSS feed exposes the raw files 'as is'. This works fine for smart podcast apps that can handle resume logic locally and support codecs like M4B or FLAC.
However, for simple hardware players (like ESP32/ESPHome, older DLNA renderers, or simple web players), we need Server-Side Resume and Transcoding.
Transcoding: My ESP32 only plays MP3. My library contains M4B and FLAC. I need the server to transcode on-the-fly.
Resume: Simple players cannot easily seek within a large remote file. I need the server to accept a ?startTime=123 parameter and stream the audio starting exactly from that second (using ffmpeg seeking), so the client receives a stream whose own timeline starts at t=0 but whose content begins at t=123 of the book.
Concatenation: For multi-file audiobooks, simple players can't handle playlists well. The server should ideally stitch the files together into one continuous stream.
I implemented this logic via a Python middleware script for now, but having this natively in the /stream endpoint (e.g. ?forceDirect=true&startTime=123) would be amazing!
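For the concatenation point, one possible approach (a sketch only, not the middleware script mentioned above) is to map the book-wide startTime onto a file index and an offset within that file, then hand ffmpeg a concat demuxer list that starts there. The helper names are hypothetical, the per-file durations are assumed to be known already, and the `inpoint` directive is used as described in the ffmpeg concat demuxer documentation; how accurately it seeks may depend on the codec.

```python
from typing import List, Sequence, Tuple


def locate_start(durations: Sequence[float], start_time: float) -> Tuple[int, float]:
    """Map a book-wide position to (file index, offset in seconds within that file)."""
    remaining = max(float(start_time), 0.0)
    for index, duration in enumerate(durations):
        if remaining < duration:
            return index, remaining
        remaining -= duration
    # The position lies past the end of the book: nothing left to play.
    return len(durations), 0.0


def build_concat_list(paths: Sequence[str], durations: Sequence[float], start_time: float) -> str:
    """Build the text of an ffmpeg concat demuxer list beginning at start_time."""
    index, offset = locate_start(durations, start_time)
    lines: List[str] = []
    for position, path in enumerate(paths[index:]):
        escaped = path.replace("'", "'\\''")  # escape single quotes inside the path
        lines.append(f"file '{escaped}'")
        if position == 0 and offset > 0:
            # Start partway into the first listed file only.
            lines.append(f"inpoint {offset:.3f}")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(build_concat_list(["part1.mp3", "part2.mp3", "part3.mp3"],
                            [100.0, 200.0, 50.0], 123))
```

The resulting text could be written to a temporary file and passed to ffmpeg with `-f concat -safe 0 -i list.txt` in place of the single `-i {source}` input, keeping the rest of the MP3 encoding options unchanged.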
@bauermarkus commented on GitHub (Feb 2, 2026):
+1