mirror of
https://github.com/louislam/dockge.git
synced 2026-03-03 02:06:55 -05:00
Upon restart some stacks never start #169
Originally created by @liquidfrollo on GitHub (Oct 23, 2024).
Description
When the host is restarted (TrueNAS SCALE running jlmkr / Dockge), some stacks start but others show as exited or only partially up. If I go into the stacks manually and click Start, they start without issue.
👟 Reproduction steps
Restart the host
👀 Expected behavior
All stacks should start without issue.
😓 Actual Behavior
Some stacks do not start. My speculation is that the Docker socket exposed through the Traefik stack is not yet available when other stacks that require it start, and there is no way to declare "depends on" between stacks.
Dockge Version
1.4.2
💻 Operating System and Arch
Truenas Scale (24.0.4.2.3) / Jlmkr running - Debian release 12 codename bookworm
🌐 Browser
Firefox - most current
🐋 Docker Version
20.10.24+dfsg1
🟩 NodeJS Version
No response
📝 Relevant log output
No response
@wsw70 commented on GitHub (Oct 26, 2024):
I am not sure I understand: the docker socket is provided by docker (and the OS) and Traefik just makes use of it.
@louislam commented on GitHub (Oct 26, 2024):
This might be the wrong-status bug, which I still don't know how to reproduce 100% of the time. The stacks may actually be up.
@liquidfrollo commented on GitHub (Oct 27, 2024):
The apps themselves are NOT up. Take the *arr stack above, where some containers show as started and some do not. The Overseerr app is up, but Sonarr / Radarr are not and return a 404 when I try to access them. I wish I could at least rank or prioritize the start order of the compose files (assuming I can't depend on an app in a different stack) to buy time for picky stacks such as my *arr stack.
Regarding the Docker socket: from a security perspective, it is not exposed directly but through the socket-proxy service in the Traefik stack. I tried making other stacks depend on the status of another, but that doesn't work because one compose file can't depend on another (from what I gathered).
@ghost commented on GitHub (Nov 2, 2024):
I'm having the same issue, but I am new to Docker so it may be that.
I have linkding, readeck and uptime-kuma set up and working.
However, readeck doesn't start, or perhaps it does start and then fails.
If I start it manually it works perfectly.
@liquidfrollo commented on GitHub (Nov 4, 2024):
Same for me: if I restart it manually, it works. It just never fully loads on a server restart.
@ghost commented on GitHub (Nov 4, 2024):
I created a new Proxmox LXC and installed Docker, just the command-line version this time.
Copied the data directory across.
This starts Readeck on Host restart every time so far.
@liquidfrollo commented on GitHub (Nov 6, 2024):
Glad your issue is resolved; however, mine is not, and I have no indication why those containers won't start without manual intervention.
@InnocentRain commented on GitHub (Dec 16, 2024):
I pretty much have the same issue, some containers don't start after a reboot, but only about 20% of the time.
@N0rga commented on GitHub (Dec 27, 2024):
I'm having the same issue as @liquidfrollo. Any containers within Dockge don't automatically start when my server is rebooted. They're all in individual stacks as well, with ARR apps pointing back to Gluetun for VPN/Network. (Understand that this is probably not the ideal configuration)
@liquidfrollo commented on GitHub (Jan 15, 2025):
I'm wondering if it is quietly failing because there is no health check / dependency ability between stacks? Just speculating, as I'm unsure why it wouldn't just come up as healthy. I'm also curious whether setting a priority for the compose start order would resolve it. For instance, if I start and wait for the Traefik stack and the Authentik stack (reverse proxy + socket security service, and authentication service), would everything else start without issue?
@N0rga commented on GitHub (Jan 15, 2025):
Yeah, it seemed to me that when the container was destroyed and then recreated, it was given another ID or something. I also had this when certain stacks were updated through Dockge.
As they were using the Gluetun stack as the network mode, they for some reason couldn't find it any more, and every attempt to start them failed because container "whole string of numbers and letters" could not be found or doesn't exist.
I had the issue before and couldn't fix it that time; I had to recreate the stack, and then it worked with no problem.
I fixed my issues by putting Gluetun and all of the *arrs into a single stack and using the depends_on and healthcheck options to make all of the *arrs wait until the Gluetun service was healthy before starting.
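For illustration, the single-stack approach described above might look roughly like the sketch below. It is not N0rga's actual file: the service names, image tags, and the healthcheck command are assumptions and would need adjusting to the real setup.

```yaml
# Sketch only: names, images, and the healthcheck command are assumptions.
services:
  gluetun:
    image: qmcgaw/gluetun
    restart: unless-stopped
    cap_add:
      - NET_ADMIN
    healthcheck:
      # Assumed check; replace with whatever proves the VPN tunnel is up.
      test: ["CMD", "/gluetun-entrypoint", "healthcheck"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  sonarr:
    image: lscr.io/linuxserver/sonarr
    restart: unless-stopped
    # Reference gluetun by service name so Compose resolves the current container.
    network_mode: "service:gluetun"
    depends_on:
      gluetun:
        # Wait for the healthcheck to pass, not merely for the container to exist.
        condition: service_healthy
```

One caveat: depends_on is evaluated by Compose when the stack is brought up. Containers that the Docker daemon restarts on boot through their restart policy are not necessarily ordered by it, so this may help with updates and manual starts more than with host reboots.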
@liquidfrollo commented on GitHub (Jan 15, 2025):
Curiously enough, gluetun / qbit is also a stack that doesn't start! Same with Plex, which doesn't have a dependency on any of them.
@DomiiBunn commented on GitHub (Jan 27, 2025):
Hia all,
Kinda late to the party.
What is the restart_policy on all of the offending container stacks?
Dockge does not auto-restart the containers; Docker itself does, depending on the selected option.
Start containers automatically
By default Docker won't auto-restart containers. You can change this behaviour by setting the restart policy on all containers to Unless Stopped or Always.
Be sure to set the option on all the containers in the stack.
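As a minimal sketch of that advice, a compose file with the policy set on every service could look like this (the service names and images are placeholders, not taken from anyone's stack in this thread):

```yaml
# Placeholder services; only the restart lines matter here.
services:
  app:
    image: example/app:latest   # hypothetical image
    # Restarted by the Docker daemon after a reboot unless it was stopped manually.
    restart: unless-stopped
  db:
    image: example/db:latest    # hypothetical image
    # Restarted by the Docker daemon after a reboot even if it was stopped manually.
    restart: always
```

The policy is per service, so a stack where only one service sets it will come back only partially after a reboot. The Docker daemon itself also has to be started at boot for either policy to take effect.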
@liquidfrollo commented on GitHub (Jan 27, 2025):
All offending stacks have "restart: unless-stopped" set on them yet still do not start. Thanks for the double-check!
@DomiiBunn commented on GitHub (Jan 27, 2025):
In that case could you please paste your compose file?
Still, the issue wouldn't likely be with Dockge, but it's worth a poke.
@liquidfrollo commented on GitHub (Jan 27, 2025):
Here is one of them. There are 4 that fail to start with the same behavior. If I manually click Start, they all work. Note: I attempted to use a code block, but it removes all new lines, so it is hard to read.
version: "3.8"
services:
  plex:
    image: plexinc/pms-docker:plexpass
    restart: unless-stopped
    container_name: plexms
    ports:
      - 32400:32400/tcp
      - 3005:3005/tcp
      - 8324:8324/tcp
      - 32469:32459/tcp
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Denver
      - PLEX_CLAIM=${PLEX_CLAIM}
      - HOSTNAME="XXXX"
    volumes:
      - ./config:/config
      - ./transcodes:/transcode
      - ${NAS_DIR}/Anime:/Anime
      - ${NAS_DIR}/TVShows:/TVShows
      - ${NAS_DIR}/Videos:/Videos
      - ${NAS_DIR}/Audiobooks:/Audiobooks
    networks:
      - proxy
    labels:
      - traefik.enable=true
      - traefik.http.routers.plex.entryPoints=https
      - traefik.http.routers.plex.rule=Host(`XXX`) || HostRegexp(`{subdomain:[A-Za-z0-9](?:[A-Za-z0-9\-]{0,61}[A-Za-z0-9])?}XXXX`) && PathPrefix(`/outpost.goauthentik.io/`)
      - traefik.http.routers.plex.tls.certresolver=cloudflare
      - traefik.http.services.plex.loadbalancer.server.port=32400
      - traefik.frontend.headers.SSLRedirect=true
      - traefik.frontend.headers.STSSeconds=315360000
      - traefik.frontend.headers.browserXSSFilter=true
      - traefik.frontend.headers.contentTypeNosniff=true
      - traefik.frontend.headers.forceSTSHeader=true
      - traefik.frontend.headers.SSLHost=XXXX
      - traefik.frontend.headers.STSIncludeSubdomains=true
      - traefik.frontend.headers.STSPreload=true
      - traefik.frontend.headers.frameDeny=true
networks:
  proxy:
    external: true
@InnocentRain commented on GitHub (Jan 31, 2025):
Here's one of mine that gives me the most trouble, most of the time it works just fine but sometimes one or more containers fail to start:
@DomiiBunn commented on GitHub (Feb 3, 2025):
I cannot reproduce this at all.
I tried on a standalone instance,
on a VM,
on bare metal,
and in a Proxmox CT;
nothing seems to cause said issue.
One last question:
Does the issue happen when you start the stack using docker compose directly instead of via Dockge?
@liquidfrollo commented on GitHub (Feb 3, 2025):
Dockge is installed directly as the only thing on jlmkr on TrueNAS SCALE. I can start all compose files manually without issue. Additionally, I can start them manually via Dockge without issue. For whatever reason, they just won't auto-start when the host comes back up.
@InnocentRain commented on GitHub (Feb 4, 2025):
Same with me, but Dockge is installed on bare-metal Debian.
@DomiiBunn commented on GitHub (Feb 8, 2025):
Unless someone smarter than me says otherwise, this is not Dockge-related.
That is, unless we can get reproducible steps that prove it's Dockge and not the Docker host itself.
@liquidfrollo commented on GitHub (Feb 11, 2025):
DomiiBunn, any suggestions on how I could capture logs or repeatable steps that would show whether it is Dockge or not? It looks like others in this thread using Dockge experience the same issue. It seems to me as if some internal health check isn't being respected. Since I am unaware of any way to control the sequence in which the containers start, or to declare dependencies across stacks, I can't really control the startup. I don't want to combine stacks, because I want to keep core dependencies separate from common apps: for instance, the Docker socket needed by most stacks versus the stack controlling all the *arr apps. I use that as an example of something I would like to do but don't know how to do yet because of the cross-stack dependencies. I would like to use a socket proxy instead of exposing the socket directly, for security, but since I don't have cross-stack dependencies I haven't implemented this yet. If I could, the health checks on the Docker socket being available would effectively delay all other stacks, potentially preventing this issue.
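One possible direction for the cross-stack dependency described above is the Compose `include` element, which pulls another compose file into the current project so that depends_on can reference its services. The sketch below is an assumption-laden illustration, not a tested fix for this issue: the relative path, the `socket-proxy` service name, and the image are hypothetical, and `include` needs a recent Docker Compose (2.20 or newer, to the best of my knowledge).

```yaml
# compose.yaml of the dependent stack; the path and service names are hypothetical.
include:
  # Pulls the services of the other file (including socket-proxy) into this project.
  - ../traefik/compose.yaml

services:
  sonarr:
    image: lscr.io/linuxserver/sonarr
    restart: unless-stopped
    depends_on:
      socket-proxy:
        # Requires socket-proxy to define a healthcheck in the included file.
        condition: service_healthy
```

The trade-off is that included services become part of this project, which effectively merges the stacks at run time and may conflict with running the Traefik stack as its own project in Dockge. It also shares the limitation that depends_on applies when Compose brings the project up, not when the daemon restarts containers after a reboot.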
@Briantist9 commented on GitHub (Apr 16, 2025):
Total Docker noob here, but I was able to resolve the issue of gluetun and qbit not auto-starting in Dockge running on TrueNAS 25.04.0.
I'm not sure exactly what did it. I stopped and deactivated the two containers and, in edit mode, moved the "restart: unless-stopped" line higher up in the compose.yaml. I put it before "volumes:".
I started the containers up again and then rebooted TrueNAS, and everything started up automatically.
@InnocentRain commented on GitHub (Apr 17, 2025):
I have mine in third place, right after image and container_name.
@liquidfrollo commented on GitHub (Apr 17, 2025):
I think my issue is largely resolved after I force-updated the OS AND Docker. Running apt-get update && apt-get upgrade would not upgrade the Docker version; the Ubuntu release on my container host could not see a newer version. I had to manually add a repo and force the upgrade of Docker. Once that happened, the issue largely went away. When it does occur, it seems to be more of a timing issue. That said, either my host hasn't restarted (unlikely) or this issue has largely gone away. I noticed this after I resolved another issue with KASM, caused by OCI runtime errors and privileged mode, which was fixed by upgrading to a newer version of Docker. I have been running without issues for a couple of months now.