Mirror of https://github.com/louislam/uptime-kuma.git, synced 2026-03-02 22:57:00 -05:00
Unable to start monitor with KnexTimeoutError: Knex: Timeout acquiring a connection. #2872
Originally created by @lnagel on GitHub (Dec 2, 2023).
Description
Startup crash with Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
👟 Reproduction steps
Start container with Uptime Kuma, log in and wait for data to load on the dashboard. Check logs for errors.
👀 Expected behavior
The dashboard would load, monitor checks would run, and the log would not be full of errors.
😓 Actual Behavior
Dashboards do not load any data, monitor checks are not being run, and the log is full of errors.
🐻 Uptime-Kuma Version
1.23.7
💻 Operating System and Arch
MikroTik RouterOS 7.11.2
🌐 Browser
Firefox latest
🐋 Docker Version
No response
🟩 NodeJS Version
18.18.2
📝 Relevant log output
@chakflying commented on GitHub (Dec 2, 2023):
Considering you are using a router, it's likely you are using USB-attached storage? It's likely that the database has grown so large that the increased I/O latency is too much for the application to handle.
@lnagel commented on GitHub (Dec 2, 2023):
It was running on USB-attached storage, but it's a fast drive (DataTraveler Max USB 3.2 Gen 2), and it wasn't even utilizing the disk much.
I moved the container with its data folder to an 8th-gen Intel NUC running the latest Docker, with the Docker filesystem on a dedicated NVMe drive. The issue unfortunately persists.
Is there any advice on how to reset the app's accumulated data history without losing all 89 configured monitors? I am mostly using HTTP, PING and MQTT checks.
@chakflying commented on GitHub (Dec 2, 2023):
That's very strange.
If you have worked with databases before, you can stop the application and open the database file `data.db` in any SQLite-compatible application (DBeaver, Beekeeper, etc.). You can then manually delete old rows in the `heartbeat` table.
@ilogus commented on GitHub (Dec 3, 2023):
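A minimal sketch of that manual cleanup, assuming the 1.x SQLite layout in which the `heartbeat` table has a `time` datetime column (stop Uptime Kuma first, and verify the table/column names against your own `data.db` before running anything):

```python
import sqlite3

def prune_heartbeats(db_path: str, keep_days: int = 7) -> int:
    """Delete heartbeat rows older than keep_days, then reclaim disk space."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "DELETE FROM heartbeat WHERE time < datetime('now', ?)",
            (f"-{keep_days} days",),
        )
        conn.commit()
        # VACUUM rewrites the file so the freed pages actually shrink it on disk.
        conn.execute("VACUUM")
        return cur.rowcount
    finally:
        conn.close()
```

The `DELETE`/`VACUUM` pair has the same effect as doing it by hand in a GUI client; the return value is the number of rows removed.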
Hello, I seem to have a similar problem.
This morning, without having received any particular notification, I saw that all the monitors had a problem during the night. In the Docker logs, the line
`Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?`
is present on all monitors.
I'm on a VPS running AlmaLinux 8.9 (Midnight Oncilla) with a 4.18.0-513.5.1.el8_9.x86_64 kernel.
Uptime Kuma version: 1.23.8, Docker version 24.0.7, build afdd53b.
Regards,
@lnagel commented on GitHub (Dec 3, 2023):
Thanks. I changed "Keep monitor history data" from 180 days to 7 days, which reduced the size of the database from 297M down to 252K. Still, I installed it only about 3 weeks ago at most.
At least uptime-kuma will start up now, with no errors in the logs so far. Let's see...
@gering commented on GitHub (Dec 30, 2023):
I see a very similar error, running in docker on my Synology NAS:
@sethvoltz commented on GitHub (Feb 12, 2024):
Piling on: I have also been seeing this issue for over a year when I am moving around the app a lot (looking across monitors, editing the dashboards), but it usually self-resolves after a couple of minutes of letting things catch up. Today it has not resolved, which led me to this issue.
Edit 1:
Adding current DB size stats via the backups before clearing out the heartbeat table as suggested by @lnagel
Edit 2:
The truncate was taking longer than expected, so I killed it and ran
`select count(*) from heartbeat;`
and there were still 3.23M records in the table. I have kicked it off again, but that does seem awfully high for 12 monitors with 2-5m checks.
Edit 3:
Yep, the truncate worked and everything about the app is snappy again. Post-`vacuum` the table is also much more reasonable:
@chakflying commented on GitHub (Feb 12, 2024):
We implemented incremental_vacuum in 1.23 which should have mitigated this issue. Can you check if you are running the latest version, and if so, are there items in the logs that indicate the incremental_vacuum task has failed?
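For reference, incremental vacuum is a stock SQLite mechanism, and its effect can be exercised directly; this is a generic sketch of the pragmas involved, not Uptime Kuma's actual maintenance code:

```python
import sqlite3

def run_incremental_vacuum(db_path: str, pages: int = 100) -> int:
    """Release up to `pages` free pages back to the OS; returns pages freed.

    Only has an effect if the database was created (or fully VACUUMed)
    with `PRAGMA auto_vacuum = INCREMENTAL`; otherwise it is a no-op.
    """
    conn = sqlite3.connect(db_path)
    try:
        before = conn.execute("PRAGMA freelist_count").fetchone()[0]
        conn.execute(f"PRAGMA incremental_vacuum({int(pages)})")
        after = conn.execute("PRAGMA freelist_count").fetchone()[0]
        return before - after
    finally:
        conn.close()
```

If `auto_vacuum` was never set, `freelist_count` can keep growing after large deletes while the file never shrinks, which is the failure mode being asked about here.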
@sethvoltz commented on GitHub (Feb 12, 2024):
@chakflying just confirmed, 1.23.11. I have docker image update monitors, so I would have updated within days to a week from release.
@sethvoltz commented on GitHub (Feb 12, 2024):
I think the problem is less about `vacuum` and more that the heartbeat table was huge, at >3 million records for a small number of monitors.
@CommanderStorm commented on GitHub (Feb 12, 2024):
What is your retention time in the settings set to?
@sethvoltz commented on GitHub (Feb 12, 2024):
@CommanderStorm Took me a few minutes to find that, as I'd never changed it. It was set to the (I assume) default of 180 days. I set it to 14 days just now to hopefully avoid this issue again.
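As a rough sanity check on the row counts discussed in this thread (the monitor count, check interval, and retention figures come from the comments above; the arithmetic is illustrative, not anything Uptime Kuma computes):

```python
def expected_heartbeats(monitors: int, interval_seconds: int, days: int) -> int:
    """One heartbeat row per check per monitor over the retention window."""
    checks_per_day = 86400 // interval_seconds
    return monitors * checks_per_day * days

# 12 monitors on 2-minute checks under the default 180-day retention:
print(expected_heartbeats(12, 120, 180))  # → 1555200
```

The observed 3.23M rows is roughly double that, which may simply reflect faster checks on some monitors, or may suggest retention cleanup had not been keeping up with roughly a year of history.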
@CommanderStorm commented on GitHub (Feb 12, 2024):
Okay. I am assuming your issue is the same, @lnagel.
We know that lowering retention is not a good long-term solution, but for `1.23.X` that is everything we can offer as a quick "remedy".
A lot of performance improvements (using aggregated vs. non-aggregated tables to store heartbeats, enabling users to choose MariaDB as a db-backend, pagination of important events) have been made in `v2.0` (our next release), resolving™️ this problem area.
=> I'm going to close this issue.
You can subscribe to our releases and get notified when a new release (such as `v2.0-beta.0`) gets made. See https://github.com/louislam/uptime-kuma/pull/4171 for the bugs that need addressing before that can happen.
Meanwhile (the issue is with SQLite not reading data fast enough to keep up):