Looping Crash After Deleting Monitor #2627

Closed
opened 2026-02-28 03:01:27 -05:00 by deekerman · 8 comments

Originally created by @jjschwarz on GitHub (Sep 24, 2023).

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find a similar issue

🛡️ Security Policy

Description

I removed a single monitor from the system via the Delete button in the UI. After a few minutes I started getting WebSocket timeouts, and I saw the Kubernetes pods were restarting in a loop.

👟 Reproduction steps

Delete monitor and wait

👀 Expected behavior

Monitor disappears and system continues to function

😓 Actual Behavior

System crashes in repeated loop upon start

🐻 Uptime-Kuma Version

1.22.1

💻 Operating System and Arch

Docker image from docker.io/louislam/uptime-kuma

🌐 Browser

Firefox 117.0.1

🐋 Docker Version

K8S v1.26.7

🟩 NodeJS Version

16

📝 Relevant log output

==> Performing startup jobs and maintenance tasks
==> Starting application with user 0 group 0
Welcome to Uptime Kuma
Your Node.js version: 16
2023-09-24T08:37:45Z [SERVER] INFO: Welcome to Uptime Kuma
2023-09-24T08:37:45Z [SERVER] INFO: Node Env: production
2023-09-24T08:37:45Z [SERVER] INFO: Importing Node libraries
2023-09-24T08:37:45Z [SERVER] INFO: Importing 3rd-party libraries
2023-09-24T08:37:46Z [SERVER] INFO: Creating express and socket.io instance
2023-09-24T08:37:46Z [SERVER] INFO: Server Type: HTTP
2023-09-24T08:37:46Z [SERVER] INFO: Importing this project modules
2023-09-24T08:37:46Z [NOTIFICATION] INFO: Prepare Notification Providers
2023-09-24T08:37:46Z [SERVER] INFO: Version: 1.22.1
2023-09-24T08:37:46Z [DB] INFO: Data Dir: ./data/
2023-09-24T08:37:46Z [SERVER] INFO: Connecting to the Database
2023-09-24T08:37:47Z [DB] INFO: SQLite config:
[ { journal_mode: 'wal' } ]
[ { cache_size: -12000 } ]
2023-09-24T08:37:47Z [DB] INFO: SQLite Version: 3.41.1
2023-09-24T08:37:47Z [SERVER] INFO: Connected
2023-09-24T08:37:47Z [DB] INFO: Your database version: 10
2023-09-24T08:37:47Z [DB] INFO: Latest database version: 10
2023-09-24T08:37:47Z [DB] INFO: Database patch not needed
2023-09-24T08:37:47Z [DB] INFO: Database Patch 2.0 Process
2023-09-24T08:39:44Z [SERVER] INFO: Shutdown requested
2023-09-24T08:39:44Z [SERVER] INFO: Called signal: SIGTERM
2023-09-24T08:39:44Z [SERVER] INFO: Stopping all monitors
Trace: Error [ERR_SERVER_NOT_RUNNING]: Server is not running.
    at new NodeError (node:internal/errors:387:5)
    at Server.close (node:net:1804:12)
    at Object.onceWrapper (node:events:627:28)
    at Server.emit (node:events:525:35)
    at emitCloseNT (node:net:1857:8)
    at processTicksAndRejections (node:internal/process/task_queues:82:21) {
  code: 'ERR_SERVER_NOT_RUNNING'
}
    at process.<anonymous> (/app/server/server.js:1825:13)
    at process.emit (node:events:513:28)
    at emit (node:internal/process/promises:140:20)
    at processPromiseRejections (node:internal/process/promises:274:27)
    at processTicksAndRejections (node:internal/process/task_queues:97:32)
If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
2023-09-24T08:39:46Z [DB] INFO: Closing the database
deekerman 2026-02-28 03:01:27 -05:00
  • closed this issue
  • added the bug, A:core labels

@jjschwarz commented on GitHub (Sep 24, 2023):

Updated to the v1.23.2 image; essentially the same results:

==> Performing startup jobs and maintenance tasks
==> Starting application with user 0 group 0
Welcome to Uptime Kuma
Your Node.js version: 16.20.2
2023-09-24T08:49:55Z [SERVER] INFO: Welcome to Uptime Kuma
2023-09-24T08:49:55Z [SERVER] INFO: Node Env: production
2023-09-24T08:49:55Z [SERVER] INFO: Inside Container: true
2023-09-24T08:49:55Z [SERVER] INFO: Importing Node libraries
2023-09-24T08:49:55Z [SERVER] INFO: Importing 3rd-party libraries
2023-09-24T08:49:56Z [SERVER] INFO: Creating express and socket.io instance
2023-09-24T08:49:56Z [SERVER] INFO: Server Type: HTTP
2023-09-24T08:49:56Z [SERVER] INFO: Importing this project modules
2023-09-24T08:49:56Z [NOTIFICATION] INFO: Prepare Notification Providers
2023-09-24T08:49:56Z [SERVER] INFO: Version: 1.23.2
2023-09-24T08:49:56Z [DB] INFO: Data Dir: ./data/
2023-09-24T08:49:56Z [SERVER] INFO: Connecting to the Database
2023-09-24T08:50:00Z [DB] INFO: SQLite config:
[ { journal_mode: 'wal' } ]
[ { cache_size: -12000 } ]
2023-09-24T08:50:00Z [DB] INFO: SQLite Version: 3.41.1
2023-09-24T08:50:00Z [SERVER] INFO: Connected
2023-09-24T08:50:00Z [DB] INFO: Your database version: 10
2023-09-24T08:50:00Z [DB] INFO: Latest database version: 10
2023-09-24T08:50:00Z [DB] INFO: Database patch not needed
2023-09-24T08:50:00Z [DB] INFO: Database Patch 2.0 Process
2023-09-24T08:50:00Z [DB] INFO: patch-add-invert-keyword.sql is not patched
2023-09-24T08:50:00Z [DB] INFO: patch-add-invert-keyword.sql is patching
2023-09-24T08:51:54Z [SERVER] INFO: Shutdown requested
2023-09-24T08:51:54Z [SERVER] INFO: Called signal: SIGTERM
Trace: Error [ERR_SERVER_NOT_RUNNING]: Server is not running.
    at new NodeError (node:internal/errors:387:5)
    at Server.close (node:net:1804:12)
    at Object.onceWrapper (node:events:627:28)
    at Server.emit (node:events:525:35)
    at emitCloseNT (node:net:1857:8)
    at processTicksAndRejections (node:internal/process/task_queues:82:21) {
  code: 'ERR_SERVER_NOT_RUNNING'
}
    at process.<anonymous> (/app/server/server.js:1886:13)
    at process.emit (node:events:513:28)
    at emit (node:internal/process/promises:140:20)
    at processPromiseRejections (node:internal/process/promises:274:27)
    at processTicksAndRejections (node:internal/process/task_queues:97:32)
If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues


@jjschwarz commented on GitHub (Sep 24, 2023):

I restored the database from the nightly backup and was back in a working system. You can see in the attached logs where I delete the same monitor (#25)...and then the system falls apart. After a period of time the pod fails. This is the full log from the pod that first fails after the delete. Once that happens we are back in the loop.

How can I fix this so I can delete a monitor? For now I will revert to the last good backup and wait for suggestions.

==> Performing startup jobs and maintenance tasks
==> Starting application with user 0 group 0
Welcome to Uptime Kuma
Your Node.js version: 16.20.2
2023-09-24T09:06:49Z [SERVER] INFO: Welcome to Uptime Kuma
2023-09-24T09:06:49Z [SERVER] INFO: Node Env: production
2023-09-24T09:06:49Z [SERVER] INFO: Inside Container: true
2023-09-24T09:06:49Z [SERVER] INFO: Importing Node libraries
2023-09-24T09:06:49Z [SERVER] INFO: Importing 3rd-party libraries
2023-09-24T09:06:50Z [SERVER] INFO: Creating express and socket.io instance
2023-09-24T09:06:50Z [SERVER] INFO: Server Type: HTTP
2023-09-24T09:06:50Z [SERVER] INFO: Importing this project modules
2023-09-24T09:06:50Z [NOTIFICATION] INFO: Prepare Notification Providers
2023-09-24T09:06:50Z [SERVER] INFO: Version: 1.23.2
2023-09-24T09:06:50Z [DB] INFO: Data Dir: ./data/
2023-09-24T09:06:50Z [SERVER] INFO: Connecting to the Database
2023-09-24T09:06:50Z [DB] INFO: SQLite config:
[ { journal_mode: 'wal' } ]
[ { cache_size: -12000 } ]
2023-09-24T09:06:50Z [DB] INFO: SQLite Version: 3.41.1
2023-09-24T09:06:50Z [SERVER] INFO: Connected
2023-09-24T09:06:50Z [DB] INFO: Your database version: 10
2023-09-24T09:06:50Z [DB] INFO: Latest database version: 10
2023-09-24T09:06:50Z [DB] INFO: Database patch not needed
2023-09-24T09:06:50Z [DB] INFO: Database Patch 2.0 Process
2023-09-24T09:06:50Z [DB] INFO: patch-add-invert-keyword.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-add-invert-keyword.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-add-invert-keyword.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: patch-added-json-query.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-added-json-query.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-added-json-query.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: patch-added-kafka-producer.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-added-kafka-producer.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-added-kafka-producer.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: patch-add-certificate-expiry-status-page.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-add-certificate-expiry-status-page.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-add-certificate-expiry-status-page.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: patch-monitor-oauth-cc.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-monitor-oauth-cc.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-monitor-oauth-cc.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: patch-add-timeout-monitor.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-add-timeout-monitor.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-add-timeout-monitor.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: patch-add-gamedig-given-port.sql is not patched
2023-09-24T09:06:50Z [DB] INFO: patch-add-gamedig-given-port.sql is patching
2023-09-24T09:06:50Z [DB] INFO: patch-add-gamedig-given-port.sql was patched successfully
2023-09-24T09:06:50Z [DB] INFO: Database Patched Successfully
2023-09-24T09:06:50Z [SERVER] INFO: Load JWT secret from database.
2023-09-24T05:06:50-04:00 [SERVER] INFO: Adding route
2023-09-24T05:06:50-04:00 [SERVER] INFO: Adding socket handler
2023-09-24T05:06:50-04:00 [SERVER] INFO: Init the server
2023-09-24T05:06:50-04:00 [SERVER] INFO: Listening on 3001
2023-09-24T05:06:50-04:00 [SERVICES] INFO: Starting nscd
2023-09-24T05:07:06-04:00 [AUTH] INFO: Login by token. IP=172.16.121.65
2023-09-24T05:07:06-04:00 [AUTH] INFO: Username from JWT: admin
2023-09-24T05:07:06-04:00 [AUTH] INFO: Successfully logged in user admin. IP=172.16.121.65
2023-09-24T05:07:06-04:00 [MONITOR] WARN: Monitor #25 'Expense System K8S Production': Failing: connect ECONNREFUSED 192.168.1.112:30981 | Interval: 60 seconds | Type: http | Down Count: 0 | Resend Interval: 0
2023-09-24T05:07:32-04:00 [AUTH] INFO: Login by token. IP=172.16.121.65
2023-09-24T05:07:32-04:00 [AUTH] INFO: Username from JWT: admin
2023-09-24T05:07:32-04:00 [AUTH] INFO: Successfully logged in user admin. IP=172.16.121.65
2023-09-24T05:07:46-04:00 [AUTH] INFO: Login by token. IP=172.16.121.65
2023-09-24T05:07:46-04:00 [AUTH] INFO: Username from JWT: admin
2023-09-24T05:07:46-04:00 [AUTH] INFO: Successfully logged in user admin. IP=172.16.121.65
2023-09-24T05:08:01-04:00 [MANAGE] INFO: Delete Monitor: 25 User ID: 1
2023-09-24T05:08:32-04:00 [AUTH] INFO: Login by token. IP=172.16.121.65
2023-09-24T05:08:32-04:00 [AUTH] INFO: Username from JWT: admin
2023-09-24T05:10:04-04:00 [MONITOR] ERROR: Caught error
2023-09-24T05:10:04-04:00 [MONITOR] ERROR: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
2023-09-24T05:10:32-04:00 [AUTH] ERROR: Invalid token. IP=172.16.121.65
2023-09-24T05:10:32-04:00 [MONITOR] WARN: Monitor #73 'WebDocs K8S Production': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: http
2023-09-24T05:10:32-04:00 [MONITOR] WARN: Monitor #74 'MobileIron Sentry Admin Portal': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: http
2023-09-24T05:10:33-04:00 [MONITOR] WARN: Monitor #75 'MobileIron VSP': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: http
2023-09-24T05:10:33-04:00 [MONITOR] WARN: Monitor #76 'Avaya CM1 Management SSH': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: port
2023-09-24T05:10:34-04:00 [MONITOR] WARN: Monitor #77 'Avaya CM2 Management SSH': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: port
2023-09-24T05:10:35-04:00 [MONITOR] WARN: Monitor #78 'Workwave API Server': Pending: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call? | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: http


@chakflying commented on GitHub (Sep 24, 2023):

Deleting a monitor can take a long time if the database has been in use for a while and/or the retention period is long, during which the server may be unresponsive. I'm guessing your K8s config has a periodic health check, and when the health check failed, K8s sent SIGTERM to shut down the server.

I think your options are to disable the health check, delete the monitor using the UI, and hope that it completes after a while;

or alternatively, shut down the server, open the SQLite database with an external tool, and delete the monitor manually from there. Remember to check that foreign_keys is turned on if you do so.
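For illustration, a minimal sketch of that manual route using Python's built-in sqlite3 module. The schema here is a stand-in (the real tables live in the database under ./data/ and should be verified first); the point is that with `foreign_keys = ON`, deleting the monitor row cascades to its dependent rows:

```python
import sqlite3

# Stand-in schema mimicking a monitor table with cascading heartbeats.
# Against the real database you would connect to the kuma.db file instead,
# with the server stopped.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE
con.execute("CREATE TABLE monitor (id INTEGER PRIMARY KEY)")
con.execute(
    "CREATE TABLE heartbeat (id INTEGER PRIMARY KEY, "
    "monitor_id INTEGER REFERENCES monitor(id) ON DELETE CASCADE)"
)
con.execute("INSERT INTO monitor VALUES (25)")
con.execute("INSERT INTO heartbeat VALUES (1, 25)")

with con:  # commit the delete as one transaction
    con.execute("DELETE FROM monitor WHERE id = ?", (25,))
# heartbeat rows referencing monitor 25 are now gone as well
```

If foreign_keys is left off (SQLite's default per connection), the monitor row would be deleted but its heartbeats would be orphaned, which is why the check matters.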


@jjschwarz commented on GitHub (Sep 24, 2023):

I will try to remove the health check and test the delete. Any suggested health check limits?


@chakflying commented on GitHub (Sep 24, 2023):

I don't think anyone has done any serious benchmarking on how long it takes, so do report back if you succeed since it would be a great data point!

Of course it would be best to restore the original health check once you are done deleting monitors.
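For reference, a relaxed Kubernetes liveness probe along these lines would give a long-running delete more headroom before a SIGTERM; the numbers are purely illustrative, not tested recommendations:

```yaml
# Illustrative probe for the uptime-kuma container; tune to your deployment.
livenessProbe:
  httpGet:
    path: /
    port: 3001
  periodSeconds: 90     # report above used 30s, later widened to 90s
  timeoutSeconds: 10
  failureThreshold: 5   # several consecutive failures before a restart
```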


@louislam commented on GitHub (Sep 24, 2023):

Duplicate of #2781

And unfortunately, I think it can only be solved by using MariaDB/MySQL in the upcoming 2.0.0.


@jjschwarz commented on GitHub (Sep 25, 2023):

I removed the health check, but it never seemed to recover. I waited about 15 minutes and then restarted it. This time the restart was clean, so my guess is the delete completed but the app crashed in the meantime. I had 30-second health checks, which we expanded to 90 seconds. Interestingly, I stopped getting errors in the logs after 2 minutes or so, but the system was still up (non-responsive).

uptime-kuma.log


@chakflying commented on GitHub (Sep 25, 2023):

Looking at the logs made me realize that all reads are blocked when writing. This contradicts my understanding that in SQLite WAL mode, writers should not block readers. I'm assuming it's because we are only using a single connection across the entire application.

@louislam Do you remember what the issue was with using multiple connections in the SQLite connection pool config? If there was some memory leak, is it worth investigating to see if we can work around it?

Alternatively, I think I can modify the deleteMonitor function to delete old heartbeats first in fixed-size batches, until all heartbeats are cleared, and then delete the monitor. This should prevent one huge cascading delete from blocking for too long.
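The batched approach could be sketched roughly like this (not Uptime Kuma's actual code; table names, the batch size, and the helper name are all hypothetical). Each batch commits in its own transaction, so the write lock is released between batches and readers get a chance to run:

```python
import sqlite3

BATCH = 1000  # rows deleted per transaction; tuning value is an assumption

def delete_monitor(con, monitor_id):
    """Delete a monitor's heartbeats in small batches, then the monitor row."""
    while True:
        with con:  # each batch is its own transaction, releasing the lock
            cur = con.execute(
                "DELETE FROM heartbeat WHERE id IN "
                "(SELECT id FROM heartbeat WHERE monitor_id = ? LIMIT ?)",
                (monitor_id, BATCH),
            )
        if cur.rowcount < BATCH:  # last, partial batch: everything is gone
            break
    with con:
        con.execute("DELETE FROM monitor WHERE id = ?", (monitor_id,))

# Demo against a throwaway in-memory schema.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE monitor (id INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE heartbeat (id INTEGER PRIMARY KEY, monitor_id INTEGER)")
con.execute("INSERT INTO monitor VALUES (25)")
con.executemany("INSERT INTO heartbeat (monitor_id) VALUES (?)", [(25,)] * 2500)
delete_monitor(con, 25)
```

The same shape would apply with Knex in the Node.js server: loop a `limit`ed subquery delete until fewer than BATCH rows are affected, committing each iteration.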
