Push Monitor Interval Timing after Maintenance Period (V2) #4230

Open
opened 2026-02-28 03:55:32 -05:00 by deekerman · 6 comments
Owner

Originally created by @zaanposni on GitHub (Jul 30, 2025).

⚠️ Please verify that your question has not already been reported

- [x] I have searched the [existing issues](https://github.com/louislam/uptime-kuma/issues?q=is%3Aissue%20sort%3Acreated-desc%20) and found no similar reports.

🛡️ Security Policy

- [x] I have read and agree to Uptime Kuma's [Security Policy](https://github.com/louislam/uptime-kuma/security/policy).

📝 Describe your problem

Note: This is in v2 but I suspect it's the same in v1

We have scheduled maintenance every week. We also use many push monitors with various intervals (most are 1-3 minutes).
After the maintenance window ends, all HTTP monitors come back online as expected. The push monitors, however, usually go DOWN.

We suspect this is because the internal push monitor timer (the point at which Uptime-Kuma expects the next message) is not reset or paused during maintenance. To be honest, this is counter-intuitive: we expected the configured interval to (re)start once the maintenance ends.

This screenshot shows the Dashboard Messages (top is newer, bottom is older) after a maintenance window ending at 4pm. As you can see, some monitors come back online, but the push monitors do not; they go DOWN after a seemingly random number of seconds. Our service that pings the push monitor is, of course, not aligned with this timing and might not report within those 6 seconds. It may well have pinged during the maintenance window, but those messages get dropped.

![Screenshot of Dashboard Messages after maintenance](https://github.com/user-attachments/assets/3faa36b9-bf61-4b99-87e4-93435844cdfb)

I understand that changing something like this is very hard or maybe even impossible. That's why I opened a question issue and not a bug report. I want to understand the inner workings; maybe there is an Uptime-Kuma workaround or a workaround we have to implement on our service. Maybe our understanding of push monitors is wrong, and this is indeed intended behavior. For now, we have to disable notifications for these monitors because they produce false alarms.
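The behaviour we suspect can be sketched as a toy model. Everything here is hypothetical (`PushMonitor`, `receive_push`, the timestamps) and is not Uptime-Kuma's actual implementation; it only illustrates why dropping pushes during maintenance without pausing the deadline timer yields an immediate DOWN afterwards:

```python
from dataclasses import dataclass

@dataclass
class PushMonitor:
    """Toy model: the monitor is DOWN if no accepted push arrived
    within the last `interval` seconds."""
    interval: float   # configured push interval, in seconds
    last_push: float  # timestamp of the last *accepted* push

    def receive_push(self, now: float, in_maintenance: bool) -> None:
        # Assumption (from the report above): pushes arriving during
        # maintenance are dropped and do not advance the timer.
        if not in_maintenance:
            self.last_push = now

    def status(self, now: float) -> str:
        return "UP" if now - self.last_push <= self.interval else "DOWN"

# 60 s interval; last accepted push at t=90; maintenance from t=100 to t=400.
m = PushMonitor(interval=60, last_push=90)
m.receive_push(now=150, in_maintenance=True)  # dropped during maintenance
m.receive_push(now=210, in_maintenance=True)  # dropped during maintenance
print(m.status(now=401))  # 401 - 90 > 60, so "DOWN" right after maintenance
```

Under this model the monitor is already overdue the instant maintenance ends, no matter how diligently the client pushed during the window.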

📝 Error Message(s) or Log


🐻 Uptime-Kuma Version

Version: 2.0.0-beta.3-nightly-20250704091046

💻 Operating System and Arch

Ubuntu 24 LTS

🌐 Browser

Firefox 136.0

🖥️ Deployment Environment

Version: 2.0.0-beta.3-nightly-20250704091046 with internal mysql

Author
Owner

@CommanderStorm commented on GitHub (Jul 30, 2025):

A maintenance window should not change the timing.
We may drift a bit over time, and at startup we stagger the exact second each monitor starts so they don't all fire at once.

We do the same work every X seconds.
Maintenance just means no notification.

What is true is that push monitors are unfortunately not a great match for this.
A situation like "I expect a push within 2s, maintenance ends in 1s, and I have not received the push" will trigger a notification.
I would need to test how receiving the push affects this.

You can add retries to get around this limitation.
I honestly don't know what other designs might be possible to get around this issue.

Author
Owner

@zaanposni commented on GitHub (Jul 30, 2025):

> We do the same work every X seconds. Maintenance just means no notification.

I would like to agree, and it's the simplest model. But as far as I can tell, maintenance also drops/ignores incoming UP pushes, correct? So that statement does not hold right now.

Adding retries is easy, but it increases the time needed for real alarms to reach us.

Author
Owner

@CommanderStorm commented on GitHub (Jul 30, 2025):

> What is true is that for push monitors this is unfortunately not a great match.
> "I expect a push by 2s, maintenance ends in 1s and I have not received the push" will trigger a notification.
> I would need to test how receiving the push affects this.

As said, I would need to test.

The problem is that, under the current DB model, maintenance is just a status, so I can't store both "up" and "maintenance" at the same time.
Changing this to store UPs during maintenance would be a major and quite risky undertaking, given how little test coverage we have.

Author
Owner

@Elypha commented on GitHub (Aug 11, 2025):

Hi, I had another issue at https://github.com/louislam/uptime-kuma/issues/6053 which might be related.
I didn't mean to claim it was a bug under a help tag, though; thanks for the information, Frank.

My concerns align with the OP's:

> Adding retries is also easy but increases the time needed for real alarms to reach us.

As for the "time window", it sounded to me like a span around the end of each interval, where the health-check report is expected and checked:

```
0 ---- 290 -- 300 -- 310 ---- 590 -- 600 -- 610 ----
       <-   window    ->      <-   window    ->
```

According to the discussion, though, it seems to be a strict span of 0-300 where at least one report means OK. I didn't find this behaviour described in the docs, so please let me know if I have it right.

If that's the case, then as a workaround, increasing the interval makes the effective interval 1-2x the configured one, and I think retries have the same effect.
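The "at least one report per window" reading can be simulated to show why a client pushing at exactly the configured interval occasionally misses a window. This is a sketch of that reading only (the semantics are an assumption from the discussion, not confirmed behaviour), with a jittery client:

```python
import random

def missed_windows(monitor_interval: float, client_period: float,
                   jitter: float, total: float, seed: int = 0) -> int:
    """Count windows [k*I, (k+1)*I) that receive no push, assuming the
    check passes iff at least one push arrived in the window (one
    possible reading of the discussion above, not confirmed)."""
    rng = random.Random(seed)
    pushes, t = [], 0.0
    while t < total:
        t += client_period + rng.uniform(-jitter, jitter)
        pushes.append(t)
    hit = {int(p // monitor_interval) for p in pushes if p < total}
    return sum(1 for k in range(int(total // monitor_interval)) if k not in hit)

# Client pushing every ~300 s (+/-10 s jitter) against a 300 s window:
print(missed_windows(300, 300, 10, 300 * 200))  # often > 0: occasional false DOWN
# Halving the client period bounds the gap below the window length:
print(missed_windows(300, 150, 10, 300 * 200))  # 0
```

With equal periods the push phase random-walks across window boundaries, so some windows get two pushes and their neighbours get none; pushing at half the interval guarantees every window is hit.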

Author
Owner

@CommanderStorm commented on GitHub (Aug 11, 2025):

Feel free to contribute to the docs if you find them unsatisfactory. Technical writing is hard, and none of us work in that field.

You can effectively make the interval whatever you need by changing the base interval and retry options.
You say that the time window you allow is too small for you, but you don't want to increase it.
I don't see how that is possible.

Author
Owner

@Elypha commented on GitHub (Aug 11, 2025):

I'm not sure I understand how the passive mode works; I was just asking about it.

To be specific: if the signal is sent roughly every 300 seconds (so possibly at 297 or 305), what is the best practice / intended design? Should I configure the interval to a larger value to make sure at least one, possibly two, signals are sure to be captured, or leave the interval at exactly 300 (since the UI says `You should call this URL every 300 seconds`) and use retry & retry interval?

Also, does the next interval start after a successful/failed retry, or does it run on its own timeline, separate from the retries?

As for the time window, I asked because I wanted to know whether it refers only to the span around the end of each interval (the case in my diagram) or to the whole span between two interval ends.

Regarding the docs: nothing personal; it's just common to mention what one has tried before opening an issue, especially since this behaviour was never emphasised there.
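One client-side answer to the question above, assuming the "at least one push per window keeps the monitor UP" semantics, is to push at a fraction of the configured interval. The URL token, the 300 s figure, and the helper names here are placeholders for illustration, not a recommendation from the maintainers:

```python
import time
import urllib.request

# Hypothetical placeholders: substitute your real push URL and interval.
PUSH_URL = "https://kuma.example.com/api/push/XXXXXXXX"
MONITOR_INTERVAL_S = 300  # interval configured in Uptime Kuma

def send_period(monitor_interval_s: float, safety_factor: float = 2.0) -> float:
    """How often the client should push so that, even with scheduling
    jitter, at least one push lands inside every interval window."""
    return monitor_interval_s / safety_factor

def heartbeat_loop() -> None:
    while True:
        try:
            urllib.request.urlopen(PUSH_URL, timeout=10)
        except OSError:
            pass  # a genuinely missed push is what the monitor should detect
        time.sleep(send_period(MONITOR_INTERVAL_S))
```

Pushing every `interval / 2` seconds keeps the gap between consecutive pushes well below the window length, at the cost of twice the request volume.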
