Timers that stop executing

mitchell · October 17, 2018, 11:13am

Hi team,

Just a question regarding timers. I have three deployed as part of my business logic. At times some of them will just stop executing while others continue.

For example, see the attached screenshot of my log file from 15 October.

As you can see at the top, the three timers are all firing one after the other. But at 2018-10-15 17:17:36,109 two timers stop for no apparent reason while the third continues.

I only noticed this today, 17 October, when checking the log files. So that is two days during which business logic was not executing without my being aware of it.

Moreover, when I log into the console to check the timers today, 17 October, all three are turned on with the switch, so to me they should all be working. I have resolved the issue by turning all three timers off and on again using the switch in the console.

So I have a few questions:

Why do timers stop executing with no warning and still show as active in the console?
Is there an automated way we can be notified that a timer is not working? (This is separate from the notification that tells us a particular timer run has exceeded the execution time. It would signal that the whole timer has stopped, not just that one execution was terminated.)
What can I do to ensure my business logic is always executing?

Best regards,
Mitch

sergey-chupov · October 17, 2018, 1:19pm

Hi Mitchell,

Timers can stop executing only in exceptional situations on our side, such as a storage failure or a bug in our scheduler’s code that prevents the timer from being scheduled for its next execution. We regularly validate whether timers are running as expected and restart them automatically if there are any problems. Apparently there was a problem with this part and your timers weren’t restarted. We’re going to revalidate the logic and take the necessary steps to prevent this from happening again.
This partly answers your question regarding notifications: we don’t send any, and there is no point in such notifications, since we are able to restart timers on our side when we detect such a failure. The problem can only occur in the “detection” part.

Still, in Backendless Online there are no strong guarantees that situations like this won’t happen, so if you need such guarantees you’ll have to set up additional checks on your side. There are many possible options for what these checks could be. Depending on what your timer does, you can periodically check the result of its work. Or you can have an additional timer which sends you a daily notification like “I’m still working”. There’s no single best option that we could advise here, only the ones that fit your needs.
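
A minimal sketch of the first kind of check (periodically checking the result of a timer's work), assuming each timer records the time of its last run somewhere you can read it, such as a log line or a table row. The timer names, the five-minute interval, and the `find_stale_timers` helper are hypothetical and not part of any Backendless API; only the timestamp format is taken from the log line quoted earlier in this thread.

```python
from datetime import datetime, timedelta

# Timestamp format as seen in the log excerpt, e.g. "2018-10-15 17:17:36,109".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(text):
    """Parse a log timestamp string into a datetime."""
    return datetime.strptime(text, LOG_TIME_FORMAT)


def find_stale_timers(last_runs, expected_intervals, now, grace=2.0):
    """Return the sorted names of timers that look like they have stopped.

    A timer counts as stale if it has never been seen, or if its last run
    is older than `grace` times its expected interval.
    """
    stale = []
    for name, interval in expected_intervals.items():
        last = last_runs.get(name)
        if last is None or now - last > interval * grace:
            stale.append(name)
    return sorted(stale)


# Hypothetical timers, each assumed to fire every five minutes.
expected_intervals = {
    "timerA": timedelta(minutes=5),
    "timerB": timedelta(minutes=5),
    "timerC": timedelta(minutes=5),
}

# Two timers were last seen on 15 October; the third ran recently.
last_runs = {
    "timerA": parse_log_time("2018-10-15 17:17:36,109"),
    "timerB": parse_log_time("2018-10-15 17:17:36,109"),
    "timerC": parse_log_time("2018-10-17 11:10:00,000"),
}

now = parse_log_time("2018-10-17 11:13:00,000")
stale_timers = find_stale_timers(last_runs, expected_intervals, now)
print(stale_timers)
```

For the check to be useful it has to run somewhere that does not depend on the timers being monitored, for example a scheduled job on a separate machine that emails the list of stale timers when it is not empty.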

In Managed Backendless we could probably provide more guarantees in an SLA, but that’s a topic to be discussed separately. The reason is simply that it is less prone to the occasional failures or service downtime caused by many developers using a shared environment in their own way, and there are also more possibilities to set up additional monitoring dedicated to your app only.

mitchell · November 5, 2018, 10:13am

Hi Sergey,

Thanks for the detailed information, I appreciate it.