Service outage 25th August 2018

mitchell · August 25, 2018, 7:27am

Hi,

Between approximately 6:30 and 7:15 on 25/8/18 there was a service outage with Backendless.

My app was unable to access the server via the APIs, and I was not able to access the Backendless console during the outage.

My loggers also reflect the outage: no timers were running during this time, which means I am now missing some data.

I’d like to know the reason for the outage and any action taken by you guys to make sure it doesn’t happen again.

Many thanks
Mitch

mark-piller · August 25, 2018, 2:11pm

Hi Mitch,

The outage was caused by a hardware failure. Due to the failure, the hypervisor stopped connectivity for the VMs running on the troubled machine. The machine was taken out of the cluster by our DevOps team.

One of the VMs on the server was critical for the overall operation of the cluster, thus impacting its availability. We will be making changes in the product to allow dynamic rerouting to any available replica.
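
As a rough illustration of what rerouting to any available replica could look like, here is a minimal sketch. It is not Backendless's actual implementation: the replica names, the `pick_available_replica` helper, and the health check are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def pick_available_replica(
    replicas: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first replica that passes the health check, or None."""
    for replica in replicas:
        try:
            if is_healthy(replica):
                return replica
        except Exception:
            # A health check that raises is treated like an unhealthy replica.
            continue
    return None


# Hypothetical replica names; "vm-a" stands in for the VM on the failed machine.
replicas = ["vm-a", "vm-b", "vm-c"]
down = {"vm-a"}
target = pick_available_replica(replicas, lambda r: r not in down)
print(target)
```

With a check like this in the request path, losing a single VM means traffic moves to the next healthy replica instead of the whole cluster becoming unavailable.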

I apologise for the inconvenience this has caused.

Regards,
Mark

mitchell · September 14, 2018, 4:30am

Hey Mark,

Thanks for the detailed reply, I appreciate your transparency.

Best,
Mitch