Ensuring a good RTO (Recovery Time Objective)

RTO stands for Recovery Time Objective. It is the target for how long it may take to recover your systems from an outage and restore them to working order.

The recovery time depends on various factors, such as:

  • The system itself

Different systems have different requirements for restoring them.

  • How critical the system is for the business

The RTO for your business website would be different from that of an intranet site, for example.

  • The type of outage

The outage could originate in your own infrastructure or software, or it could come from the service provider, or from the power company if you are self-hosted.

For example, if you are self-hosted, you might require a UPS and backup generators, along with dual or even triple internet connectivity.

In the case of cloud infrastructure, you might opt to deploy in different geographical regions or, more drastically, to use different service providers.

Based on the required RTO, you can put measures in place in advance to meet the target. It is also important to test your recovery, to understand whether the target RTO is even achievable.

One of the most valuable tips is to keep your DNS TTL small, or at least manageable. This matters if you need to update DNS records during a recovery and cannot afford to wait long for the change to take effect. In an extreme situation, if the TTL had been set to 24 hours, resolvers could keep serving the old record for up to a day, and an RTO of 1 hour would not be achievable.
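The arithmetic behind the TTL point can be sketched in a few lines of Python. This is only an illustration: the function names and the detection and repair times are assumptions made up for the example, not figures from any real system. It models the worst case, where a resolver cached the old record just before the change and keeps serving it for the full TTL.

```python
HOUR = 3600  # seconds


def worst_case_recovery_seconds(detect_s, repair_s, dns_ttl_s):
    """Worst-case time until every client reaches the recovered service.

    Assumes the DNS change is published once the repair is done, and that
    some resolver may hold the old record for a full TTL after that.
    """
    return detect_s + repair_s + dns_ttl_s


def rto_achievable(rto_s, detect_s, repair_s, dns_ttl_s):
    """True if the worst-case recovery time fits inside the RTO."""
    return worst_case_recovery_seconds(detect_s, repair_s, dns_ttl_s) <= rto_s


# Hypothetical figures: 5 minutes to detect, 30 minutes to repair and update DNS.
print(rto_achievable(HOUR, 300, 1800, 24 * HOUR))  # TTL of 24 hours  -> False
print(rto_achievable(HOUR, 300, 1800, 300))        # TTL of 5 minutes -> True
```

With a 24-hour TTL, the TTL alone exceeds the 1-hour RTO, so no amount of fast detection or repair can make up for it.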

Ultimately, a good monitoring solution is key to all of the above. The sooner you identify an issue or outage, the sooner you can start working on a solution and fix it.
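To see how detection time eats into the RTO, the sketch below estimates how long an outage can go unnoticed. It is a rough model under assumed parameters: a monitor that polls at a fixed interval and raises an alert only after a number of consecutive failed checks. The names and numbers are hypothetical and do not describe any particular monitoring product.

```python
def worst_case_detection_seconds(check_interval_s, failures_before_alert=1):
    """Longest time an outage can last before the alert fires.

    Worst case: the outage starts right after a successful check, so the
    first failed check comes one full interval later. Check timeouts are
    ignored to keep the model simple.
    """
    return check_interval_s * failures_before_alert


def remaining_budget_seconds(rto_s, check_interval_s, failures_before_alert=1):
    """Time left within the RTO for the actual repair once the alert fires."""
    return rto_s - worst_case_detection_seconds(check_interval_s, failures_before_alert)


RTO = 3600  # hypothetical 1-hour RTO, in seconds

# Checking every minute, alerting after 3 consecutive failures.
print(remaining_budget_seconds(RTO, 60, 3))   # 3420 seconds left to repair

# Checking every 15 minutes, alerting after 3 consecutive failures.
print(remaining_budget_seconds(RTO, 900, 3))  # 900 seconds left to repair
```

Under these assumptions, the slower check schedule spends three quarters of the hour on detection alone.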

