What Is Server Availability?

Understanding Server Availability

Server availability is the proportion of time a system is operational and able to serve requests. It is usually expressed as a percentage of uptime or as a number of "nines," and it is a common measure of a system's reliability.

Availability (%)	Downtime (per year)
90% ('one nine')	36.53 days
99% ('two nines')	3.65 days
99.9% ('three nines')	8.77 hours
99.95% ('three and a half nines')	4.38 hours
99.99% ('four nines')	52.60 minutes
99.995% ('four and a half nines')	26.30 minutes
99.999% ('five nines')	5.26 minutes
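
The figures in the table follow from simple arithmetic. The sketch below reproduces them, assuming a 365.25-day year (the value the table's numbers are consistent with). The function name is illustrative and not part of any library.

```python
# Hours in a year, assuming 365.25 days (an assumption that matches the table above).
HOURS_PER_YEAR = 365.25 * 24


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in hours, that an availability percentage allows."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (90.0, 99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours:.2f} hours ({hours * 60:.2f} minutes)")
```

Under these assumptions, 99.9% works out to roughly 8.77 hours of downtime per year, and each additional nine cuts the allowed downtime by a factor of ten.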

Achieving higher availability usually requires redundancy, which means investing in more infrastructure such as additional data centers, servers, and replicated databases.

The goal is to provide seamless service without exceeding the budget, so the cost of each layer of redundancy has to be weighed against how much downtime the service can tolerate.

Improving Application Availability

An application that runs on a single EC2 instance has a single point of failure. Even if its databases and S3 storage are highly available, the application becomes inaccessible when that instance fails. Adding a second server mitigates this risk.

Leveraging Multiple Availability Zones

The physical location of servers matters. Hardware failures, data center issues, or even Availability Zone outages can disrupt service. Deploying a second EC2 instance in a different Availability Zone protects against these location-bound failures, as well as against operating system and application-level failures on a single instance.
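
To get a rough sense of how much a second instance helps, the sketch below estimates the availability of several redundant instances. It assumes that the service is up whenever at least one instance is up and that instances fail independently. That is a simplification, though placing instances in different Availability Zones makes it more plausible. The function name and the 99% figure are illustrative.

```python
def combined_availability(instance_availability: float, instances: int) -> float:
    """Availability of redundant instances, as a fraction between 0 and 1.

    Assumes independent failures and that one healthy instance is enough
    to keep the service up.
    """
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    if instances < 1:
        raise ValueError("at least one instance is required")
    # The service is down only when every instance is down at the same time.
    probability_all_down = (1 - instance_availability) ** instances
    return 1 - probability_all_down


for n in (1, 2, 3):
    print(f"{n} instance(s): {combined_availability(0.99, n):.4%}")
```

Under these assumptions, two instances at 99% each reach about 99.99% combined. Real deployments usually fall short of this figure because failures are rarely fully independent.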

Managing multiple instances introduces new challenges: keeping the instances in sync, directing clients to them, and deciding how they share the work.

Managing Replication, Redirection, and High Availability

Creating a Replication Process

Configuration files, software patches, and the application itself must be replicated across all instances. Automating this process is more efficient and less error-prone than copying changes by hand.
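
One small part of such automation is checking that a replica really holds the same files as the reference instance. The sketch below compares SHA-256 digests of file contents. It is an illustration only: the file names and contents are made up, and a real pipeline would read files from each instance, typically through a deployment or configuration management tool.

```python
import hashlib


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each file path to the SHA-256 digest of its contents."""
    return {path: hashlib.sha256(content).hexdigest() for path, content in files.items()}


def find_drift(reference: dict[str, str], replica: dict[str, str]) -> dict[str, str]:
    """Report paths where the replica differs from the reference manifest."""
    drift = {}
    for path, digest in reference.items():
        if path not in replica:
            drift[path] = "missing"
        elif replica[path] != digest:
            drift[path] = "different"
    # Files that exist only on the replica are also a sign of drift.
    for path in replica.keys() - reference.keys():
        drift[path] = "unexpected"
    return drift


# Hypothetical file sets for two instances.
instance_a = {"app.conf": b"workers=4\n", "app.py": b"print('v2')\n"}
instance_b = {"app.conf": b"workers=2\n", "app.py": b"print('v2')\n"}

print(find_drift(build_manifest(instance_a), build_manifest(instance_b)))
```

An empty result means the replica matches the reference. Any other result lists the files that need to be replicated again.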

Addressing Customer Redirection

Clients need to know about available servers. Options include:

DNS: A DNS record can point to the IP addresses of all available servers. However, clients and resolvers cache DNS records, so changes take time to propagate and clients may keep being sent to a server that has failed.
Load balancers: A load balancer performs health checks and distributes load across the servers. Because it sits between clients and servers, it avoids the propagation delays of DNS.
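
The load balancer behavior described above can be sketched as a round-robin selector that skips servers failing a health check. This is a minimal in-memory illustration, not how any particular load balancer is implemented. The class name, the IP addresses, and the health check callback are all hypothetical, and real load balancers usually run health checks periodically in the background rather than on every request.

```python
from typing import Callable


class RoundRobinBalancer:
    """Cycle through servers in order, skipping any that fail the health check."""

    def __init__(self, servers: list[str], is_healthy: Callable[[str], bool]):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._is_healthy = is_healthy
        self._next = 0  # index of the next server to try

    def pick(self) -> str:
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if self._is_healthy(server):
                return server
        raise RuntimeError("no healthy servers available")


# Hypothetical servers; the second one is simulated as being down.
down = {"10.0.2.10"}
balancer = RoundRobinBalancer(
    ["10.0.1.10", "10.0.2.10", "10.0.3.10"],
    is_healthy=lambda server: server not in down,
)
print([balancer.pick() for _ in range(4)])
# prints ['10.0.1.10', '10.0.3.10', '10.0.1.10', '10.0.3.10']
```

Because the failed server is skipped on the very next request, clients are redirected without waiting for any cached DNS record to expire.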

Understanding High Availability Types

Choose between active-passive and active-active systems:

Active-Passive: One instance is active while the other is on standby. This suits stateful applications, because all sessions are handled by the same instance for as long as it is healthy.
Active-Active: Both servers handle traffic at the same time, which increases scalability. This is best suited to stateless applications, where session data isn't tied to a specific server.
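
The difference between the two types comes down to how a request is routed. The sketch below contrasts them under simplified assumptions: health is given as a set of server names, and the function and server names are hypothetical.

```python
def route_active_passive(primary: str, standby: str, healthy: set[str]) -> str:
    """Send all traffic to the primary; use the standby only if the primary is down."""
    if primary in healthy:
        return primary
    if standby in healthy:
        return standby
    raise RuntimeError("neither instance is healthy")


def route_active_active(servers: list[str], healthy: set[str], request_number: int) -> str:
    """Spread requests across every healthy server in turn."""
    candidates = [server for server in servers if server in healthy]
    if not candidates:
        raise RuntimeError("no healthy servers available")
    return candidates[request_number % len(candidates)]


servers = ["server-a", "server-b"]

# Active-passive: every request goes to server-a until it fails.
print(route_active_passive("server-a", "server-b", healthy={"server-a", "server-b"}))
print(route_active_passive("server-a", "server-b", healthy={"server-b"}))

# Active-active: requests alternate between the two healthy servers.
print([route_active_active(servers, {"server-a", "server-b"}, n) for n in range(4)])
```

In the active-passive case a client keeps reaching the same instance until a failover happens, which is why it is simpler for session state. In the active-active case consecutive requests from one client may land on different servers, so session data has to live outside the individual server.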