Monday, June 15, 2009

When Lightning Strikes

Wed June 10th, 6:30 PM PST: A lightning strike damages Power Distribution Units serving a set of racks hosting Amazon’s EC2 service.

6:30:05 PM PST: Your business transactions start failing.

7 PM PST: Your iPhone rings.

You thought that since your engineering teams were moving to "THE CLOUD," your systems were finally going to be more reliable and more trustworthy. Finally, some much-needed relief in your already overextended workday!

But the reality is that no matter where you run your business systems, what underlying technology you use or what controls you put in place to keep the business running reliably, there will always be incidents and unforeseen events that are out of your control.

Moving pieces of your application and infrastructure to a third-party hosting environment, or leveraging third-party services directly within your business applications, will mean even less control. A quick glance at http://status.aws.amazon.com tells you that even the best cloud providers are only human. Service disruptions remain commonplace, whether they result from freak weather conditions or good old-fashioned configuration errors.

As your application evolves and as your data center turns into an amorphous cloud (no pun intended), you need to be prepared for damage control.

From a transactions standpoint, you need to watch every single transaction and make sure that your iPhone rings within seconds of a disruption, not minutes or hours. In the real-time economy, every second lost equates to lost revenue.
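To make "within seconds" concrete, here is a minimal sketch of a sliding-window alarm that fires once enough transactions fail in a short span. The class name `FailureAlarm`, the five-second window and the three-failure threshold are all illustrative assumptions, not settings taken from any particular monitoring product.

```python
from collections import deque


class FailureAlarm:
    """Hypothetical alarm: fires when too many failures land in a short window."""

    def __init__(self, window_seconds=5.0, max_failures=3):
        # Both thresholds are assumed values; tune them to your own traffic.
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self._failures = deque()  # timestamps of recent failed transactions

    def record(self, timestamp, succeeded):
        """Record one transaction outcome; return True if the alarm should fire."""
        if not succeeded:
            self._failures.append(timestamp)
        # Forget failures that are older than the window.
        while self._failures and timestamp - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        return len(self._failures) >= self.max_failures
```

Each transaction outcome is fed to `record` as it completes; a return value of `True` is the cue to page whoever is on call. Counting failures inside a window, rather than alerting on every single error, is one way to trade a few seconds of delay for fewer false alarms.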

You’ve got to be able to immediately identify which transactions failed, how many transactions failed, which consumers were affected and more, so that corrective procedures can be put into action. It will no longer suffice to simply let the business know that their transactions were disrupted!
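As a rough illustration of the report described above, the following sketch answers the three questions (which transactions failed, how many, and which consumers were affected) for a given time window. The `Transaction` record and its field names are hypothetical; a real system would read them from its own transaction log.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record shape; the field names are assumptions for illustration.
    txn_id: str
    consumer: str
    timestamp: float  # seconds since some reference point
    succeeded: bool


def summarize_failures(transactions, window_start, window_end):
    """Summarize failed transactions with window_start <= timestamp < window_end."""
    failed = [
        t for t in transactions
        if not t.succeeded and window_start <= t.timestamp < window_end
    ]
    per_consumer = Counter(t.consumer for t in failed)
    return {
        "failed_ids": [t.txn_id for t in failed],
        "failed_count": len(failed),
        "affected_consumers": sorted(per_consumer),
        "failures_per_consumer": dict(per_consumer),
    }


# Made-up sample data, purely for demonstration.
sample = [
    Transaction("t1", "acme", 100.0, True),
    Transaction("t2", "acme", 105.0, False),
    Transaction("t3", "globex", 106.0, False),
    Transaction("t4", "acme", 107.0, False),
]
report = summarize_failures(sample, 100.0, 110.0)
```

With a summary like `report` in hand, the business hears more than "transactions were disrupted": it gets the list of failed transaction IDs and the consumers who need follow-up.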

Finally, it would help your case to negotiate strict SLAs with your service provider and establish a strategy for monitoring and documenting real-time compliance. In the event of a disruption, even one too minor to be counted as a disruption by your provider, be prepared to furnish evidence and hold them accountable for the losses your business incurs.
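Documenting compliance can start with simple arithmetic. The sketch below computes measured availability from a list of recorded outage intervals and compares it with an SLA target. The function names, the 30-day period and the 99.95% target are assumptions chosen for illustration, not the terms of any actual provider agreement, and the outage intervals are assumed not to overlap.

```python
def availability(total_seconds, outage_intervals):
    """Fraction of the period the service was up.

    outage_intervals is a list of (start, end) pairs in seconds,
    assumed here to be non-overlapping.
    """
    down = sum(end - start for start, end in outage_intervals)
    return (total_seconds - down) / total_seconds


def sla_breached(total_seconds, outage_intervals, target):
    """True when measured availability falls below the agreed target."""
    return availability(total_seconds, outage_intervals) < target


# Hypothetical numbers: a 30-day period with one five-hour outage.
period = 30 * 24 * 3600
outages = [(1000, 1000 + 5 * 3600)]
measured = availability(period, outages)
breached = sla_breached(period, outages, target=0.9995)
```

Keeping your own record of outage start and end times, rather than relying on the provider's status page, is what makes this kind of calculation usable as evidence.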
